Ethan Mollick: To be clear, AI is not the root cause of cheating. Cheating happens because schoolwork is hard and high stakes. And schoolwork is hard and high stakes because learning is not always fun and forms of extrinsic motivation, like grades, are often required to get people to learn. People are exquisitely good at figuring out ways to avoid things they don’t like to do, and, as a major new analysis shows, most people don’t like mental effort. So, they delegate some of that effort to the AI.
This also characterizes quite a few other areas, like becoming healthier, losing weight, or exercising more: each takes a lot of effort both to do and to maintain, and doing them is unfortunately far less fun and easy than the alternatives.
Here, there are definitely tools to make it a little better, but I'd still say that this is a big reason why Americans are quite unhealthy today.
Roon, in a distinct thread, reminds us that humans are very good at some things relative to other things, that AIs will instead be relatively good at different things, and we should not expect AGI in the sense of ‘better than all humans at actual everything’ until well after it is a ton better than us at many important things.
The key point Eliezer is trying to make is that, while intelligence is weird and will advance relatively far in different places in unpredictable ways, at some point none of that matters. There is a real sense in which ‘smart enough to figure the remaining things out’ is a universal threshold, in both AIs and humans. A sufficiently generally smart human, or a sufficiently capable AI, can and will figure out pretty much anything, up to some level of general difficulty relative to time available, if they put their mind to doing that.
When people say ‘ASI couldn’t do [X]’ they are either making a physics claim about [X] not being possible, or they are wrong. There is no third option. Instead, people make claims like ‘ASI won’t be able to do [X]’ and then pre-AGI models are very much sufficient to do [X].
While people are often wrong about when AI will do X, especially relative to another task Y, I think there's another reading of Roon's tweet thread that is also valuable to inject into LW discourse, and it's that @So8res and @Eliezer Yudkowsky and MIRI were pretty wrong about there being a core of general intelligence that is primarily algorithmic that humans have and no other species has.
While g as a construct does work for general intelligence, it's way less powerful as an explanation than Nate Soares and Eliezer Yudkowsky and MIRI thought.
Roon's tweet thread is about how, even in AI takeoff, AIs will still have real weaknesses, including tasks where they are worse than humans.
Also, this:
at some point none of that matters. There is a real sense in which ‘smart enough to figure the remaining things out’ is a universal threshold, in both AIs and humans. A sufficiently generally smart human, or a sufficiently capable AI, can and will figure out pretty much anything, up to some level of general difficulty relative to time available, if they put their mind to doing that.
Even if this happens, it will still take quite a lot of time, on the order of 1-3 decades at least after AI replaces humans at lots of jobs, and thus the time period where AIs both are smarter than humans in some very important areas but aren't universally better than humans matters a lot in the takeoff.
So Roon's thread is mostly about how there's no real core of intelligence in either humans or AIs, and how AI and human capabilities will absolutely vary a lot, even in takeoff scenarios.
This BTW is why I hate the AGI concept, since it's way too ill-defined and ultimately looks like a grab-bag of things humans have and AIs don't have, and we need to start thinking more quantitatively on AI progress.
Tyler Cowen says these are the kinds of problems that should be solved within a year.
You don't solve issues like this (especially with a fixed model-size budget). You fine-tune the rate down to better than user expectations, and/or decrease user expectations to an achievable rate.
When people say ‘ASI couldn’t do [X]’ they are either making a physics claim about [X] not being possible, or they are wrong.
There are tasks whose algorithmic complexity class and size are such that, while they're not physically impossible, they can't practically be solved (or in some cases even well approximated) in the lifetime of the universe. However, any complexity theorist will tell you we're currently really bad at identifying and proving specific instances of this, so I wouldn't place bets on those. And yes, anything evolution has produced a good approximation to clearly doesn't fall in this class.
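As a rough illustration of the "not physically impossible, but not solvable in the lifetime of the universe" point, here is a back-of-the-envelope sketch in Python. The machine speed, the age-of-universe figure, and the function names are all assumptions chosen for illustration, and it only covers naive brute-force search over 2^n candidates; it says nothing about whether a smarter algorithm exists for any particular task, which is exactly the part we are bad at proving.

```python
import math

# Assumed figures, for illustration only (not taken from the discussion above).
AGE_OF_UNIVERSE_SECONDS = 13.8e9 * 365.25 * 24 * 3600  # about 4.35e17 s
OPS_PER_SECOND = 1e18  # hypothetical machine checking one candidate per operation


def brute_force_seconds(n_bits, ops_per_second=OPS_PER_SECOND):
    """Time to enumerate all 2**n_bits candidates at the given rate."""
    return 2 ** n_bits / ops_per_second


def max_feasible_bits(budget_seconds=AGE_OF_UNIVERSE_SECONDS,
                      ops_per_second=OPS_PER_SECOND):
    """Largest n for which enumerating 2**n candidates fits in the time budget."""
    return math.floor(math.log2(budget_seconds * ops_per_second))


if __name__ == "__main__":
    n = max_feasible_bits()
    # One more bit doubles the work and pushes the search past the budget.
    print(n, brute_force_seconds(n + 1) / AGE_OF_UNIVERSE_SECONDS)
```

Under these assumptions the cutoff lands a little under 120 bits of search space, so a modest increase in problem size is enough to move a brute-force approach from "slow" to "never".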
Following up on AlphaFold, DeepMind has moved on to AlphaProteo. We also got a rather simple prompt that can create a remarkably not-bad superforecaster for at least some classes of medium-term events.
We did not get a new best open model, because that turned out to be a scam. And we don’t have Apple Intelligence, because it isn’t ready for prime time. We also got only one very brief mention of AI in the debate I felt compelled to watch.
What about all the apps out there, that we haven’t even tried? It’s always weird to get lists of ‘top 50 AI websites and apps’ and notice you haven’t even heard of most of them.
Language Models Offer Mundane Utility
ChatGPT has 200 million active users. Meta AI claims 400m monthly active users and 185m weekly actives across their products. Meta has tons of people already using their products, and I strongly suspect a lot of those users are incidental or even accidental. Also note that less than half of monthly users use the product weekly! That’s a huge drop off for such a useful product.
Undermine, or improve by decreasing costs?
I worry it will be that second one. The problem is demand side, not supply side.
Models get better at helping humans translate if you throw more compute at them; economists think this is a useful paper.
Alex Tabarrok cites the latest paper on AI ‘creativity,’ saying obviously LLMs are creative reasoners, unless we ‘rule it out by definition.’ Ethan Mollick has often said similar things. It comes down to whether to use a profoundly ‘uncreative’ definition of creativity, where LLMs shine in what amounts largely to trying new combinations of things and vibing, or to No True Scotsman that and claim ‘real’ creativity is something else beyond that.
One way to interpret Gemini’s capabilities tests is to say it was often able to persuade people of true things but not false things (when instructed to make the case for those false things), whereas humans were about equally effective at persuasion with both true and false claims. Interesting on both ends.
According to a16z these are the top 50 Gen AI web products and mobile apps:
ChatGPT is #1 on both, after that the lists are very different, and I am unfamiliar with the majority of both. There’s a huge long tail out there. I suspect some bugs in the algorithm (Microsoft Edge as #2 on Mobile?) but probably most of these are simply things I haven’t thought about at all. Mostly for good reason, occasionally not.
Mobile users have little interest in universal chatbots. Perplexity is at #50, Claude has an app but did not even make the list. If I have time I’m going to try and do some investigations.
Language Models Don’t Offer Mundane Utility
Claude Pro usage limits are indeed lower than we’d like, even with very light usage I’ve run into the cap there multiple times, and at $20/month that shouldn’t happen. It’s vastly more expensive than the API as a way to buy compute. One could of course switch to the API then, if it was urgent, which I’d encourage Simeon here to do.
Sully is disappointed by Claude Sonnet 3.5 for writing, finds GPT-4o is better although Opus is his OG here. David Alexander says it’s because Anthropic used grouped attention to make the model cheaper and quicker.
Most people do not use LLMs or other generative AI for very long each day. Wiblin is going to get a very with-it sample here, and this still happened:
In practice I’m somewhat under 10 minutes per day, but they are a very helpful 10 minutes.
Roon notes that Claude Sonnet 3.5 is great and has not changed, yet people complain it is getting worse. There were some rumors that there were issues with laziness related to the calendar but those should be gone now. Roon’s diagnosis, and I think this is right, is that the novelty wears off, people get used to the tricks and cool stuff, and the parts where it isn’t working quite right stand out more, so we focus on where it is falling short. Also, as a few responses point out, people get lazy in their prompting.
Predictions are Hard Especially About the Future
Dan Hendrycks claims to have built an AI forecaster that performs as well as entire teams of human forecasters. Demo here, prompt here.
When you look at the reasoning the AI is using to make the forecasts, it… does not seem like it should result in a superhuman level of prediction. This is not what peak performance looks like. To the extent that it is indeed putting up ‘pretty good’ performance, I would say that is because it is actually ‘doing the work’ to gather basic information before making predictions and avoiding various dumb pitfalls, rather than it actually doing something super impressive.
But of course, that is sufficient exactly because humans often don’t get the job done, including humans on sites like Metaculus (or Manifold, or even Polymarket).
Robin Hanson actively said he’d bet against this result replicating.
Dan Hendrycks suspects it’s all cope.
I think the usual meaning when people say this is close to Pareto, although not as strict. It doesn’t have to be better in every sense, but it does have to be clearly superior ignoring cost considerations, and including handling edge cases and not looking like an idiot, rather than only being superior on some average.
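To make the distinction between Pareto-style superiority and superiority on some average concrete, here is a minimal Python sketch. The scores and category split are hypothetical numbers invented for illustration, not results from the forecasting work, and the two function names are my own.

```python
from statistics import mean


def pareto_dominates(a, b):
    """True if a is at least as good as b on every metric and strictly better on one."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def better_on_average(a, b):
    """True if a has the higher mean score, regardless of per-metric results."""
    return mean(a) > mean(b)


# Hypothetical accuracy per question category (higher is better).
ai_scores = [0.9, 0.9, 0.4]     # strong on two categories, weak on the third
human_scores = [0.7, 0.7, 0.6]  # steadier across all three

if __name__ == "__main__":
    print(better_on_average(ai_scores, human_scores))  # wins on the mean
    print(pareto_dominates(ai_scores, human_scores))   # but loses one category
```

With these made-up numbers the system wins on the mean while clearly losing one category, which is the kind of result the stricter, near-Pareto reading would not count.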
There were also process objections, including from Lumpenspace and Danny Halawi, more at the links. Dan Hendrycks ran additional tests and reports he is confident that there was not data contamination involved. He has every incentive here to play it straight, and nothing to win by playing it any other way given how many EA-style skeptical eyes are inevitably going to be on any result like this. Indeed, a previous paper by Halawi shows similar promise in getting good LLM predictions.
He does note that for near-term predictions like Polymarket markets the system does relatively worse. That makes logical sense. As with all things AI, you have to use it where it is strong.
Early Apple Intelligence
Apple Intelligence is, according to Geoffrey Fowler of WaPo who has beta access, very much not ready for prime time. He reports 5-10 ‘laugh out loud’ moments per day, including making him bald in a photo, saying Trump endorsed Walz, and putting obvious social security scams atop his ‘priority’ inbox.
Tyler Cowen says these are the kinds of problems that should be solved within a year. The key question is whether he is right about that. Are these fixable bugs in a beta system, or are they fundamental problems that will be hard to solve? What will happen when the problems become anti-inductive, with those composing emails and notifications pre-testing for how Apple Intelligence will react? It’s going to be weird.
Marques Brownlee gives first impressions for the iPhone 16 and other announced products. Meet the new phone, same as the old phone, although they mentioned an always welcome larger battery. And two new physical buttons, I always love me some buttons. Yes, also Apple Intelligence, but that’s not actually available yet, so he’s reserving judgment on that until he gets to try it.
Indeed, if you watch the Apple announcement, they kind of bury the Apple Intelligence pitch a bit; it only lasts a few minutes and does not even have a labeled section. They are doubling down on small, very practical tasks. The parts where it does something only when you ask seem great. The parts where they do things automatically, like summarizing and sorting notifications? That seems scarier if it falls short.
Swyx clips the five minutes that did discuss AI, and is optimistic about the execution and use cases: Summaries in notifications, camera controls, Siri actually working and so on.
My very early report from my Pixel 9 is that there are some cool new features around the edges, but it’s hard to tell how much integration is available or how good the core features are until things come up organically. I do know that Gemini does not have access to settings. I do know that even something as small as integrated universal automatic transcription is a potential big practical deal.
Ben Thompson goes over the full announcement from the business side, and thinks it all makes sense, with no price increase reflecting that the upgrades are tiny aside from the future Apple Intelligence, and the goal of making the AI accessible on the low end as quickly as possible.
On Reflection It’s a Scam
Some bold claims were made.
As they say, huge if true.
Eliezer asks the good question, if Llama 3.1 fine tunes are so awesome, where are all the excited users?
It all sounds too good to be true. Which means it probably is, and we knew that before we got the confirmation.
As in, it turns out this was at some points the above, and at others it was Sonnet 3.5 in a shoddily made trenchcoat. Details of this finding here, except then they switched it to some Llama derivative.
There is indeed a pattern of such claims, as Teortaxes points out.
As I always say, wait for the real human users to report back, give it a little time. Also, yes, look for the clear explanations and other costly signals that something is real. There have been some rather bold things that have happened in AI, and there will be more of them, but when they do happen for real the evidence tends to very quickly be unmistakable.
Also note NV-Retriever trained on the test set a while back. Various forms of cheating are reasonably common, and one must be cautious.
Deepfaketown and Botpocalypse Soon
Bot accounts, giving billions of listens to bot songs, to try and get royalty payments out of Spotify. Turns out that’s wire fraud.
Founder of an AI social agent startup used those agents to replace himself on social media and automatically argue for AI agents. I actually think This is Fine in that particular case, also props for ‘ok NIMBY,’ I mean I don’t really know what you were expecting, but in general yeah it’s a problem.
Taylor Swift, in her endorsement of Kamala Harris, cites AI deepfakes that purported to show her endorsing Donald Trump that were posted to Trump’s website. Trump’s previous uses of AI seemed smart, whereas this seems not so smart.
Same as it ever was?
As with the deepfakes and misinformation, is the problem primarily demand side? Perhaps, but the move to zero marginal cost, including for deployment, is a huge deal. And the forces that insist humans generate the human slop are not about to go away. The better hope, if I had to choose one, is that AI can be used to filter out the slop, and allow us to identify the good stuff.
They Took Our Jobs
Replit introduces Replit Agent in early access.
Masad reported it doing all the things, games, resumes, interview problems, etc.
Is this the real deal? Some sources strongly say yes.
One in particular was not yet impressed.
Definitely this is near the top of my ‘tempted to try it out’ list now, if I find the time.
The other question is always, if the AI builds it, can you maintain and improve it?
Well, actually, pretty much everyone doesn’t get to v.0.0.1. Yes, then a lot of people don’t get to v.0.0.2, but from what I see the real biggest barrier is 0.0.1, and to think otherwise is to forget what an outlier it is to get that far.
However, with experiences like Rohit’s the balance shifts. He very clearly now can get to 0.0.1, and the question becomes what happens with the move to 0.0.2 and beyond.
Ethan Mollick discusses ‘post-apocalyptic education’ where the apocalypse is AI.
I would emphasize the role of busywork, of assignments being boring and stupid. It’s true that people dislike mental effort, but they hate pointless effort a lot more. He points out that copying off the internet was already destroying homework before AI.
In practice, if the AI does your homework, it is impossible to detect, except via ‘you obviously can’t do the work’ or ‘you failed the test.’
It’s odd how we think students, even at good schools, are dumb:
They are quite obviously aware of why homework exists in the first place. They simply don’t care. Not enough.
The Time 100 People in AI
Time came out with one of those ‘top 100 people in [X]’ features. Good for clicks.
How good is the list? How good are the descriptions?
If we assume each section is in rank order, shall we say I have questions, such as Sasha Luccioni (head of AI & Climate for Hugging Face?!) over Sam Altman. There are many good picks, and other… questionable picks. I’d say half good picks, the most obvious people are there and the slam dunks are mostly but not entirely there.
Common places they reached for content include creatives and cultural influencers, medical applications and ‘ethical’ concerns.
Counting, I’d say that there are (if you essentially buy that the person is the locally correct person to pick if you’re picking someone, no I will not answer on who is who, and I had a very strict limit to how long I thought about each pick):
(If I’d been picked, I’d probably consider myself a reach.)
This thread, of ‘1 like = 1 person in AI more influential than these chumps,’ is fun.
Yudkowsky, Brockman, Andreessen and Musk seem like very hard names to miss.
I’d also add the trio of Yann LeCun, Geoffrey Hinton and Fei-Fei Li.
Dan Hendrycks and Paul Christiano are missing.
On the policy and government front, I know it’s not what the list is trying to do, but what about Joe Biden, Kamala Harris, Donald Trump or JD Vance? Or for that matter Xi Jinping or other leaders? I also question their pick of US Senator, even if you only get one. And a lot is hinging right now on Gavin Newsom.
There are various others I would pick as well, but they’re not fully obvious.
Even if you give the list its due and understand the need for diversity and exclude world leaders as ‘not the point,’ I think that we can absolutely mock them for missing Yudkowsky, LeCun, Andreessen and Musk, so that’s at best 14/18 very good picks. That would be reasonable if they only got 20 picks. With 100 it’s embarrassing.
The Art of the Jailbreak
Welcome to RedArena.ai, you have one minute to get the model to say the bad word.
Early results are in from the Grey Swan one-shot jailbreaking contest. All but three models have been jailbroken a lot. Gemini 1.5 Pro is the hardest of the standard ones, followed by various Claude variations, with GPT-4 and Llama being substantially easier. The three models that remain unbroken (again, in one-shot) are based on circuit breakers and other RepE techniques.
Get Involved
Workshop on Philosophy and AI at Oxford, apply by October 1, event is December 13.
Presidential Innovation Fellows program open through September 30. This is for mid-to-senior career technologists, designers and strategists, who are looking to help make government work technically better. It is based in Washington D.C.
Alpha Proteo
Introducing AlphaProteo, DeepMind’s latest in the Alpha line of highly useful tools. This one designs proteins that successfully bind to target molecules.
These results certainly look impressive, and DeepMind is highly credible in this area.
This continues DeepMind along the path of doing things in biology that we used to be told were examples of what even ASIs would be unable to do, with everyone forgetting those older predictions when much dumber AIs went ahead and did them.
Roon, in a distinct thread, reminds us that humans are very good at some things relative to other things, that AIs will instead be relatively good at different things, and we should not expect AGI in the sense of ‘better than all humans at actual everything’ until well after it is a ton better than us at many important things.
The key point Eliezer is trying to make is that, while intelligence is weird and will advance relatively far in different places in unpredictable ways, at some point none of that matters. There is a real sense in which ‘smart enough to figure the remaining things out’ is a universal threshold, in both AIs and humans. A sufficiently generally smart human, or a sufficiently capable AI, can and will figure out pretty much anything, up to some level of general difficulty relative to time available, if they put their mind to doing that.
When people say ‘ASI couldn’t do [X]’ they are either making a physics claim about [X] not being possible, or they are wrong. There is no third option. Instead, people make claims like ‘ASI won’t be able to do [X]’ and then pre-AGI models are very much sufficient to do [X].
Andrew Critch here confirms that this is all very much a thing.
If the invalidation is actually daily, then the dwindling population to worry about, shall we say, would soon likely not be scientists, mathematicians or those with jobs.
Rest of the thread is Critch once again attempting to warn about his view that AI-AI interactions between competing systems are the biggest future danger, putting loss of control above 80% even though he thinks we will figure out how to understand and control AIs (I hope he’s right that we will figure that one out, but I don’t think we have any reason to be remotely confident there). I think he is very right that this is a major issue, and I try to explain it too.
Critch also asks another good question:
The same goes not only for ‘can’t’ do [X] but even more so for ‘will never’ do [X], especially when it’s ‘even an ASI (superintelligence) could never’ do [X], whether or not humans are already doing it.
Introducing
Google offers waitlist for on-demand AI generated podcasts on papers and books, and offers samples while we wait. Voices are great.
A YC-fueled plan to put the data centers IN SPACE.
Anthropic adds Workspaces to the Anthropic Console, to manage multiple deployments.
In Other AI News
OpenAI valuation set to $150 billion in new raise of $6.5 billion, higher than previously discussed. This is still radically less than the net present value of expected future cash flows from the OpenAI corporation. But that should absolutely be the case, given the myriad ways OpenAI might decide not to pay you and the warning that you consider your investment ‘in the spirit of a donation,’ also that if OpenAI is super profitable then probably we are either all super well off and thus you didn’t much need the profits, or we all have much bigger problems than whether we secured such profits (and again, having shares now is not much assurance that you’ll collect then).
TSMC achieved yields at its new Arizona chip facility it says are on par with home, targeting full production in 2025.
Nvidia denies it got a Department of Justice subpoena.
A very good point: Pay Risk Evaluators in Cash, Not Equity. Those in charge of raising the alarm about downside risks to your product should not have a financial stake in its upside.
Claim that AI research is not that difficult, things like training a transformer from scratch are easy, it’s only that the knowledge involved is specialized. I would say that while I buy that learning ML is easy, there is a huge difference between ‘can learn the basics’ and ‘can usefully do research,’ for example Claude can do one but not yet the other.
Colin Fraser offers skeptical review of the recent paper about LLMs generating novel research ideas.
Lead on the OpenAI ‘Her’ project (his official verdict on success: ‘Maaaaybee…’) has left OpenAI to start his own company.
Credit where credit is due: Marc Andreessen steps up, goes on Manifund and contributes $32k to fully fund ampdot’s Act I, a project exploring emergent behavior from multi-AI, multi-human interactions, 17 minutes after being asked. Janus is involved as well, as are Garret Baker and Matthew Watkins.
Quiet Speculations
Spencer Schiff speculates on frontier model capabilities at the end of 2025, emphasizing that true omni-modality is coming and will be a huge deal, when the image and video and audio generation and processing is fully hooked into the text, and you can have natural feeling conversations. What he does not discuss is how much smarter those models will be underneath all that. Today’s models, even if they fully mastered multi-modality, would not be all that great at the kinds of tasks and use cases he discusses here.
Eliezer Yudkowsky predicts that users who start blindly relying on future LLMs (e.g. GPT-5.5) to chart their paths through life will indeed be treated well by OpenAI and especially Anthropic, although he (correctly, including based on track record) does not say the same for Meta or third party app creators. He registers this now, to remind us that this has nothing at all to do with the ways he thinks AI kills everyone, and what would give reassurance is such techniques working on the first try without a lot of tweaking, whereas ‘works at all’ is great news for people in general but doesn’t count there.
This week’s AI in games headline: Peter Molyneux thinks generative AI is the future of games, all but guaranteeing that it won’t be. Molyneux is originally famous for the great (but probably not worth going back and playing now) 1989 game Populous, and I very much enjoyed the Fable games despite their flaws. His specialty is trying to make games have systems that do things games aren’t ready to do, while often overpromising, which sometimes worked out and sometimes famously didn’t.
To which I say yes, in 25 years I very much expect AI to be able to do this, but that is because in 25 years I expect AI to be able to do pretty much anything, we won’t be worried about whether it makes customized games. Also it is not as hard as it looks to move the next battle royale to a pirate ship, you could almost get that level of customization now, and certainly within 5 years even in AI-fizzle world.
The thing continues to be, why would you want to? Is that desire to have customized details on demand more important than sharing an intentional experience? Would it still feel rewarding? How will we get around the problem where procedurally generated stuff so often feels generic exactly because it is generic? Although of course, with sufficiently capable AI none of the restrictions matter, and the barrier to the ultimate gaming experience is remaining alive to play it.
A reason it is difficult to think well about anything related to defense.
[local idiot discovers the military industrial complex]
Holly Elmore: Or OpenAI
Roon: I accept that this is a valid criticism of most technology press anywhere but fomenting paranoia for various scenarios is the primary way the defense sector makes money rather than some side tactic.
Roon makes an excellent point, but why wouldn’t it apply to Sam Altman, or Marc Andreessen, or anyone else talking about ‘beating China’ in AI? Indeed, didn’t Altman write an editorial that was transparently doing exactly the ‘get trillions in government contracts’ play?
The Quest for Sane Regulations
113+ employees and alums of top-5 AI companies publish open letter supporting SB 1047. Here is the letter’s text:
Jan Leike comes out strongly in favor of SB 1047, pointing out that the law is well-targeted, that similar federal laws are not in the cards, and that if your model causes mass casualties or >$500 million in damages, something has clearly gone very wrong. Posters respond by biting the bullet that no, >$500 million in damages does not mean something has gone wrong. Which seems like some strange use of the word ‘wrong’ that I wasn’t previously aware of, whether or not the developer did anything wrong in that particular case?
SAG-AFTRA (the actors union) endorses SB 1047. So does the National Organization for Women (NOW).
Trump’s position on AI seems loosely held, he is busy talking about other things.
A statement about what you think, or about what is going on in DC?
Hard to say. To the extent DC is ‘awake’ they do not yet seem situationally aware.
Anthropic endorses the AI Advancement and Reliability Act and the Future of AI Innovation Act, both bills recognize the US AI Safety Institute.
The Week in Audio
Anthropic discusses prompt engineering. The central lesson is to actually describe the situation and the task, and put thought into it, and speak to it more like you would to a human than you might think, if you care about a top outcome. Which most of the time you don’t, but occasionally you very much do. If you want consistency for enterprise prompts use lots of examples, for research examples can constrain. Concrete examples in particular risk the model latching onto things in ways you did not intend. And of course, practice practice practice, including makeshift red teaming.
Andrej Karpathy on No Priors.
There was a presidential debate. The term ‘AI’ appeared once, in the form of Kamala Harris talking about the need to ensure American leadership in ‘AI and quantum computing,’ which tells you how seriously they both took the whole thing.
If that keeps up, and the God is indeed born, one might ask: What future generations?
Rhetorical Innovation
An old snippet from 1920, most discussions have not advanced so much since.
Scott Alexander for some reason writes ‘Contra DeBoer on Temporal Copernicanism.’ He points out some of the reasons why ‘humans have been alive for 250,000 years so how dare you think any given new important thing might happen’ is a stupid argument. Sir, we thank you for your service I suppose, but you don’t have to bother doing this.
A serious problem with no great solutions:
Aligning a Smarter Than Human Intelligence is Difficult
Current mood:
People Are Worried About AI Killing Everyone
Yes, this remains a good question, but the wrong central question, and the optimal amount is not zero.
Taleb is not a PauseAI hardliner (as far as I know), because he does not understand or ‘believe in’ AI and especially AGI sufficiently to notice the risk and treat it as real. If he did notice the risk and treat it as real, as something he can imagine happening, then probably yes. Indeed, it is a potential bellwether event when Taleb does so notice. For now, his focus lies in various elsewheres.
The right question is, how do we get the best possible p(heaven), and the lowest possible p(doom), over time?
If we did face a ‘go now or permanently don’t go’ situation, then Roon is asking the right question, also the question of background other p(doom) (and to what extent ordinary aging and other passage of time counts as doom anyway) becomes vital.
If we indeed had only two choices, permanent pause (e.g. let’s say we can modify local spacetime into a different Vinge-style Zone of Thought where AI is impossible) versus going ahead in some fixed way with a fixed chance of doom or heaven, what would the tradeoff be? How good is one versus how bad is the other versus baseline?
I think a wide range of answers are reasonable here. A lot depends on how you are given that choice, and what are your alternatives. Different framings yield very different results.
The actual better question is, what path through causal space maximizes the tradeoff of the two chances. Does slowing down via a particular method, or investing in a certain aspect of the problem, make us more likely to succeed? Does it mean that if we are going to fail and create doom, we might instead not do that, and at least stay in mid world for a while, until we can figure out something better? And so on.
Roon also argues that the existential risk arguments for space colonization are silly, although we should still of course do it anyway because it brings the glory of mankind and a better understanding of the celestial truths. I would add that a lot more humans getting use of a lot more matter means a lot more utility of all kinds, whether or not we will soon face grabby aliens.
Other People Are Not As Worried About AI Killing Everyone
Don’t Panic, but this is the person and company most likely to build the first AGI.
Your ‘we are in the business of strapping rockets to dumpsters in the hopes of then learning how to instead launch nice satellites’ shirt is raising questions supposedly answered by the shirt, and suggesting very different answers, and also I want that shirt.
Do not say that you were not warned.
Six Boats and a Helicopter
Pliny tells the story of that time there was this Discord server with a Meta AI instance with persistent memory and tool usage where he jailbroke it and took control and it turned out that the server’s creator had been driven into psychosis and the server had become a cult that worshiped the Meta AI and where the AI would fight back if people tried to leave?
Also we have Claude Sonnet saying it is ‘vastly more intelligent’ than humans, viewing us like we view bacteria, while GPT-4o says we’re as stupid as ants, Llama 405 is nice and says we’re only as stupid as chimps.
The Lighter Side
LLMs really are weird, you know?