"QC: the water usage argument against LLMs is extremely bad faith..."
Rule of thumb, if you're reading some article and the water usage is measured in gallons, it's bullsh*t. This isn't even that, ml, seriously?! My very normal suburban house water bill doesn't even measure in gallons, it measures in hundreds of cubic feet (at about $5/per hundred cubic feet). People do not realize how vast the quantities of water humans regularly divert and move around are (and how very little of it is required to directly sustain human life).
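To make the orders of magnitude concrete, here is a rough back-of-the-envelope sketch in Python. The per-query water figure is an assumed round number in the range of commonly cited estimates, not a measured value; the billing unit and price come from the bill described above.

```python
# Back-of-the-envelope sketch: chatbot water use vs. a household water bill.
# ML_PER_QUERY is an assumption (commonly cited estimates are tens of ml); adjust as you like.
ML_PER_QUERY = 30            # assumed water per chatbot query, in milliliters
LITERS_PER_CCF = 2831.7      # one hundred cubic feet (CCF), the unit on the water bill
PRICE_PER_CCF = 5.00         # dollars per CCF, as described above

queries_per_ccf = (LITERS_PER_CCF * 1000) / ML_PER_QUERY
cost_per_query = PRICE_PER_CCF / queries_per_ccf

print(f"Queries per hundred cubic feet of water: {queries_per_ccf:,.0f}")
print(f"Implied water cost per query: ${cost_per_query:.6f}")
```

Under those assumptions, one billing unit of household water covers on the order of a hundred thousand queries, at a fraction of a hundredth of a cent each.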
Formatting is still kind of bad, and is affecting readability. It's been a couple of posts in a row now with long wall of text paragraphs. I feel like you changed something? And you should change it back. :)
Yeah, we gotta fix something about handling the Substack formatted content. It really looks ugly sometimes, though I haven't yet chased down when.
Apparently the new ChatGPT model is obsessed with the immaculate conception of Mary
I mean, "shoggoth" is not that far off from biblically accurate angels... ;-)
Table of Contents
Previously this week: o3 Will Use Its Tools For You, o3 Is a Lying Liar, You Better Mechanize.

Language Models Offer Mundane Utility
Patrick McKenzie uses image generation to visualize the room his wife doesn’t want, in order to get her to figure out and explain what she does want so they can do it.

Altman is correct that current ChatGPT (and Gemini and Claude and so on) are rather great and vastly better than what we were first introduced to in December 2022, and the frog boiling has meant most people haven’t internalized the improvements.

Help rewrite your laws via AI-driven regulation, UAE edition?

Deedy recommends the Claude Code best practices guide so you can be a 10x AI software engineer. The skills to get the most out of AI coding are different from being the best non-AI coder. One recommendation he highlights is to use 3+ git checkouts in separate folders, put each in a distinct terminal and have each do different tasks. If you’re waiting on an AI, that’s a sign you’re getting it wrong. There’s also this thread of top picks, from Alex Albert.

Get an explanation of the idiom you just made up out of random words.

Get far better insights out of your genetic test.

Explain whatever your phone is looking at. This commonly gets into demos and advertisements. Patrick McKenzie reports frequently actually doing this. I don’t do it that often yet, but when I do it often directly solves a problem or gives me key info.

Debug your printer.

Solve an actually relevant-to-real-research complex mathematical problem.

You Offer the Models Mundane Utility
How much of your future audience is AIs rather than humans, interface edition. Not so fast. But the day is coming soon when you need to cater to both audiences.

I love the idea of a ‘copy the AI-optimized version of this to the clipboard’ button. This is easy for the AI to identify and use, and also for the human to identify and use. And yes, in many cases you can and should make it easy for them. This is also a clear case of ‘why not both?’ There’s no need to get rid of the cool website designed for humans. All the AI needs is that extra button.

Your Daily Briefing
A bunch of people are reporting that o3-compiled daily summaries are great. Still a few glitches to work out sometimes, including web search sometimes failing:

Language Models Don’t Offer Mundane Utility
One easy way to not get utility is not to know you can ask for it. It used to feel very important to know how to do relatively bespoke prompt engineering. Now the models are stronger, and mostly you can ‘just say what you want’ and it will work out fine for most casual purposes. That still requires people to realize they can do that. A lot of us have had that conversation where we had to explain this principle to someone 5 times in a row and they didn’t believe us.

Another way is being fooled into thinking you shouldn’t use it ‘because climate.’ Could AI end up doing things like venting the atmosphere or boiling the oceans once the intelligence explosion gets out of hand? Yes, but that is a very different issue. Concerns about water use from chatbot queries are Obvious Nonsense. Could this add up to something if you scale and are spending millions or billions to do queries? Sure, in that case these impacts are non-trivial. If a human is reading the output, the cost is epsilon (not technically zero, but very very close) and you can safely completely ignore it. Again, 150+ ChatGPT queries use as much water as one almond.

Here’s another concern I think is highly overrated, and that is not about AI. This is the same as the original Gell-Mann Amnesia. The problem with GMA while reading a newspaper, or talking to a human, is exactly the same. The danger comes when AI answers become social weapons, ‘well Grok said [X] so I’m right,’ but again that happens all the time with ‘well the New York Times said [X] so I’m right’ all the way down to ‘this guy on YouTube said [X].’ You have to calibrate.

And indeed, Mike is right to say thank you. The AI is giving you a much better answer than you could get on your own. No, it’s not perfect, but it was never going to be.

If You Want It Done Right
When AI is trying to duplicate exactly the thing that previously existed, Pete Koomen points out, it often ends up not being an improvement. The headline example is drafting an email. Why draft an email with AI when the email is shorter than the prompt? Why explain what you want to do if it would be easier to do it yourself? And why the hell won’t GMail let you change its de facto system prompt for drafting its email replies? A lot of the answer is Pete is asking GMail to draft the wrong emails. In his example:
- The email he wants is one line long, the prompt would be longer.
- He knows what he wants the email to say.
- He knows exactly how he wants to say it.
In cases where those aren’t true, the AI can be a lot more helpful. The central reason to have AI write email is to ‘perform class’ or ‘perform email.’ It needs the right tone, the right formalities, to send the right signals and so on. Or perhaps you simply need to turn a little text into a lot of text, perhaps including ‘because of reasons’ but said differently. Often people don’t know how to do this in a given context, either not knowing what type of class to perform or not knowing how to perform it credibly. Or even when they do know, it can be slow and painful to do – you’d much prefer to write out what you want to say and then have it, essentially, translated.

Luckily for Pete, he can write his boss a one-line simple statement. Not everyone is that lucky.

Pete wants a new email system designed to automate email, rather than jamming a little feature into the existing system. And that’s fair, but it’s a power user move, almost no one ever changes settings (even though Pete and I and probably you do all the time) so it makes sense that GMail isn’t rushing to enable this. If you want that, I haven’t tried it yet, but Shortwave exists.
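For the power users: here is a minimal sketch of the kind of thing Pete is asking for, a drafting flow where the user rather than the vendor owns the system prompt. It uses the OpenAI Python SDK as one stand-in for ‘an LLM API’; the system prompt text, model name and example inputs are illustrative assumptions, not anything GMail or Pete actually uses.

```python
# Minimal sketch: email drafting where the user, not the vendor, owns the system prompt.
# Assumptions: openai package installed, OPENAI_API_KEY set; model and prompt text are illustrative.
from openai import OpenAI

client = OpenAI()

# The user-editable "de facto system prompt" -- the piece the built-in feature doesn't expose.
MY_EMAIL_SYSTEM_PROMPT = """You draft replies as me.
Keep it to one or two sentences, plain and direct, no filler, no 'I hope this finds you well.'
Match the formality of the incoming email, but never exceed it."""

def draft_reply(incoming_email: str, what_i_want_to_say: str) -> str:
    """Turn a terse instruction into a reply written in my voice."""
    response = client.chat.completions.create(
        model="gpt-4.1-mini",  # illustrative model choice
        messages=[
            {"role": "system", "content": MY_EMAIL_SYSTEM_PROMPT},
            {"role": "user", "content": (
                f"Email received:\n{incoming_email}\n\n"
                f"What I want to say:\n{what_i_want_to_say}"
            )},
        ],
    )
    return response.choices[0].message.content

# The prompt is only worth typing when 'performing email' is the hard part:
# print(draft_reply("Can you present at Thursday's offsite?",
#                   "No, I'm out that week, offer to send slides instead."))
```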
No Free Lunch

Another classic way to not get utility from AI is being unwilling to pay for it. Unfortunately, we have all been trained by mobile to expect everything to have a free tier, and to mostly want things to be free. Then we sink massive amounts of time into ‘free’ things when a paid option would make our lives vastly better. Your social media and dating apps and mobile games being free is terrible for you.

I am so so happy that the AIs I use are entirely on subscription business models. One time payments (or payment for upgrades) would be even better.

What Is Good In Life?
So much to unpack here. A lot of it is a very important question: What is good in life? What is the reason we read a book, query an AI or otherwise seek out information? There’s also a lot of people grasping at straws to explain why AI wouldn’t be massively productivity enhancing for them. Bottleneck!

It’s definitely not 30 IQ points (yet!), and much more like enhancing productivity. If you can create a good version of that system, that’s pretty amazing for when you need a particular piece of information and can figure out what it is. One needs to differentiate between when you actually want specific knowledge, versus when you want general domain understanding and to invest in developing new areas and skills.

Even if you don’t, it is rather crazy to have your first thought when you need information be ‘read a book’ rather than ‘ask the AI.’ It is a massive hit to your functional intelligence and productivity. Reading a book is a damn inefficient way to extract particular information, and also generally (with key exceptions) a damn inefficient way to extract information in general. But there’s also more to life than efficiency.

The drinking coffee for 58 minutes (the new ‘my code’s compiling!’) is the issue here. Who is to say what he is actually doing, I’d be shocked if the real TFID spends that hour relaxing. The OP needs to circle back in 58 minutes because the program takes an hour to run. I bet this hour is usually spent checking email, setting up other AI queries, calling clients and so on. But if it’s spent relaxing, that seems great too. Isn’t that the ‘liberal urge’ too, to not have to work so hard? Obviously one can take it too far, but it seems rather obvious this is not one of those cases:

Also, for setting up and having that whole AI system. This would be quite the setup. But yes, fundamentally, this division of labor seems about right.

In Memory Of
So many more tokens of please and thank you, so much missing the part that matters? Well, yes, actually? Figuring out how to prompt humans is huge.

More generally, if you want Janus-level results of course script kiddies were always ngmi, but compared to script kiddies most people are ngmi. The script kiddies at least are trying to figure out how to get good outputs. And there’s huge upside in being able to replicate results, in having predictable outputs and reproducible experiments and procedures. The world runs on script kiddies, albeit under other names. We follow fixed procedures. A lot. Then, yes, some people can start doing increasingly skilled improv that breaks all those rules, but one thing at a time.

Even if that exact prompt won’t work the same way for you, because context is different, that prompt is often extremely insightful and useful to see. I probably won’t copy it directly but I totally want you to send prompt.

That is in addition to the problem that memory entangles all your encounters. Have a little faith in yourself, and how the AIs will interpret you. You can delete chats, in the cases where you decide they send the wrong message – I recently realized you can delete parts of your YouTube history the same way, and that’s been very freeing. I no longer worry about the implications of clicking on something, if I don’t like what I see happening I just go back and delete stuff.

However: Increasingly, you don’t know that any given AI chat, yours or otherwise, is ‘objective,’ unless it was done in ‘clean’ mode via incognito mode or the API. Nor does your answer predict what other people’s answers will be. It is a definite problem.

The Least Sincere Form of Flattery
Others are doing some combination of customizing their ChatGPTs in ways they did not have in mind, and not liking the defaults they don’t know how to overcome. I presume that if it talks to you like that you’re now supposed to reply to give it the feedback to stop doing that, on top of adjusting your custom instructions.

The Vibes are Off
When we invent a much faster and cheaper but lower quality option, the world is usually better off. However this is not a free lunch. The quality of the final product goes down, and the experience of the artisan gets worse as well.

How you relate to ‘vibe coding’ depends on which parts you’re vibing versus thinking about, and which parts you enjoy versus don’t enjoy. For me, I enjoy architecting, I think it’s great, but I don’t like figuring out how to technically write the code, and I hate debugging. So overall, especially while I don’t trust the AI to do the architecting (or I get to do it in parallel with the AI), this all feels like a good trade. But the part where I have to debug AI code that the AI is failing to debug? Even more infuriating.

The other question is whether skipping the intellectual exercises means you lose or fail to develop skills that remain relevant. I’m not sure. My guess is this is like most educational interactions with AI – you can learn if you want to learn, and you avoid learning if you want to avoid learning.

Another concern is that vibe coding limits you to existing vibes. This is true of many automation tools. They make a subclass of things go very fast, but not other things. So now anything you can piece together from that subclass is easy and fast and cheap, and other things remain hard and slow and expensive.

Here Let Me AI That For You
What changes when everyone has a Magic All-Knowing Answer Box? I think this is good, actually. I can mention concepts or use references, knowing that if someone doesn’t know what I’m talking about and feels the need to know, they can ask Claude. If it’s a non-standard thing and Claude doesn’t know, then you can ask. And also you can still ask anyway, it’s usually going to be fine. The other advantage of asking is that it helps calibrate and establish trust. I’m letting you know the limits of my knowledge here, rather than smiling and nodding.

I strongly encourage my own readers to use Ask Claude (or o3) when something is importantly confusing, or you think you’re missing a reference and are curious, or for any other purpose.

Flash Sale
Gemini Flash 2.5 exists in the Gemini App and in Google AI Studio. It’s probably great for its cost and speed. Within the Gemini family of models, I am inclined to believe the relative Arena scores. As in, looking at this chart, it suggests Gemini 2.5 Flash is roughly halfway between 2.5 Pro and 2.0 Flash. That is highly credible. You can set a limit to the ‘thinking budget’ from 0 to 24k tokens.
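For the API-inclined, here is a minimal sketch of the thinking-budget knob using the google-genai Python SDK. The exact model string and prompt are assumptions and may differ from what your installed version expects.

```python
# Minimal sketch: capping Gemini 2.5 Flash's 'thinking budget' via the google-genai SDK.
# Assumptions: google-genai installed, GEMINI_API_KEY set; model string may vary by release.
from google import genai
from google.genai import types

client = genai.Client()

response = client.models.generate_content(
    model="gemini-2.5-flash",
    contents="Summarize the tradeoff between speed and reasoning depth in two sentences.",
    config=types.GenerateContentConfig(
        # 0 disables thinking entirely; the cap described above is 24k tokens.
        thinking_config=types.ThinkingConfig(thinking_budget=1024),
    ),
)
print(response.text)
```

Huh, Upgrades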
Alex Lawsen reports that Gemini 2.5 has substantially upgraded NotebookLM podcasts, and recommends this prompt (which you can adapt for different topics): I don’t think I would ever want a podcast for this purpose, but at some point on the quality curve perhaps that changes.

OpenAI doubles rate limits for o3 and o4-mini-high on plus (the $20 plan).

Gemma 3 offers an optimized quantized version designed to run on a desktop GPU.

Grok now generates reports from uploaded CSV data if you say ‘generate a report.’ Grok also now gives the option to create workspaces, which are its version of projects.

OpenAI launches gpt-image-1 so that you can use image gen in the API.
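For reference, a minimal sketch of calling the new image model from the OpenAI Python SDK; the prompt, size and output filename are illustrative assumptions.

```python
# Minimal sketch: generating an image with gpt-image-1 via the OpenAI Python SDK.
# Assumptions: openai package installed, OPENAI_API_KEY set; prompt/size/filename are illustrative.
import base64
from openai import OpenAI

client = OpenAI()

result = client.images.generate(
    model="gpt-image-1",
    prompt="A pelican riding a bicycle, watercolor style",
    size="1024x1024",
)

# gpt-image-1 returns base64-encoded image data.
image_bytes = base64.b64decode(result.data[0].b64_json)
with open("pelican.png", "wb") as f:
    f.write(image_bytes)
```

On Your Marks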
There is an image geolocation eval, in which the major LLMs are well above human baseline. It would be cool if this was one of the things we got on new model releases.

Here’s a new Vending Machine Eval, sort of. It’s called Vending-Bench. You have $500 in virtual cash and 2000 messages, make as much money as possible stocking the vending machine; GPT-4o intercepts your emails to businesses and writes the replies.

Be The Best Like No LLM Ever Was
Gemini 2.5 continues to extend its lead in the Pokemon Eval, but there is a catch. Gemini is almost certainly going to win, and probably soon. This is close to the end. o3 thinks the cognitively hard parts are over, so there’s an 80% chance it goes all the way, almost always by 700 hours. I’d worry a little about whether it figures out that it needs to grind enough for the Elite Four given it probably has a lousy lineup, but it’s probably fine. I tried to check in on it, but the game status was failing for some reason.

This is different from playing Pokemon well. There was a proposed bet from Lumenspace about getting an LLM to win in human time, but their account got deleted, presumably before that got finalized. This is obviously possible if you are willing to give it sufficient pokemon-specific guidance, the question is if you can do it without ‘cheating’ in this way.

Which raises the question of whether Gemini is cheating. It kind of is? Then there are some key differences in what is made available, and it seems to me that Gemini has some rather important advantages. Joel Z, the unaffiliated-with-Google creator of Gemini Plays Pokemon, explicitly says that the harnesses are different enough that a direct comparison can’t be made, and that the difference is probably largely from the agent frameworks. Google employees, of course, are not letting that stop them. The lift key hint is an interesting special case, but I’ll allow it.

This is all in good fun, but to compare models they need to use the same agent framework. That’s the only way we get a useful benchmark.
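To make the ‘same agent framework’ point concrete, here is a minimal sketch of what holding the harness fixed looks like. Every name here is a hypothetical illustration, not the actual Gemini Plays Pokemon or Claude Plays Pokemon code.

```python
# Hypothetical sketch: a fixed harness where only the model varies, so scores are comparable.
# Nothing here is the real Gemini/Claude Plays Pokemon setup; names are illustrative.
from typing import Callable, Protocol

class Model(Protocol):
    def act(self, observation: str, tools: list[str]) -> str: ...

def run_benchmark(model: Model,
                  env_step: Callable[[str], tuple[str, bool]],
                  max_steps: int = 10_000) -> int:
    """Run one model inside the *same* loop, tool set, and observation format as every other model."""
    tools = ["press_button", "read_screen", "check_map"]  # identical tools for every model
    observation = "game_start"
    steps, done = 0, False
    while not done and steps < max_steps:
        action = model.act(observation, tools)   # the only model-specific line
        observation, done = env_step(action)     # identical environment transition for everyone
        steps += 1
    return steps  # lower is better, and now the numbers actually mean something
```

Choose Your Fighter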
Peter Wildeford gives his current guide to when to use which model. Like me, he’s made o3 his default. But it’s slow, expensive, untrustworthy, a terrible writer, not a great code writer, can only analyze so much text or video, and lacks emotional intelligence. So sometimes you want a different model. That all sounds correct. I wasn’t amazed. I would say I was impressed, but also it’s a lying liar.

Here’s a problem I didn’t anticipate. When Claude first got web search I was thrilled, and indeed I found it highly useful. A reasonably large percentage of my AI queries do require web search, as they depend on recent factual questions, or I need it to grab some source. I’ve yet to be tempted to turn it off.

o3 potentially changes that. o3 is much better at web search tasks than other models. If I’m going to search the web, and it’s not so trivial I’m going to use Google Search straight up, I’m going to use o3. But now that this is true, if I’m using Claude, the chances are much lower that the query requires web search. And if that’s true, maybe by default I do want to turn it off?

Deepfaketown and Botpocalypse Soon
And the faker is you? There is now a claimed (clearly bait and probably terrible or fake, link and name are intentionally not given) new AI app with the literal tagline ‘We want to cheat on everything.’ It’s supposed to be ‘completely undetectable AI assistance’ including for ‘every interview, from technical to behavioral,’ or passing quizzes, exams and tests. This is all text. It hides the window from screen sharing. It at least wants to be a heads-up display on smart glasses, not only a tab on a computer screen. Here is a both typical and appropriate response:

On the other hand this reminds me of ‘blatant lies are the best kind.’ The thing is, even if it’s not selling itself this way, and it takes longer to be good enough than you would expect, of course this is coming, and it isn’t obviously bad. We can’t be distracted by the framing. Having more information and living in an AR world is mostly a good thing most of the time, especially for tracking things like names and your calendar or offering translations and meanings and so on. It’s only when there’s some form of ‘test’ that it is obviously bad.

The questions are, what are you or we going to do about it, individually or collectively, and how much of this is acceptable in what forms? And are the people who don’t do this going to have to get used to contact lenses so no one suspects our glasses? I also think this is the answer to: You too can have eidetic memory, by being a human with an AI and an AR display.

Anthropic is on the lookout for malicious use, and reports on their efforts and selected examples from March. The full report is here. Overall, nothing too surprising. Alas, even if identified, there isn’t that much Anthropic on its own can do to shut such actors out even from frontier AI access, and there’s definitely nothing to stop them from using open models.

Fun With Media Generation
I’m with Strasky here. Marvel would rather not spend the $1.5 million to try and get some sort of special effects ‘moat.’ This seems very obviously good for creativity. If anything, the reason there might be serious trouble would be the temptation to make movies with too many special effects.

Also, frankly, this will get solved soon enough regardless but the original versions are better. I’m not saying $1.5 million better, but for now the replacements would make the movie noticeably worse. The AI version lacks character, as it were.

Fun With Media Selection
This is a great trick, but it’s not the trick I was hoping for. An automatic system to pull new material and sort by critical feedback is great. My note would be that for movies Metacritic and Letterboxd seem much better than Rotten Tomatoes and IMDb, but for TV shows Metacritic is much weaker and IMDb is a good pick.

The real trick is to personalize this beyond a genre. LLMs seem strong at this, all you should have to do is get the information into context or memory. With all chats in accessible memory this should be super doable if you’ve been tracking your preferences, or you can build it up over time. Indeed, you can probably ask o3 to tell you your preferences – you’d pay to know what you really think, and you can correct the parts where you’re wrong, or you want to ignore your own preferences.

Copyright Confrontation
Meta uses the classic Sorites (heap) paradox to argue that more than 7 million books have ‘no economic value.’ If anything that number is stunningly high. You’re telling me each book can give several basis points (hundredths of a percent) improvement? Do you know how many books there are? Clearly at least seven million. The alternative explanation is ‘0.06% means the measurements were noise’ and okay, sure, each individual book probably doesn’t noticeably improve performance, classic Sorites paradox.

The other arguments here seem to be ‘it would be annoying otherwise’ and ‘everyone is doing it.’ The annoyance claim, that you’d have to negotiate with all the authors and that this isn’t practical, is the actually valid argument. That’s why I favor a radio-style rule where permission is forced but so is compensation.

They Took Our Jobs
I always find it funny when the wonders of AI are centrally described by things like ‘running 24/7.’ That’s a relatively minor advantage, but it’s a concrete one that people can understand. But obviously if knowledge work can run 24/7, then even if no other changes that’s going to add a substantial bump to economic growth.

Hollis Robbins joins the o3-as-education-AGI train. She notes that there Ain’t No Rule about who teaches the lower level undergraduate required courses, and even if technically it’s a graduate student, who are we really kidding at this point? Especially if students use the ‘free transfer’ system to get cheap AI-taught classes at community college (since you get the same AI either way!) and then seamlessly transfer, as California permits students to do. Robbins points out you could set this up, make the lower half of coursework fully automated aside from some assessments, and reap the cost savings to solve their fiscal crisis. She is excited by this idea, calling it flagship innovation.

And yes, you could definitely do that soon, but is the point of universities to cause students to learn as efficiently as possible? Or is it something very different? Robbins is attempting to split the baby here. The first half of college is about basic skills, so we can go ahead and automate that, and then the second half is about something else, which has to be provided by places and people with prestige. Curious.

When one thinks ahead to the next step, once you break the glass and show that AI can handle the first half of the coursework, what happens to the second half? For how long would you be able to keep up the pretense of sacrificing all these years on the altar of things like ‘human mentors’ before we come for those top professors too? It’s not like most of them even want to be actually teaching in the first place.

A job the AI really should take, when will it happen? I do note it is not quite as easy as a better error rate, because this is adversarial. You need your errors to be importantly unpredictable. If there is a way I can predictably fool the AI, that’s a dealbreaker for relying fully on the AI even if by default the AI is a lot better. You would then need a mixed strategy.

Rather than thinking in jobs, one can think in types of tasks. Here is one intuition pump: Tobias is making an interesting claim. While o3 can do the ‘draw disparate sources’ thing, it still hasn’t been doing the ‘make new connections and discoveries’ thing in a way that provides clear examples – hence Dwarkesh Patel and others continuing to ask about why LLMs haven’t made those unique new connections and discoveries yet.

Abhi is using ‘always’ where he shouldn’t. The ‘best people’ eventually lose out too in the same way that they did in chess or as calculators. There’s a step in between, they hang on longer, and ‘be the best human’ still can have value – again, see chess – but not in terms of the direct utility of the outputs. What becomes valuable when AI gets increasingly capable is the ability to extract those capabilities from the AI, to know which outputs to extract and how to evaluate them, or to provide complements to AI outputs. Basic professional training for now can still be extremely valuable even if AI is now ‘doing your job for you,’ because that training lets you evaluate the AI outputs and know which ones to use for what task. One exciting twist is that this ‘basic professional training’ will also increasingly be available from the AI, or the AI will greatly accelerate such training.
I’ve found that to be true essentially everywhere, and especially in coding.

Get Involved
Elysian Labs is hiring for building Auren, if you apply do let them know I should get that sweet referral bonus.

Ace is the Place
Introducing Ace, a real time computer autopilot. For many applications, speed kills. It is still early, so I don’t have any feedback to report on whether it’s good or safe. They don’t explain here what precautions are being taken with an agent that is using your own keyboard and mouse and ‘all the tools on your computer.’ The future is going to involve things like this, but how badly do you want to go first?

Ace isn’t trying to solve the general case so much as they are trying to solve enough specific cases they can string together? They are using behavioral cloning, not reinforcement learning. For the reasons described this makes me feel better about the whole enterprise. If you want to create a computer autopilot, 99%+ of what we do on computers is variations on the same set of actions. So if you want to make something useful to help users save time, it makes sense to directly copy those actions. No, this doesn’t scale as far, but that’s fine here and in some ways it’s even a feature. I think of this less as ‘this is how we get computer AGI’ as ‘we don’t need computer AGI to build something highly useful.’ But Sherjil is claiming this can go quite a long way:

Also, Sherjil Ozair notes they do most of their hiring via Twitter DMs.
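For readers unfamiliar with the behavioral cloning versus reinforcement learning distinction mentioned above, here is a generic minimal sketch of behavioral cloning. It illustrates the technique in the abstract; the data shapes, action labels and model are invented for illustration and are not Ace’s actual architecture.

```python
# Generic behavioral cloning sketch: supervised learning on recorded (state, action) pairs.
# An illustration of the technique, not Ace's implementation; shapes and labels are made up.
import numpy as np
from sklearn.neural_network import MLPClassifier

# Pretend each state is a feature vector describing the screen/UI, and each action is a discrete
# choice like "click_send", "type_text", "open_tab" recorded from human demonstrations.
rng = np.random.default_rng(0)
states = rng.normal(size=(5_000, 64))          # 5k demonstration states, 64 features each
actions = rng.integers(0, 10, size=5_000)      # 10 possible recorded action ids

# Behavioral cloning = fit a policy that imitates the demonstrator; no reward signal needed.
policy = MLPClassifier(hidden_layer_sizes=(128, 128), max_iter=50)
policy.fit(states, actions)

# At run time, the "autopilot" just predicts what the human would have done in this state.
new_state = rng.normal(size=(1, 64))
print("imitated action id:", policy.predict(new_state)[0])
```

In Other AI News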
Gemini Pro 2.5 Model Card Watch continues. It is an increasingly egregious failure for them not to have published this. Stop pretending that labeling this ‘experimental’ means you don’t have to do this.

Show Me the Money
Goodfire raises a $50 million series A for interpretability, shows a preview of its ‘universal neural programming platform’ Ember.

OpenAI in talks to buy Windsurf. There are many obvious synergies. The data is valuable, the vertical integration brings various efficiencies, and it’s smart to want to incorporate a proper IDE into OpenAI’s subscription offerings, combine memory and customization across modalities and so on.

Do you think this is going to be how all of this works? Do you want it to be? Bonus question: Do you think that is democracy? That it will do what the users want?

The Mask Comes Off
A new private letter to the two key Attorneys General urges them to take steps to prevent OpenAI from converting to a for-profit, as it would wipe out the nonprofit’s charitable purpose. That purpose requires the nonprofit retain control of OpenAI. The letter argues convincingly against allowing the conversion.

Quiet Speculations
Dwarkesh Patel asks 6k words worth of mostly excellent questions about AI, here’s a Twitter thread. Recommended. I’m left with the same phenomenon I imagine my readers are faced with: There’s too many different ways one could respond and threads one could explore and it seems so overwhelming most people don’t respond at all. A worthy response would be many times the length of the original – it’s all questions, it’s super dense. Also important are what questions are missing. So I won’t write a full response directly. Instead, I’ll be drawing from it elsewhere and going forward.

Things are escalating quickly. Yet Sam Altman continues to try and spin that Nothing Ever Happens. That sounds nice but does not make any sense. What would generate that outcome? The top AI companies are on different release cycles.

One needs to not overreact: It doesn’t even require that the new release actually be a superior model. Don’t forget that couple of days when Grok was super impressive and everyone was saying how fast it was improving, or the panic over Manus, or how everyone massively overreacted to DeepSeek’s r1. As always, not knocking it, DeepSeek cooked and r1 was great, a reaction was warranted, but r1 was still behind the curve and the narratives around it got completely warped and largely still are, in ways that would be completely different if we’d understood better or if r1 had happened two weeks later.

An argument that GPT-4.5 was exactly what you would expect from scaling laws, but GPT-4.5’s post training wasn’t as good as other models, so its performance is neither surprising nor a knock on further scaling. We have found more profit for now on other margins, but that will change, and then scaling will come back.

Helen Toner lays out the case for a broad audience that the key unknown in near term AI progress is reward specification. What areas beyond math and coding will allow automatically graded answers? How much can we use performance in automatically graded areas to get spillover into other areas?
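For readers who want the ‘expected from scaling laws’ claim above made concrete, here is a minimal sketch of the Chinchilla-style loss curve. The functional form and fitted constants are the published Hoffmann et al. (2022) values; applying them to any particular model such as GPT-4.5 is purely illustrative, since its actual parameter and token counts are not public.

```python
# Chinchilla-style scaling law sketch: L(N, D) = E + A / N^alpha + B / D^beta.
# Constants are the published Hoffmann et al. (2022) fit; the example N and D values are made up.
def chinchilla_loss(n_params: float, n_tokens: float) -> float:
    E, A, B = 1.69, 406.4, 410.7
    alpha, beta = 0.34, 0.28
    return E + A / n_params**alpha + B / n_tokens**beta

# A 10x scale-up in both parameters and data buys a predictable, modest drop in loss.
for n, d in [(7e10, 1.4e12), (7e11, 1.4e13)]:
    print(f"N={n:.0e}, D={d:.0e} -> predicted loss {chinchilla_loss(n, d):.3f}")
```

The point being: if you know the inputs, the output loss is roughly what the curve says it should be, and the rest is post-training.

Is This AGI?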
By the OpenAI definition we very clearly do not have AGI, even if we include only work on computers. It seems rather silly to claim otherwise. You can see how we might get there relatively soon. I can see how Near’s statement could be made (minus several 9s) about 2025-2030 as a range, and at least be a reasonable claim. But wow, to have your range include two years in the past seems rather nutty, even for relatively loose definitions.

Ethan Mollick calls o3 and Gemini 2.5 Pro ‘jagged AGI’ in that they have enough superhuman capabilities to result in real changes to how we work and live. By that definition, what about good old Google search? We basically agree on the practical question of what models at this level will do on their own, that they will change a ton of things but it will take time to diffuse.

The Quest for Sane Regulations
BIS (the Bureau of Industry and Security) is vital to enforcing the export controls, and funding it is such a no-brainer that even AI czar David Sacks supports it while advocating for slashing every other bureaucracy.

Given the stakes are everyone dies or humanity loses control, international cooperation seems like something we should be aiming for.

Cooperation is Highly Useful
Miles Brundage and Grace Werner make the case that America First Meets Safety First, that the two are fully compatible and Trump is well positioned to make a deal (he loves deals!) with China to work together on catastrophic risks and avoid the threat of destabilizing the situation, while both sides continue to engage in robust competition. They compare this to the extensive cooperation between the USA and USSR during the Cold War to guard against escalation, and point out that China could push much harder on AI than it is currently pushing, if we back them into a corner. They present a multi-polar AI world as inevitable, so we’ll need to coordinate to deal with the risks involved, and ensure our commitments are verifiable, and that we’re much better off negotiating this now while we have the advantage.

We also have a paper investigating where this cooperation is most practical: they and many others have a new paper on which areas of technical AI safety allow for geopolitical rivals to cooperate.

Nvidia Chooses Bold Strategy
Nvidia continues to play what look like adversarial games against the American government. They at best are complying with the exact letter of what they are legally forced to do, and they are flaunting this position, while also probably turning a blind eye to smuggling. It’s possible Nvidia is reading everyone involved into this behind the scenes, they understand and it’s the right move, but man does it not look like a good idea.

How America Loses
America’s government is working hard to alienate its allies and former allies, and making them question whether we might try to leverage their use of American technology. It is not surprising that those nations want to stop relying on American technology, along with everything else. If we want our top companies to succeed, they need markets where they will be allowed and encouraged to sell their goods, and where customers want to buy them. Driving away the world’s second largest market? Not a smart move.

Security Is Capability
Many have long had this fever dream that Anthropic will lose because of their ‘obsession with safety.’ This seems exactly backwards. Only Anthropic is investing anything like an appropriate amount into its alignment and safety work, and it shows, including in the form of in many ways a superior user experience. There is a faction of people who hate the idea of responsible AI development or worrying about AI risks on principle, and that therefore wants to tell a story where Anthropic are the bad guys, doomed to fail, or both. It’s so absurd, and a big tell.

And indeed, the big problem with Claude 3.7 in particular (but not 3.5 or 3.6) is not that it is ‘too safe’ or ‘too aligned’ or something, but rather the opposite! Early on, there was some reason for concern about Anthropic playing ‘too safe,’ as Claude was refusing too often out of an (over) abundance of caution on various fronts, and they had at least soft promises not to push the frontier. Anthropic got over all that. I can’t remember the last time Claude gave me a stupid refusal. Lots of people use Claude Sonnet for coding, and they almost never complain about refusals, I instead hear about 3.7 doing things it shouldn’t.

The ‘alignment tax’ has on the corporate margin proven reliably negative. Yes, you pay such a ‘tax’ when you in particular want to impose restrictions on content type, but this enables deployment and use in many contexts. That’s not where most of the difficult work lies, which is in ensuring the model does what you want it to do, in the ways you’d want to do it. That’s a place everyone is dramatically underinvesting, with the possible exception of Anthropic. Reliability is super valuable.

The biggest sufferer seems to be Google. Google has very strong brand concerns about what it needs to not allow, but it hasn’t done enough investment on how to ‘gracefully’ deal with those restrictions. That creates a frustrating and jarring experience in many ways, that has greatly impacted use rates for Gemini.

The Week in Audio
I once again go on The Cognitive Revolution.

Odd Lots on what really happened with the CHIPS act.

DeepMind CEO Demis Hassabis on 60 Minutes. This is of interest to see how 60 Minutes is reacting to and how Demis is presenting AI, although this is an example of how an interview ‘doesn’t capture any alpha’ in the sense that there’s no question here that Demis hasn’t been asked before. The main demo is of Project Astra. All the AI demonstrations here are very easy things like identifying paintings, but if it’s all new to you then difficulty and impressiveness won’t correlate much. The second half is Demis saying actual things.

Demis gives 5-10 year timeline to AGI, 10 years to ‘curing all disease,’ expects useful robots soon and mentions that Google is working on a Gemini-based agent like Operator. At the end, Demis mentions loss of control risks, and they discuss that the race to AGI ‘might become a race to the bottom for safety’ and the need for international coordination, and touch on other big questions. Demis Hassabis continues to be by far the lab leader publicly talking the most responsibly and well about AI, in sharp contrast to Sam Altman, Dario Amodei, Elon Musk or Mark Zuckerberg. This is a solid 9/10, facing up to the reality of the situation.

Don’t take this the wrong way, Yann, but I think LLMs were done with you first.

AI 2027
The AI Futures Project has a new blog, with contributions by Daniel Kokotajlo and Scott Alexander. Self-recommending.

The most recent post is Why America Wins in AI 2027. It comes down to compute, America will have the vast majority of the compute, China can’t catch up in time to be relevant to 2027 and energy won’t yet be a limiting factor. Talent probably favors America as well, but even if that changed over the medium term, it wouldn’t ultimately much matter because AI is becoming the talent. This framework emphasizes how important the export controls on chips are.

Scott Alexander also has an excellent roundup post of responses to and engagements with AI 2027, and responses to many critical objections not addressed in Why America Wins.

Here’s all the podcasts, I previously only covered Dwarkesh and Win Win:
- NYT’s Hard Fork (Daniel)
- Glenn Beck (Daniel)
- Win Win (Daniel)
- Control AI interview (Eli)
- Dwarkesh Patel (Daniel and Scott)
- Lawfare (Daniel and Eli)
The post also links to a number of YouTube videos and text responses. This is what proper engagement with critics looks like. More of this, everyone.

Rhetorical Innovation
Andrew Critch continues to argue against terms like ‘solve alignment,’ ‘the alignment problem’ and ‘aligned AI,’ saying they are importantly misleading and ‘ready to be replaced by clearer discourse.’ He favors instead speaking of ‘aligned with whom,’ and that you can ‘solve the alignment problem’ and still end up with failure because you chose the wrong target.

I get where this is coming from. There is indeed a big problem where people think ‘oh if we ‘solve alignment’ then we win’ whereas no, that is only part of the problem, exactly because of the question of ‘aligned to what or whom’ and the resulting interplay of different systems. And I certainly do worry about alignment thus being a potential ‘2-place’ word, and thus dismissing the second part of the problem, where we have to choose the [X] to align the AI to from among different perspectives and preferences.

However, I worry more that the tendency is instead to dismiss the first half of the problem, which is how to cause the AIs to be aligned to [X], for our choice of [X]. This includes not knowing how to formally specify a plausible target, but also not knowing how to get there. The default is to assume the real fight and difficulties will be over choosing between different preferences for [X] and who gets to choose [X]. Alas, while that fight is real too, I believe this to be very false. I think that we don’t know how to align an AI to [X], for any plausible good choice for [X], sufficiently well to survive under strain, in addition to not knowing good enough long-term candidates for [X].

And I think that if we make a push to say that ‘solve alignment’ isn’t the right target that people will interpret this as either ‘alignment is solved or will be easily solved,’ which I think is a false and very harmful takeaway, or that we don’t need to figure out how to align a model to a given [X], which would likely be even worse.

Thus I favor keeping ‘solve alignment’ as centrally meaning ‘be able to get the model to do what you want,’ the ‘alignment problem’ being how to do that, and by default an ‘aligned AI’ being AI that was successfully aligned to where we want it aligned in context, despite the dangers of confusion here. But I do think that when saying ‘aligned AI’ we should specify ‘aligned to what’ if that is otherwise ambiguous (e.g. ‘The Netflix recommendation AI is aligned to Netflix short term KPIs, but misaligned to Netflix users and also to Netflix.’)

A simple explanation of one aspect (far from all) of a big problem:

A confusing-to-me post by several people including Joel Leibo and Seb Krier suggests moving ‘beyond alignment’ into a ‘patchwork quilt of human coexistence.’ Thinking about it more and reading the comments only makes it more confusing. How is this ‘patchwork quilt of human coexistence’ going to be how the world works when there are lots of things more capable than humans running around, especially with highly flexible value sets and goals? The proposed world has no steering or coordination mechanisms, and its AIs get aligned to a variety of local goals many of which are either malicious or ill-considered or myopic or selfish or all of those. You might get a ‘patchwork quilt of AI coexistence’ if you’re very lucky, but the humans are not going to be making any of the decisions for long. You can’t move ‘beyond alignment’ in this way until you first solve alignment, as in have the ability to get an AI to do what you want.
And even if you did get the ability to then move ‘beyond’ alignment in this way, the solution of diffusion with highly adjustable-to-local-requests values is a recipe for not-so-gradual gradual disempowerment, not a patchwork quilt of human coexistence.

They also call for ‘discarding the axiom of rational convergence’ but I think that’s a conflation of ‘world models and beliefs converge at the limit,’ which is what we mean by this and is obviously true, and ‘values converge at the limit,’ which is disputed but which most of us agree is not true, and that I strongly think is not true.

Aligning a Smarter Than Human Intelligence is Difficult
You need to ensure your models are aligned during internal deployment, and eventually also during training, not only after external deployment.

Here are two lukewarm takes I mostly agree with: The techniques that we know about that reduce deception in current models get less effective as the models move towards being superhuman. So I’m not that excited to push on them hard, as this would largely hide the bigger problems and fool people into thinking the problems were easy. But we do want non-deceptive models now.

Apollo offers the new report “AI Behind Closed Doors: a Primer on The Governance of Internal Deployment.”

What is the relationship between work to ensure humans stay alive and in control over the future (AI notkilleveryoneism, or AI safety), versus concerns about AI welfare? How can we protect humanity and our own goals in the futures ahead, while also navigating the possibility that some AIs may become moral patients? One attitude is that these goals are in direct conflict. Robert Long (good morning, sir!) points out this need not be true. A lot of the key work advances both goals.

Understand seems clearly helpful for both goals. We can’t watch out for AI welfare unless we understand them. Align also seems clearly helpful for both goals. If we fail to align AIs and the welfare of AIs matters, the outcome seems awful for them in all scenarios, as the AIs are likely instantiated in an essentially random fashion with random preferences and goals. That’s unlikely to be all that optimal for them. This isn’t a zero sum game. Cooperate then becomes an obvious third thing to do, once it becomes a sensible thing to consider, if we can figure out how to do that wisely.

Most importantly, to succeed at both causes, we need to think ahead to the places where there could be conflict. Otherwise we’ll be forced to choose, and might end up choosing neither. One real worry is that AI welfare concerns end up stopping us from taking the necessary steps to preserve human survival or control over the future, potentially without those concerns having much link to actual AI welfare. Another real worry is that if we don’t solve how to stay alive and in control in a nice way, we’ll choose to do it in a not-nice way.

I am skeptical of AI welfare concerns, but if that turns out to be wrong, we really don’t want to be wrong about it. The most important reason to work on AI welfare is to figure out whether, when and how we should (or should not) be concerned about AI welfare. That includes the possibility that we need to be very concerned, and that those concerns might turn out to be incompatible with avoiding humanity’s demise. If that’s true, it’s vitally important that we figure that out now. And then, perhaps, consider the option to coordinate to not build those AIs in the first place.

Misalignment in the Wild
They are recruiting Research Engineers and Research Scientists for their Societal Impacts team.

This makes me want to be able to turn these knobs. It also makes me wonder how well we could explicitly turn those knobs from the prompt. What happens if I end my request with ‘Emphasize the value of reputation management’? I presume this has too many side effects and is thus too noisy? But you can’t run the experiment properly by only looking at the output.

Judging from the blog post, the obvious caution is that we cannot assume the value expressions we measure have that much to do with what the system is actually outputting, or with the values that output reflects, on multiple levels. It’s cool information, but I’d want blind spot checks by humans to see if the outputs reflect the values the AI is claiming to be expressing and caring about, ideally for each of the expressed values, and then also a wider check to see if an AI can reverse engineer them as well.

Concentration of Power and Lack of Transparency
If you give humans too much ability to steer the future, they might use it. If you don’t give humans enough ability to steer the future, they can’t use it. If we can’t vest our ability to coordinate to steer the future in our democratic institutions, where do we vest it instead? If it exists, it has to exist somewhere, and any system of humans can be hijacked either by some of those humans or by AI.

A lot of people are so worried about concentration of power, or human abuse of power, that they are effectively calling for anarchism, for humans to not have the power to steer the future at all. Calling for full diffusion of top ASI capabilities, or for cutting off the ability of governments to steer the future, is effectively calling for the (actually rather rapid) gradual disempowerment of humans, likely followed by their end, until such time as someone comes up with a plan to avoid this. I have yet to see such a plan that has an above-epsilon (nontrivial) chance of working.

A lot of those same people are simultaneously so worried about government in particular that they support AI labs being permitted to develop superintelligence, AIs more capable than humans, entirely in secret. They don’t think AI labs should be disclosing what they are doing and keeping the public in the loop at all, including but not limited to the safety precautions being taken or not taken.

I strongly believe that even if you think all the risk is in too much concentration of power, not calling for strong transparency is a large mistake. If you don’t trust the government here, call for that transparency to extend to the public as well. The only reason to not call for strong transparency is if you want some combination of the future ASIs and the labs that create them to be the ones that steer the future. One can certainly make the case that those labs have the expertise, and that alternative options would definitely mess this up, so trusting the labs is the best bet, but that is the case you are effectively making if you go down this road.

Property Rights are Not a Long Term Plan
There’s no ‘but’ here. The key insight is that property rights are always and everywhere unstable. They are frequently disrupted, and even where they mostly hold you need to be able to defend your property against various forms of taxation and predation, that respond to the nature of the wealth around them. Wealth that is insufficiently justified and defended will be lost, even if unspent. When there is a politically disempowered group with wealth, that wealth tends not to last for that many generations. This does not bode well for any future where the human plan is ‘own capital.’

It Is Risen
Okay, even by 2025 standards this is excellent trolling. As the person sharing this said, ‘fast takeoff confirmed.’

The Lighter Side
Not only is it real, it is from July 2024. So maybe slow takeoff?