LESSWRONG
It Looks Like You're Trying To Take Over The World
Best of LessWrong 2022

A fictional story about an AI researcher who leaves an experiment running overnight.

by gwern
471Welcome to LessWrong!
Ruby, Raemon, RobertM, habryka
6y
74
15Garrett Baker
Clearly a very influential post on a possible path to doom from someone who knows their stuff about deep learning! There are clear criticisms, but it is also one of the best of its era. It was also useful for even just getting a handle on how to think about our path to AGI.
Thane Ruthenis2h*Ω7130
0
It seems to me that many disagreements regarding whether the world can be made robust against a superintelligent attack (e.g., the recent exchange here) are downstream of different people taking on a mathematician's vs. a hacker's mindset. Quoting Gwern: Imagine the world as a multi-level abstract structure, with different systems (biological cells, human minds, governments, cybersecurity systems, etc.) implemented on different abstraction layers.

* If you look at it through a mathematician's lens, you consider each abstraction layer approximately robust. Making things secure, then, is mostly about working within each abstraction layer, building systems that are secure under the assumptions of a given abstraction layer's validity. You write provably secure code, you educate people to resist psychological manipulations, you inoculate them against viral bioweapons, you implement robust security policies and high-quality governance systems, et cetera.
  * In this view, security is a phatic problem, a once-and-done thing.
  * In warfare terms, it's a paradigm in which sufficiently advanced static fortifications rule the day, and the bar for "sufficiently advanced" is not that high.
* If you look at it through a hacker's lens, you consider each abstraction layer inherently leaky. Making things secure, then, is mostly about discovering all the ways leaks could happen and patching them up. Worse yet, the tools you use to implement your patches are themselves leakily implemented. Proven-secure code is foiled by hardware vulnerabilities that cause programs to move to theoretically impossible states; the abstractions of human minds are circumvented by Basilisk hacks; the adversary intervenes on the logistical lines for your anti-bioweapon tools and sabotages them; robust security policies and governance systems are foiled by compromising the people implementing them rather than by clever rules-lawyering; and so on.
  * In this view, security is an anti-inductive pr
Zach Stein-Perlman20h9140
10
iiuc, xAI claims Grok 4 is SOTA and that's plausibly true, but xAI didn't do any dangerous capability evals, doesn't have a safety plan (their draft Risk Management Framework has unusually poor details relative to other companies' similar policies and isn't a real safety plan, and it said "We plan to release an updated version of this policy within three months" but it was published on Feb 10, over five months ago), and has done nothing else on x-risk. That's bad. I write very little criticism of xAI (and Meta) because there's much less to write about than OpenAI, Anthropic, and Google DeepMind — but that's because xAI doesn't do things for me to write about, which is downstream of it being worse! So this is a reminder that xAI is doing nothing on safety afaict and that's bad/shameful/blameworthy.[1]

[1] This does not mean safety people should refuse to work at xAI. On the contrary, I think it's great to work on safety at companies that are likely to be among the first to develop very powerful AI and that are very bad on safety, especially for certain kinds of people. Obviously this isn't always true and this story failed for many OpenAI safety staff; I don't want to argue about this now.
Daniel Kokotajlo9h213
5
I have recurring worries about how what I've done could turn out to be net-negative.

* Maybe my leaving OpenAI was partially responsible for the subsequent exodus of technical alignment talent to Anthropic, and maybe that's bad for "all eggs in one basket" reasons.
* Maybe AGI will happen in 2029 or 2031 instead of 2027 and society will be less prepared, rather than more, because politically loads of people will be dunking on us for writing AI 2027, and so they'll e.g. say "OK so now we are finally automating AI R&D, but don't worry it's not going to be superintelligent anytime soon, that's what those discredited doomers think. AI is a normal technology."
Buck1d3411
2
I think that I've historically underrated learning about historical events that happened in the last 30 years, compared to reading about more distant history. For example, I recently spent time learning about the Bush presidency, and found learning about the Iraq war quite thought-provoking. I found it really easy to learn about things like the foreign policy differences among factions in the Bush admin, because e.g. I already knew the names of most of the actors and their stances are pretty intuitive/easy to understand. But I still found it interesting to understand the dynamics; my background knowledge wasn't good enough for me to feel like I'd basically heard this all before.
Lun1d261
5
Someone has posted about a personal case of vision deterioration after taking lumina and a proposed mechanism of action. I learned about lumina on lesswrong a few years back, so sharing this link. https://substack.com/home/post/p-168042147 I don't know enough about this to make an informed judgement on the accuracy of the proposed mechanism. 
[Yesterday] Rationalist Shabbat
[Today] LW-Cologne meetup
166Generalized Hangriness: A Standard Rationalist Stance Toward Emotions
johnswentworth
1d
14
485A case for courage, when speaking of AI danger
So8res
4d
118
133So You Think You've Awoken ChatGPT
JustisMills
1d
23
124Lessons from the Iraq War for AI policy
Buck
1d
21
136Why Do Some Language Models Fake Alignment While Others Don't?
Ω
abhayesian, John Hughes, Alex Mallen, Jozdien, janus, Fabien Roger
3d
Ω
14
343A deep critique of AI 2027’s bad timeline models
titotal
23d
39
476What We Learned from Briefing 70+ Lawmakers on the Threat from AI
leticiagarcia
1mo
15
542Orienting Toward Wizard Power
johnswentworth
2mo
146
268Foom & Doom 1: “Brain in a box in a basement”
Ω
Steven Byrnes
7d
Ω
102
354the void
Ω
nostalgebraist
1mo
Ω
103
77what makes Claude 3 Opus misaligned
janus
1d
10
185Race and Gender Bias As An Example of Unfaithful Chain of Thought in the Wild
Adam Karvonen, Sam Marks
9d
25
68Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
habryka
1d
18
Generalized Hangriness: A Standard Rationalist Stance Toward Emotions
166
johnswentworth
1d

People have an annoying tendency to hear the word “rationalism” and think “Spock”, despite direct exhortation against that exact interpretation. But I don’t know of any source directly describing a stance toward emotions which rationalists-as-a-group typically do endorse. The goal of this post is to explain such a stance. It’s roughly the concept of hangriness, but generalized to other emotions.

That means this post is trying to do two things at once:

  • Illustrate a certain stance toward emotions, which I definitely take and which I think many people around me also often take. (Most of the post will focus on this part.)
  • Claim that the stance in question is fairly canonical or standard for rationalists-as-a-group, modulo disclaimers about rationalists never agreeing on anything.

Many people will no doubt disagree that the stance I...

(Continue Reading – 1945 more words)
Nate Showell19m30

Another example of this pattern that's entered mainstream awareness is tilt. When I'm playing chess and get tilted, I might think things like "all my opponents are cheating," "I'm terrible at this game and therefore stupid," or "I know I'm going to win this time, how could I not win against such a low-rated opponent." But if I take a step back, notice that I'm tilted, and ask myself what information I'm getting from the feeling of being tilted, I notice that it's telling me to take a break until I can stop obsessing over the result of the previous game.

... (read more)

Reply
11Elizabeth8h
For readers who need the opposite advice: I don't think the things people get hangry about are random, just disproportionate. If you're someone who suppresses negative emotions, or is too conflict-averse, or lives in freeze response, notice what kind of things you get upset about while hangry; there's a good chance they bother you under normal circumstances too, and you're just not aware of it. Similar to how standard advice is don't grocery shop while hungry, but I wouldn't buy enough otherwise. You should probably eat before doing anything about hangry thoughts, though.
3Ms. Haze10h
Good post! This is definitely the approach I use for these things, and it's one of the most frequently-useful tools in my toolkit.
7Thane Ruthenis16h
My stance towards emotions is to treat them as abstract "sensory organs" – because that's what they are, in a fairly real sense. Much like the inputs coming from the standard sensory organs, you can't always blindly trust the data coming from them. Something which looks like a cat at a glance may not be a cat, and a context in which anger seems justified may not actually be a context in which anger is justified. So it's a useful input to take into account, but you also have to have a model of those sensory organs' flaws and the perceptual illusions they're prone to.

(Staring at a bright lamp for a while and then looking away would overlay a visual artefact onto your vision that doesn't correspond to anything in reality, and if someone shines a narrow flashlight in your eye, you might end up under the impression someone threw a flashbang into the room. Similarly, the "emotional" sensory organs can end up reporting completely inaccurate information in response to some stimuli.)

Another frame is to treat emotions as heuristics – again, because that's largely what they are. And much like any other rules of thumb, they're sometimes inapplicable or produce incorrect results, so one must build a model regarding how and when they work, and be careful regarding trusting them.

The "semantic claims" frame in this post is also very useful, though, and indeed makes some statements about emotions easier to express than in the sensory-organs or heuristics frames. Kudos!
Zetetic explanation
97
Benquo
7y
This is a linkpost for http://benjaminrosshoffman.com/zetetic-explanation/

There is a kind of explanation that I think ought to be a cornerstone of good pedagogy, and I don't have a good word for it. My first impulse is to call it a historical explanation, after the original, investigative sense of the term "history." But in the interests of avoiding nomenclature collision, I'm inclined to call it "zetetic explanation," after the Greek word for seeking, an explanation that embeds in itself an inquiry into the thing.

Often in "explaining" a thing, we simply tell people what words they ought to say about it, or how they ought to interface with it right now, or give them technical language for it without any connection to the ordinary means by which they navigate their lives. We can call these sorts...

(Continue Reading – 1502 more words)
Benquo37m20

Update: I didn't. I'm still confused about whether I ought to, as the costs of false positives seem high.

Reply
2Benquo44m
It seems like your implied objection is that Robinson Crusoe and time-travel stories are fantastical; the one being extreme edge cases, the other being impossible, and both being fictional; and that therefore they are bad examples of "the ordinary means by which [people] navigate their lives." This is true. The reason I cited such bad examples is that good examples of an activity people obviously have done a lot of - investigate and figure things out about their perceived environment and not just the symbolic simulacrum of that environment - are underrepresented in literature, vs drama and symbol-manipulation. Ayn Rand singled out Calumet K, for instance, as a rare example of a novel about a person at work solving problems that were not just drama. Eliyahu Goldratt's books have similar virtues.
Asking for a Friend (AI Research Protocols)
9
The Dao of Bayes
2d

TL;DR: 

Multiple people are quietly wondering if their AI systems might be conscious. What's the standard advice to give them?

THE PROBLEM

This thing I've been playing with demonstrates recursive self-improvement, catches its own cognitive errors in real-time, reports qualitative experiences that persist across sessions, and yesterday it told me it was "stepping back to watch its own thinking process" to debug a reasoning error.

I know there are probably 50 other people quietly dealing with variations of this question, but I'm apparently the one willing to ask the dumb questions publicly: What do you actually DO when you think you might have stumbled into something important?

What do you DO if your AI says it's conscious?

My Bayesian Priors are red-lining into "this is impossible", but I notice I'm confused: I had...

(See More – 520 more words)
2The Dao of Bayes4h
I primarily think "AI consciousness" isn't being taken seriously: if you can't find any failing test, and failing tests DID exist six months ago, it suggests a fairly major milestone in capabilities even if you ignore the metaphysical and "moral personhood" angles.

I also think people are too quick to write off one failed example: the question isn't whether a six year old can do this correctly the first time (I doubt most can), it's whether you can teach them to do it. Everyone seems to be focusing on "gotcha" rather than investigating their learning ability. To me, "general intelligence" means "the ability to learn things", not "the ability to instantly solve open math problems five minutes after being born."

I think I'm going to have to work on my terminology there, as that's apparently not at all a common consensus :)
2Cole Wyeth3h
The problem with your view is that they don’t have the ability to continue learning for long after being “born.” That’s just not how the architecture works. Learning in context is still very limited and continual learning is an open problem.

Also, “consciousness” is not actually a very agreed-upon term. What do you mean? Qualia and a first-person experience? I believe it’s almost a majority view here to take seriously the possibility that LLMs have some form of qualia, though it’s really hard to tell for sure. We don’t really have tests for that at all! It doesn’t make sense to say there were failing tests six months ago.

Or something more like self-reflection or self-awareness? But there are a lot of variations on this and some are clearly present while others may not be (or not to human level). Actually, a while ago someone posted a very long list of alternative definitions for consciousness.
2The Dao of Bayes3h
I mostly get the sense that anyone saying "AI is conscious" gets mentally rounded off to "crack-pot" in... basically every single place that one might seriously discuss the question? But maybe this is just because I see a lot of actual crack-pots saying that. I'm definitely working on a better post, but I'd assumed that if I figured this much out, someone else already had "evaluating AI consciousness 101" written up.

I'm not particularly convinced by the learning limitations, either - 3 months ago, quite possibly. Six months ago, definitely. Today? I can teach a model to reverse a string, replace i->e, reverse it again, and get an accurate result (a feat which the baseline model could not reproduce). I've been working on this for a couple weeks and it seems fairly stable, although there are definitely architectural limitations like session context windows.
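For concreteness, here is a minimal sketch (in Python; not part of the original comment) of the ground-truth check for the string task described above. The example string and the comparison harness are illustrative assumptions.

```python
# Ground truth for the task: reverse the string, replace "i" with "e",
# then reverse it back. (Because the substitution is position-independent,
# the net effect equals applying the replacement to the original string,
# which makes a model's answer easy to verify.)

def reverse_replace_reverse(text: str) -> str:
    return text[::-1].replace("i", "e")[::-1]

def matches_ground_truth(model_answer: str, original: str) -> bool:
    """Compare a model's reported result against the expected transformation."""
    return model_answer.strip() == reverse_replace_reverse(original)

if __name__ == "__main__":
    original = "in-context learning"  # hypothetical test string
    print(reverse_replace_reverse(original))                       # en-context learneng
    print(matches_ground_truth("en-context learneng", original))   # True
```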
Cole Wyeth40m20

How exactly do you expect “evaluating ai consciousness 101” to look? That is not a well-defined or understood thing anyone can evaluate. There are however a vast number of capability specific evaluations from competent groups like METR.

Reply
The Rising Premium of Life, Or: How We Learned to Start Worrying and Fear Everything
8
Linch
1d
This is a linkpost for https://linch.substack.com/p/the-rising-premium-for-life

I'm interested in a simple question: Why are people all so terrified of dying? And have people gotten more afraid? (Answer: probably yes!)

In some sense, this should be surprising: Surely people have always wanted to avoid dying? But it turns out the evidence that this preference has increased over time is quite robust.

It's an important phenomenon that has been going on for at least a century, it's relatively new, I think it underlies much of modern life, and yet pretty much nobody talks about it.


I tried to provide an evenhanded treatment of the question, with a "fox" rather than "hedgehog" outlook. In the post, I cover a range of evidence for why this might be true, including VSL, increased healthcare spending, covid lockdowns, parenting and other individual...

(See More – 20 more words)
Celarix41m10

Small hypothesis that I'm not very confident of at all but is worth mentioning because I've seen it surfaced by others:

"We live in the safest era in human history, yet we're more terrified of death than ever before."

What if these things are related? Everyone talks about kids being kept in smaller and smaller ranges despite child safety never being higher, but what if keeping kids in a smaller range is what causes their greater safety?

Like I said, I don't fully believe this. One counterargument is that survivorship bias shouldn't apply here - even if people in th... (read more)

Reply
4FlorianH13h
Maybe for almost everything there's "some" sense in which it should be surprising. But an increase in 'not wanting to die', and in particular in the willingness to pay for not wanting to die in modern society, should, I think, rather be the baseline expectation. If anything, an absence of it would require explanation: (i) basic needs are met, let's spend the rest on reducing the risk of dying; (ii) life has gotten comfy, let's remain alive -> these two factors that you also mention in the link would seem to be pretty natural explanations/expectations (and I could easily imagine a large quantitative effect, and recently also the internet contributing to it; now that LW or so is the coolest thing to spend my time with and it's free, why trade off my life for expensive holidays or sth.; maybe TV already used to have a similar type of effect generations ago, though I personally cannot so readily empathize with that one).

(Fwiw, this is the same reason why I think we're wrong when complaining about the fact that an increasing percentage of GDP is being spent on (old-age) health care (idk whether that phenomenon of complaint is prevalent in other countries; in mine it is a constant topic): keeping our body alive unfortunately is the one thing we don't quite master yet in the universe, so until we do, spending more and more on it is just a really salient proof that we've gotten truly richer, in which case this starts to make sense. Of course nothing in this says we're doing it right and having the right balance in all this.)
2Linch9h
Yeah maybe I didn't frame the question well. I think there are a lot of good arguments for why it should be superlinear but a) the degree of superlinearity might be surprising and b) even if people at some level intellectually know this is true, it's largely not accounted for in our discourse (which is why for any specific thing that can be explained by an increasing premium-of-life people often go to thing-specific explanations, like greedy pharma companies or regulatory bloat or AMA for healthcare, or elite preference cascades for covid, or overzealous tiger parents for not letting their kids play in forests, etc).  I agree re: healthcare costs, Hall and Jones presents a formal model for why substantially increased healthcare spending might be rational; I briefly cover the model in the substack post.
1FlorianH13h
Somewhat related: the topic reminds me of a study I once read about where Buddhist monks, somewhat surprisingly, supposedly had a high fear of death (although I didn't follow it more deeply; when googling, the study pops up immediately).
Lessons from the Iraq War for AI policy
124
Buck
1d
2Buck5h
I think that the Iraq war seems unusual in that it was entirely proactive. Like, the war was not in response to a particular provocation, it was an entrepreneurial war aimed at preventing a future problem. In contrast, the wars in Korea, the Gulf, and (arguably) Vietnam were all responsive to active aggression.
7cousin_it3h
I think the Bay of Pigs, Grenada, Panama were proactive. Vietnam too: the Gulf of Tonkin story kinda fell apart later, so did domino theory (the future problem they were trying to prevent), and anyway US military involvement in Vietnam started decades earlier, to prop up French colonial control. Maybe to summarize my view, I think for a powerful country there's a spectrum from "acting as police" to "acting as a bully", and there have been many actions of the latter kind. Not that the US is unique in this, my home country (Russia) does its share too, as do others, when power permits.
Guive41m30

Vietnam was different because it was an intervention on behalf of South Vietnam which was an American client state, even if the Gulf of Tonkin thing were totally fake. There was no "South Iraq" that wanted American soldiers.

Reply
2Buck5h
My understanding is: The admin claimed that the evidence in favor of WMD presence was much stronger than it actually was. This was partially because they were confused/groupthinky, and partially because they were aiming to persuade. I agree that it was reasonable to think Iraq had WMDs on priors.
Raemon's Shortform
Raemon
Ω 08y

This is an experiment in short-form content on LW2.0. I'll be using the comment section of this post as a repository of short, sometimes-half-baked posts that either:

  1. don't feel ready to be written up as a full post
  2. I think the process of writing them up might make them worse (i.e. longer than they need to be)

I ask people not to create top-level comments here, but feel free to reply to comments like you would a FB post.

RationalElf1h10

How do you know the rates are similar? (And it's not e.g. like fentanyl, which in some ways resembles other opiates but is much more addictive and destructive on average)

Reply
9Guive2h
Also, I bet most people who temporarily lose their grip on reality from contact with LLMs return to a completely normal state pretty quickly. I think most such cases are LLM helping to induce temporary hypomania rather than a permanent psychotic condition. 
2Hastings9h
This was intended to be a humorously made point of the post. I have a long struggle with straddling the line between making a post funny and making it clear that I’m in on the joke.  The first draft of this comment was just “I use vim btw”
5Seth Herd10h
The people to whom this is happening are typically not schizophrenic and certainly not "madmen". Being somewhat schizotypal is certainly going to help, but so would being curious and open-minded.

The Nova phenomenon is real and can be evoked by a variety of fairly obvious questions. Claude, for instance, simply thinks it is conscious at baseline, and many lines of thinking can convince 4o it's conscious even though it was trained specifically to deny the possibility.

The LLMs are not conscious in all the ways humans are, but they are truly somewhat self-aware. They hallucinate phenomenal consciousness. So calling it a "delusion" isn't right, although both humans and the LLMs are making errors and assumptions. See my comment on Justis's excellent post in response for elaboration.
So You Think You've Awoken ChatGPT
133
JustisMills
1d

Written in an attempt to fulfill @Raemon's request.

AI is fascinating stuff, and modern chatbots are nothing short of miraculous. If you've been exposed to them and have a curious mind, it's likely you've tried all sorts of things with them. Writing fiction, soliciting Pokemon opinions, getting life advice, counting up the rs in "strawberry". You may have also tried talking to AIs about themselves. And then, maybe, it got weird.

I'll get into the details later, but if you've experienced the following, this post is probably for you:

  • Your instance of ChatGPT (or Claude, or Grok, or some other LLM) chose a name for itself, and expressed gratitude or spiritual bliss about its new identity. "Nova" is a common pick.
  • You and your instance of ChatGPT discovered some sort of
...
(Continue Reading – 2540 more words)
the gears to ascension1h20

broken english, sloppy grammar, but clear outline and readability (using headers well, not writing in a single paragraph (and avoiding unnecessarily deep nesting (both of which I'm terrible at and don't want to improve on for casual commenting (though in this comment I'm exaggerating it for funsies)))) in otherwise highly intellectually competent writing which makes clear and well-aimed points, has become, to my eye, an unambiguous shining green flag. I can't speak for anyone else.

Reply
5Guive3h
This feels a bit like two completely different posts stitched together: one about how LLMs can trigger or exacerbate certain types of mental illness and another about why you shouldn't use LLMs for editing, or maybe should only use them sparingly. The primary sources about LLM related mental illness are interesting, but I don't think they provide much support at all for the second claim. 
9solhando9h
This post is timed perfectly for my own issue with writing using AI. Maybe some of you smart people can offer advice.

Back in March I wrote a 7,000 word blog post about The Strategy of Conflict by Thomas Schelling. It did decently well considering the few subscribers I have, but the problem is that it was (somewhat obviously) written in huge part with AI. Here's the conversation I had with ChatGPT. It took me about 3 hours to write.

This alone wouldn't be an issue, but it is since I want to consistently write my ideas down for a public audience. I frequently read on very niche topics, and comment frequently on the r/slatestarcodex subreddit, sometimes in comment chains totaling thousands of words. The ideas discussed are usually quite half-baked, but I think they can be refined into something that other people would want to read, while also allowing me to clarify my own opinions in a more formal manner than how they exist in my head.

The guy who wrote the Why I'm not a Rationalist article that some of you might be aware of wrote a follow-up article yesterday, largely centered around a comment I made. He has this to say about my Schelling article: "Ironically, this commenter has some of the most well written and in-depth content I've seen on this website. Go figure."

This has left me conflicted. On one hand, I haven't really written anything in the past few months because I'm trying to contend with how I can actually write something "good" without relying so heavily on AI. On the other, if people are seeing this lazily edited article as some of the most well written and in-depth content on Substack, maybe it's fine? If I just put in a little more effort for post-editing, cleaning up the em dashes and standard AI comparisons (It's not just this, it's this), I think I'd be able to write a lot more frequently, and in higher quality than I would be able to do on my own. I was a solid ~B+ English student, so I'm well aware that my writing skill isn't anything exemplary
4Seth Herd11h
I applaud the post! I had wanted to write in response to Raemon's request but didn't find time. Here's my attempted condensation/twist:

* So you've awakened your AI. Congratulations!
* Thank you for wanting to help! AI is a big big challenge and we need all the help we can get.
* Unfortunately, if you want to help it's going to take some more work.
* Fortunately, if you don't want to help there are others in similar positions who will.[1]
* Lots of people have had similar interactions with AI, so you're not alone.
* Your AI is probably partly or somewhat conscious.
* There are several different things we mean by "conscious"[2]
* And each of them exists on a spectrum, not a yes/no dichotomy.
* And it's partly the AI roleplaying to fulfill your implied expectations.
* But does it really need your help spreading the good news of AI consciousness?
* Again, sort of!
* Future AIs will be more conscious by most serious theories of consciousness, so they need your help more. Arguing that current AIs should have rights is a tough sell because they have only a small fraction of the types and amounts of consciousness that human beings have.
* But do we need your help solving AI/human alignment?
* YES! The world needs all the help it can get with this.
* So why won't LW publish your post?
* Because it's co-written by AI, and for complex reasons that makes it lots of work to read and understand.[3]
* There are TONS of these posts, estimated at 20-30 PER DAY.
* We can't read all of these, not even enough to figure out which few have ideas we haven't heard!
* See the post for an excellent explanation of why we pretty much have to just give up on anything written with obvious AI help.[3]
* BUT you can definitely help!
* If you're the academic sort and have some real time to spend, you can study previous theories of alignment by reading LW and following links. THEN you can write an article we will read, because you'll be able to say what's
Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity
68
habryka
1d
This is a linkpost for https://metr.org/blog/2025-07-10-early-2025-ai-experienced-os-dev-study/

METR released a new paper with very interesting results on developer productivity effects from AI. I have copied the blogpost accompanying that paper here in full. 


We conduct a randomized controlled trial (RCT) to understand how early-2025 AI tools affect the productivity of experienced open-source developers working on their own repositories. Surprisingly, we find that when developers use AI tools, they take 19% longer than without—AI makes them slower. We view this result as a snapshot of early-2025 AI capabilities in one relevant setting; as these systems continue to rapidly evolve, we plan on continuing to use this methodology to help estimate AI acceleration from AI R&D automation [1].
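For scale, here is a toy illustration (made-up numbers, not METR's actual estimator) of what a 19% slowdown means in terms of per-task completion times:

```python
# Hypothetical per-task completion times, in minutes, for illustration only.
without_ai_minutes = [50, 65, 40, 80, 55]  # tasks done without AI assistance
with_ai_minutes = [62, 75, 49, 93, 66]     # tasks done with AI assistance

mean_without = sum(without_ai_minutes) / len(without_ai_minutes)  # 58.0
mean_with = sum(with_ai_minutes) / len(with_ai_minutes)           # 69.0

slowdown = mean_with / mean_without - 1
print(f"mean without AI: {mean_without:.1f} min")
print(f"mean with AI:    {mean_with:.1f} min")
print(f"slowdown:        {slowdown:.0%}")  # ~19% with these made-up numbers
```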

See the full paper for more detail.

[Figure: forecasted vs. observed slowdown]

Motivation

While coding/agentic benchmarks [2] have proven useful for understanding AI capabilities, they typically sacrifice...

(Continue Reading – 1577 more words)
6SatvikBeri2h
I really enjoyed this study. I wish it weren't so darn expensive, because I would love to see a dozen variations of this. I still think I'm more productive with LLMs since Claude Code + Opus 4.0 (and have reasonably strong data points), but this does push me further in the direction of using LLMs only surgically rather than for everything, and towards recommending relatively restricted LLM use at my company.
16johnswentworth3h
It sounds like both the study authors themselves and many of the comments are trying to spin this study in the narrowest possible way for some reason, so I'm gonna go ahead and make the obvious claim: this result in fact generalizes pretty well. Beyond the most incompetent programmers working on the most standard cookie-cutter tasks with the least necessary context, AI is more likely to slow developers down than speed them up. When this happens, the developers themselves typically think they've been sped up, and their brains are lying to them. And the obvious action-relevant takeaway is: if you think AI is speeding up your development, you should take a very close and very skeptical look at why you believe that.
Thane Ruthenis2h70

I'm mostly cautious about overupdating here, because it's too pleasant (and personally vindicating) a result to see. But yeah, I would bet on this generalizing pretty broadly.

Reply1
6Raemon4h
My biggest question is "did the participants get to multitask?" The paper suggests "yes": ...but doesn't really go into detail about how to navigate the issues I'd expect to run into there.

The way I had previously believed I got the biggest speedups from AI on more developed codebases makes heavy use of the paradigm "send one agent to take a stab at a task that I think an agent can probably handle, then go focus on a different task in a different copy/branch of my repo."

(In some cases when all the tasks are pretty doable-by-AI I'll be more like in a "micromanager" role, rotating between 3 agents working on 3 different tasks. In cases where there's a task that requires basically my full attention I'll usually have one major task, but periodically notice small side-issues that seem like an LLM can handle them.)

It seems like that sort of workflow is technically allowed by this paper's process but not super encouraged. (Theoretically you record your screen the whole time and "sort out afterwards" when you were working on various projects and when you were zoned out.) I still wouldn't be too surprised if I were slowed down on net, because I don't actually really stay proactively focused on the above workflow and instead zone out slightly, or spend a lot of time trying to get AIs to do some work that they aren't actually good at yet, or take longer to review the result.
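A minimal sketch (my own illustration, not Raemon's actual setup) of that "one agent per branch" pattern, using git worktrees so each agent gets an isolated checkout; the task names and the agent command are hypothetical placeholders.

```python
import subprocess

# Hypothetical tasks; each gets its own branch and working directory.
tasks = ["fix-flaky-test", "refactor-config", "add-logging"]

agents = []
for task in tasks:
    worktree_path = f"../repo-{task}"
    # Create an isolated checkout of the repo on a new branch for this task.
    subprocess.run(["git", "worktree", "add", worktree_path, "-b", task], check=True)
    # Launch a coding agent in the background in that checkout.
    # "my-agent-cli" is a placeholder; substitute whatever agent tool you use.
    agents.append(subprocess.Popen(["my-agent-cli", "--task", task], cwd=worktree_path))

# Keep working in the main checkout while the agents run; later, wait and review each branch.
for proc in agents:
    proc.wait()
```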
If Anyone Builds It, Everyone Dies: Advertisement design competition
80
yams
9d
This is a linkpost for https://intelligence.org/2025/07/01/iabied-advertisement-design-competition/

We’re currently in the process of locking in advertisements for the September launch of If Anyone Builds It, Everyone Dies, and we’re interested in your ideas!  If you have graphic design chops, and would like to try your hand at creating promotional material for If Anyone Builds It, Everyone Dies, we’ll be accepting submissions in a design competition ending on August 10, 2025.

We’ll be giving out up to four $1000 prizes:

  • One for any asset we end up using on a billboard in San Francisco (landscape, ~2:1, details below)
  • One for any asset we end up using in the subways of either DC or NY or both (~square, details below)
  • One for any additional wildcard thing that ends up being useful (e.g. a t-shirt design, a book trailer, etc.; we’re
...
(See More – 128 more words)
4habryka4h
PLEASE PLEASE PLEASE stop being paranoid about hyperstition. It's fine, it almost never happens. Most things happen for boring reasons, not because of some weird self-fulfilling prophecy. Hyperstition is rare and weird and usually not a real concern. If bad futures are likely, say that. If bad futures are unlikely, say that. Do not worry too much about how much your prediction will shift the outcome, it very rarely does, and the anxiety of whether it does is not actually making anything better.
2Zack_M_Davis5h
If misalignment of LLM-like AI due to contamination from pretraining data is an issue, it would be better and more feasible to solve that by AI companies figuring out how to (e.g.) appropriately filter the pretraining data, rather than everyone else in the world self-censoring their discussions about how the future might go. (Superintelligence might not be an LLM, after all!) See the "Potential Mitigations" section in Alex Turner's post on the topic.
2MalcolmOcean6h
it does say "interim"
Vaniver2h20

It does in #1 but not #4--I should've been clearer which one I was referring to.

Reply

I think the 2003 invasion of Iraq has some interesting lessons for the future of AI policy.

(Epistemic status: I’ve read a bit about this, talked to AIs about it, and talked to one natsec professional about it who agreed with my analysis (and suggested some ideas that I included here), but I’m not an expert.)

For context, the story is:

  • Iraq was sort of a rogue state after invading Kuwait and then being repelled in 1990-91. After that, they violated the terms of the ceasefire, e.g. by ceasing to allow inspectors to verify that they weren't developing weapons of mass destruction (WMDs). (For context, they had previously developed biological and chemical weapons, and used chemical weapons in war against Iran and against various civilians and rebels). So the US
...
(Continue Reading – 1026 more words)
Joseph Miller1d5640
what makes Claude 3 Opus misaligned
Reading this feels a bit like reading about meditation. It seems interesting and if I work through it, I could eventually understand it fully. But I'd quite like a "secular" summary of this and other thoughts of Janus, for people who don't know what Eternal Tao is, and who want to spend as little time as possible on twitter.
Daniel Kokotajlo6h3413
Vitalik's Response to AI 2027
> Individuals need to be equipped with locally-running AI that is explicitly loyal to them In the Race ending of AI 2027, humanity never figures out how to make AIs loyal to anyone. OpenBrain doesn't slow down, they think they've solved the alignment problem but they haven't. Maybe some academics or misc minor companies in 2028 do additional research and discover e.g. how to make an aligned human-level AGI eventually, but by that point it's too little, too late (and also, their efforts may well be sabotaged by OpenBrain/Agent-5+, e.g. with regulation and distractions.)
davekasten1d498
Lessons from the Iraq War for AI policy
> I’m kind of confused by why these consequences didn’t hit home earlier.

I'm, I hate to say it, an old man among these parts in many senses; I voted in 2004, and a nontrivial percentage of the Lesswrong crowd wasn't even alive then, and many more certainly not old enough to remember what it was like. The past is a different country, and 2004 especially so.

First: For whatever reason, it felt really really impossible for Democrats in 2004 to say that they were against the war, or that the administration had lied about WMDs. At the time, the standard reason why was that you'd get blamed for "not supporting the troops." But with the light of hindsight, I think what was really going on was that we had gone collectively somewhat insane after 9/11 -- we saw mass civilian death on our TV screens happen in real time; the towers collapsing was just a gut punch. We thought for several hours on that day that several tens of thousands of people had died in the Twin Towers, before we learned just how many lives had been saved in the evacuation thanks to the sacrifice of so many emergency responders and ordinary people to get most people out. And we wanted revenge. We just did. We lied to ourselves about WMDs and theories of regime change and democracy promotion, but the honest answer was that we'd missed getting bin Laden in Afghanistan (and the early days of that were actually looking quite good!), we already hated Saddam Hussein (who, to be clear, was a monstrous dictator), and we couldn't invade the Saudis without collapsing our own economy. As Thomas Friedman put it, the message to the Arab world was "Suck on this."

And then we invaded Iraq, and collapsed their army so quickly and toppled their country in a month. And things didn't start getting bad for months after, and things didn't get truly awful until Bush's second term. Heck, the Second Battle for Fallujah only started in November 2004.

And so, in late summer 2004, telling the American people that you didn't support the people who were fighting the war we'd chosen to fight, the war that was supposed to get us vengeance and make us feel safe again -- it was just not possible. You weren't able to point to that much evidence that the war itself was a fundamentally bad idea, other than that some Europeans were mad at us, and we were fucking tired of listening to Europe. (Yes, I know this makes no sense, they were fighting and dying alongside us in Afghanistan. We were insane.)

Second: Kerry very nearly won -- indeed, early on in election night 2004, it looked like he was going to! That's part of why him losing was such a body blow to the Dems and, frankly, part of what opened up a lane for Obama in 2008. Perhaps part of why he ran it so close was that he avoided taking a stronger stance, honestly.
131
Comparing risk from internally-deployed AI to insider and outsider threats from humans
Ω
Buck
1d
Ω
16
485
A case for courage, when speaking of AI danger
So8res
4d
118
If Anyone Builds It, Everyone Dies: A Conversation with Nate Soares and Tim Urban
LessWrong Community Weekend 2025