LESSWRONG
If Anyone Builds It, Everyone Dies

Nate and Eliezer have written a book making a detailed case for the risks from AI – in the hopes that it's not too late to change course. You can buy the book now in print, eBook, or audiobook form, and read through the two books' worth of additional content in the online resources for the book.


Popular Comments

Jonathan Claybrough · 2d
How To Dress To Improve Your Epistemics
With no shade to John in particular, since this applies to many insular LessWrong topics, I just wanna state that this gives me a feeling of the blind leading the blind. I could believe someone behaves worse in the world after reading this, mostly because it would push them further into the same overwrought see-everything-through-status frame. I think that's particularly the case here because clothing and status are especially complex: they benefit from a wider diversity of frames to think about them in, and they require diverse experiences and feedback from many types of communities to generalize well (or to realize just how narrow every "rule" is!).

I'm not saying John has bad social skills, or that this doesn't contain true observations, or that someone starting from zero wouldn't become better thanks to this, nor that John shouldn't write it. But I do think this is centrally the kind of article one should consider "reverse all advice you read" for, and I would like to see more community pushback and articles providing more diverse frames on this. I'm confident I could sensibly elaborate on what's missing or wrong, but in the absence of the motivation to, I'll just let this comment stand as an agree/disagree rod for the statement: "We have no clear reason to believe the author is actually good at social skills in diverse environments; they are writing in a too-confident, insufficiently caveated tone about too complicated a topic without acknowledging that, and are potentially misleading, or short-term net negative, to at least a fifth of LessWrong readers who are already on the worse side of social skills."
Vaniver · 2d
More Was Possible: A Review of IABIED
> If Anyone Builds It could have been an explanation for why the MIRI worldview is still relevant nearly two decades later, in a world where we know so much more about AI. Instead, the authors spend all their time shadowboxing against opponents they've been bored of for decades, and fail to make their own case in the process.

Hm. I'm torn between thinking this is a sensible criticism and thinking that this is missing the point.

In my view, the core MIRI complaint about 'gradualist' approaches is that they are concrete solutions to abstract problems. When someone has misdiagnosed the problem, their solutions will almost certainly not work, and the question is just where they've swept the difficulty under the rug. Knowing so much more about AI as an engineering challenge while having made no progress on alignment-the-abstraction--well, the relevance of the MIRI worldview is obvious. "It's hard, and if you think it's easy you're making a mistake." People attempting to solve AI alignment seem overly optimistic about their chances of solving it, in a way consonant with them not understanding the problem they're trying to solve, and not consonant with them having a solution that they've simply failed to explain to us.

The book does talk about examples of this, and though you might not like the examples (see, for example, Buck's complaint that the book responds to the safety sketches of prominent figures like Musk and LeCun instead of the most thoughtful versions of those plans), I think it's not obvious that they're the wrong ones to be talking about. Musk is directing much more funding than Ryan Greenblatt is.

The arguments for why recent changes in AI have alignment implications have, I think, mostly failed. You may recall how excited people were about an advanced AI paradigm that didn't involve RL. Of course, top-of-the-line LLMs are now trained in part using RL, because--obviously they would be? It was always cope to think they wouldn't be? I think the version of this book that was written two years ago, and so spent a chapter on oracle AI because that would have been timely, would have been worse than the book that tried to be timeless and focused on the easy calls.

But the core issue from the point of view of the New York Times or the man on the street is not "well, which LessWrong poster is right about how accurately we can estimate the danger threshold, and how convincing our control schema will be as we approach it?". It's that the man on the street thinks things that are already happening are decades away, and even if they believed what the 'optimists' believe, they would probably want to shut it all down. It's like when the virologists were talking amongst themselves about the reasonable debate over whether or not to do gain-of-function research, and the rest of society looked in for a moment and said "what? Make diseases deadlier? Are you insane?".
leogao · 2h
Safety researchers should take a public stance
I've been repeatedly loud and explicit about this, but I am happy to state again that racing to build superintelligence before we know how to make it not kill everyone (or cause other catastrophic outcomes) seems really bad, and I wish we could coordinate to not do that.
[Today] Cambridge, UK – ACX Meetups Everywhere Fall 2025
[Today] Lisboa – ACX Meetups Everywhere Fall 2025
[Tomorrow] ACX Meetup: Fall 2025
AISafety.com Reading Group session 327
484 · Welcome to LessWrong! · Ruby, Raemon, RobertM, habryka · 6y · 75 comments
467 · The Rise of Parasitic AI · Adele Lopez · 20h · 116 comments
130 · Obligated to Respond · Duncan Sabien (Inactive) · 5d · 65 comments
39 · Meetup Month · Raemon · 3d · 4 comments
467 · The Rise of Parasitic AI · Adele Lopez · 20h · 116 comments
150 · Safety researchers should take a public stance · Mateusz Bagiński, Ishual · 22h · 6 comments
234 · The Company Man · Tomás B. · 3d · 7 comments
196 · I enjoyed most of IABIED · Buck · 4d · 42 comments
129 · Teaching My Toddler To Read · maia · 2d · 9 comments
124 · You can't eval GPT5 anymore · Lukas Petersson · 2d · 8 comments
468 · How Does A Blind Model See The Earth? · henry · 1mo · 38 comments
137 · Christian homeschoolers in the year 3000 · Buck · 3d · 36 comments
350 · AI Induced Psychosis: A shallow investigation · Tim Hua · 13d · 43 comments
53 · The title is reasonable · Raemon · 8h · 13 comments
52 · Rewriting The Courage to be Disliked · Chris Lakin · 15h · 2 comments
106 · Stress Testing Deliberative Alignment for Anti-Scheming Training · Mikita Balesni, Bronson Schoen, Marius Hobbhahn, Axel Højmark, AlexMeinke, Teun van der Weij, Jérémy Scheurer, Felix Hofstätter, Nicholas Goldowsky-Dill, rusheb, Andrei Matveiakin, jenny, alex.lloyd · 3d · 2 comments
Quick Takes
Katalina Hernandez · 5h
The problem with most proposals for an "AGI ban" is that they define the target by outcome (e.g. "powerful AI with the potential to cause human extinction"). I know that even defining AGI is already problematic. But unless we specify and prohibit the actual thing we want to ban, we'll leave exploitable gray areas wide open. And those loopholes will undermine the very purpose of the ban. This is why I wrote The Problem with Defining an "AGI Ban" by Outcome (a lawyer's take).

My post argues that for an AGI ban to work, it needs what other existential-risk regimes already have: strict liability, bright lines, and enforceable thresholds. Nuclear treaties don't ban "weapons that could end humanity"; they ban fissile quantities and test yields. Product liability doesn't wait for intent; it attaches liability to defects outright. To actually ban AGI, or AI leading to human extinction, we need to ban the precursors that make extinction-plausible systems possible, not "the possibility of extinction" itself.

In nuclear treaties, the ban is tied to bright-line thresholds (8 kg of plutonium, 25 kg of highly enriched uranium, a zero-yield test ban, or delivery systems above 500 kg/300 km). These are crisp, measurable precursors that enable verification and enforcement. So, what are the AGI equivalents of those thresholds? Until we can define capability-based precursors (functional "red lines" that make extinction-plausible systems possible), any AGI ban will remain rhetoric rather than enforceable law.

I don't claim to have the answers. My aim is to form the right questions that AI governance discussions should be putting in front of lawyers and policymakers. The whole purpose of this post is to stress-test those assumptions. I'd be genuinely grateful for pushback, alternative framings, or constructive debate.
Max Harms · 1d
IABI says: "Transistors, a basic building block of all computers, can switch on and off billions of times per second; unusually fast neurons, by contrast, spike only a hundred times per second. Even if it took 1,000 transistor operations to do the work of a single neural spike, and even if artificial intelligence was limited to modern hardware, that implies human-quality thinking could be emulated 10,000 times faster on a machine— to say nothing of what an AI could do with improved algorithms and improved hardware."

@EigenGender says "aahhhhh this is not how any of this works" and calls it an "egregious error". Another poster says it's "utterly false." (Relevant online resources text.) (Potentially relevant LessWrong post.)

I am confused about what the issue is, and it would be awesome if someone could explain it to me. Where I'm coming from, for context:

* We don't know exactly what the relevant logical operations in the human brain are. The model of the brain that says there are binary spiking neurons with direct synapse->dendrite connections, and that those connections are akin to floating-point numerical weights, is clearly a simplification, albeit a powerful one. (IIUC "neural nets" in computers discard the binary spikes and suggest another model where the spike rate is akin to a numerical value, which is the basic story behind "neuron activation" in a modern system. This simplification also seems powerful, though it is surely an oversimplification in some ways.)
* My main issue with the source text is that it ignores what is possibly the greater bottleneck in processing speed, which is the time it takes to move information from one area to another. (If my model is right, one of the big advantages of an MoE architecture is to reduce how much weights get thrashed across the bus to and from the GPU, which can be a major bottleneck.) However, on this front I think nerves are still clearly inferior to wires? Even myelinated neurons have a typical spe…
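For what it's worth, the arithmetic the quoted passage is doing seems to be just a ratio of operation rates. Here is a minimal sketch using only the figures named in the quote; the variable names are my own labels, not anything from the book or its online resources:

```python
# Back-of-the-envelope version of the quoted passage's arithmetic (illustrative only).
transistor_switches_per_sec = 1e9   # "billions of times per second"
neuron_spikes_per_sec = 100         # "unusually fast neurons ... spike only a hundred times per second"
transistor_ops_per_spike = 1000     # "even if it took 1,000 transistor operations ..."

# Spike-equivalents per second the hardware could emulate under these assumptions:
silicon_spike_equivalents_per_sec = transistor_switches_per_sec / transistor_ops_per_spike  # 1e6

speedup = silicon_spike_equivalents_per_sec / neuron_spikes_per_sec
print(speedup)  # 10000.0, i.e. the "10,000 times faster" figure
```

Whether that ratio is the right quantity to compute at all seems to be exactly what the objections above dispute.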
Cleo Nardo · 2h
The Case against Mixed Deployment

The most likely way that things go very bad is conflict between AIs-who-care-more-about-humans and AIs-who-care-less-about-humans, wherein the latter pessimize the former. There are game-theoretic models which predict this may happen, and the history of human conflict shows that these predictions bear out even when the agents are ordinary human-level intelligences who can't read each other's source code. My best guess is that the acausal dynamics between superintelligences shake out well. But the causal dynamics between ordinary human-level AIs probably shake out badly. This is my best case against mixed deployment.
Cleo Nardo · 2d
Prosaic AI Safety research, in pre-crunch time.

Some people share a cluster of ideas that I think is broadly correct. I want to write down these ideas explicitly so people can push back.

1. The experiments we are running today are kinda 'bullshit'[1] because the thing we actually care about doesn't exist yet, i.e. ASL-4, or AI powerful enough that it could cause catastrophe if we were careless about deployment.
2. The experiments in pre-crunch-time use pretty bad proxies.
3. 90% of the "actual" work will occur in early-crunch-time, which is the duration between (i) training the first ASL-4 model and (ii) internally deploying the model.
4. In early-crunch-time, safety-researcher-hours will be an incredibly scarce resource.
   1. The cost of delaying internal deployment will be very high: a billion dollars of revenue per day, competitive winner-takes-all race dynamics, etc.
   2. There might be far fewer safety researchers in the lab than there currently are in the whole community.
5. Because safety-researcher-hours will be such a scarce resource, it's worth spending months in pre-crunch-time to save ourselves days (or even hours) in early-crunch-time.
6. Therefore, even though the pre-crunch-time experiments aren't very informative, it still makes sense to run them because they will slightly speed us up in early-crunch-time.
7. They will speed us up via:
   1. Rough qualitative takeaways like "Let's try technique A before technique B because in Jones et al. technique A was better than technique B." However, the exact numbers in the Results table of Jones et al. are not informative beyond that.
   2. The tooling we used to run Jones et al. can be reused for early-crunch-time, c.f. Inspect and TransformerLens.
   3. The community discovers who is well-suited to which kind of role, e.g. Jones is good at large-scale unsupervised mech interp, and Smith is good at red-teaming control protocols.

Sometimes I use the analogy that we're shooting with rubbe…
Thomas Kwa · 2d
US Government dysfunction and runaway political polarization bingo card. I don't expect any particular one of these to happen, but it seems plausible that at least one of them will:

* A sanctuary city conducts armed patrols to oppose ICE raids, or the National Guard refuses a direct order from the president en masse
* Internal migration is de facto restricted for US citizens or green card holders
* For debt ceiling reasons, the US significantly defaults on its debt, stops Social Security payments, grounds flights, or issues a trillion-dollar coin
* US declares a neutral humanitarian NGO like the WHO a foreign terrorist organization
* A major news network (e.g. CNN) or social media site other than Tiktok (e.g. Facebook) loses licenses for ideological reasons
* A Democratic or Republican candidate for president, governor, or Congress is kept off a state ballot
* A major elected official or Cabinet member takes office while incarcerated
* Election issues on the scale of 1876, where Congress can't decide on the president until past January 20
* A state or local government establishes a 100% tax bracket or wealth tax
* The Fed chair is fired, or three Fed board members are
* An NCAA D1 college or pro sports league establishes a minimum quota for transgender athletes
* Existing solar/wind plants are decommissioned, or a state legally caps its percentage of solar energy (not just making it contingent on batteries or something practical)
* US votes for a UN resolution to condemn itself, e.g. for human rights abuses
* US withdraws from the UN, NATO, G7 or G20
* US allies boycott the 2028 Olympics
* Court packing; the Supreme Court has more than 9 justices
* Multiple Supreme Court justices are impeached