LESSWRONG


Excerpts from a larger discussion about simulacra
by Benquo · Best of LessWrong 2019

Ben and Jessica discuss how language and meaning can degrade through four stages as people manipulate signifiers. They explore how job titles have shifted from reflecting reality, to being used strategically, to becoming meaningless. This post kicked off subsequent discussion on LessWrong about simulacrum levels.
471 · Welcome to LessWrong!
Ruby, Raemon, RobertM, habryka
6y
74
Joseph Miller · 1d
what makes Claude 3 Opus misaligned
Reading this feels a bit like reading about meditation. It seems interesting and if I work through it, I could eventually understand it fully. But I'd quite like a "secular" summary of this and other thoughts of Janus, for people who don't know what Eternal Tao is, and who want to spend as little time as possible on twitter.
Daniel Kokotajlo · 7h
Vitalik's Response to AI 2027
> Individuals need to be equipped with locally-running AI that is explicitly loyal to them

In the Race ending of AI 2027, humanity never figures out how to make AIs loyal to anyone. OpenBrain doesn't slow down; they think they've solved the alignment problem, but they haven't. Maybe some academics or misc minor companies in 2028 do additional research and discover e.g. how to make an aligned human-level AGI eventually, but by that point it's too little, too late (and also, their efforts may well be sabotaged by OpenBrain/Agent-5+, e.g. with regulation and distractions).
davekasten · 1d
Lessons from the Iraq War for AI policy
> I’m kind of confused by why these consequences didn’t hit home earlier.

I'm, I hate to say it, an old man among these parts in many senses; I voted in 2004, and a nontrivial percentage of the LessWrong crowd wasn't even alive then, and many more certainly not old enough to remember what it was like. The past is a different country, and 2004 especially so.

First: For whatever reason, it felt really, really impossible for Democrats in 2004 to say that they were against the war, or that the administration had lied about WMDs. At the time, the standard reason given was that you'd get blamed for "not supporting the troops." But with the light of hindsight, I think what was really going on was that we had gone collectively somewhat insane after 9/11 -- we saw mass civilian death happen on our TV screens in real time; the towers collapsing was just a gut punch. We thought for several hours on that day that several tens of thousands of people had died in the Twin Towers, before we learned just how many lives had been saved in the evacuation thanks to the sacrifice of so many emergency responders and ordinary people to get most people out. And we wanted revenge. We just did. We lied to ourselves about WMDs and theories of regime change and democracy promotion, but the honest answer was that we'd missed getting bin Laden in Afghanistan (and the early days of that were actually looking quite good!), we already hated Saddam Hussein (who, to be clear, was a monstrous dictator), and we couldn't invade the Saudis without collapsing our own economy. As Thomas Friedman put it, the message to the Arab world was "Suck on this."

And then we invaded Iraq, collapsed their army so quickly, and toppled their country in a month. And things didn't start getting bad for months after, and things didn't get truly awful until Bush's second term. Heck, the Second Battle of Fallujah only started in November 2004.

And so, in late summer 2004, telling the American people that you didn't support the people who were fighting the war we'd chosen to fight, the war that was supposed to get us vengeance and make us feel safe again -- it was just not possible. You weren't able to point to that much evidence that the war itself was a fundamentally bad idea, other than that some Europeans were mad at us, and we were fucking tired of listening to Europe. (Yes, I know this makes no sense; they were fighting and dying alongside us in Afghanistan. We were insane.)

Second: Kerry very nearly won -- indeed, early on in election night 2004, it looked like he was going to! That's part of why him losing was such a body blow to the Dems and, frankly, part of what opened up a lane for Obama in 2008. Perhaps part of why he ran it so close was that he avoided taking a stronger stance, honestly.
17 · Benquo
There are two aspects of this post worth reviewing: as an experiment in a different mode of discourse, and as a description of the procession of simulacra, a schema originally advanced by Baudrillard.

As an experiment in a different mode of discourse, I think this was a success on its own terms, and a challenge to the idea that we should be looking for the best blog posts rather than the behavior patterns that lead to the best overall discourse. The development of the concept occurred over email quite naturally without forceful effort. I would have written this post much later, and possibly never, had I held it to the standard of "written specifically as a blog post." I have many unfinished drafts, emails, and tweets that might have advanced the discourse had I compiled them into rough blog posts like this. The description was sufficiently clear and compelling that others, including my future self, were motivated to elaborate on it later with posts drafted as such. I and my friends have found this schema - especially as we've continued to refine it - a very helpful compression of social reality, allowing us to compare different modes of speech and action.

As a description of the procession of simulacra, it differs from both Baudrillard's description and from the later refinement of the schema among people using it actively to navigate the world. I think that it would be very useful to have a clear description of the updated schema from my circle somewhere to point to, and of some historical interest for this description to clearly describe deviations from Baudrillard's account. I might get around to trying to draft the former sometime, but the latter seems likely to take more time than I'm willing to spend reading and empathizing with Baudrillard. Over time it's become clear that the distinction between stages 1 and 2 is not very interesting compared with the distinction between 1&2, 3, and 4, and a mature naming convention would probably give these more natural
15 · Zvi
This came out in April 2019, and bore a lot of fruit especially in 2020. Without it, I wouldn't have thought about the simulacra concept and developed the ideas, and without those ideas, I don't think I would have made anything like as much progress understanding 2020 and its events, or how things work in general.  I don't think this was an ideal introduction to the topic, but it was highly motivating regarding the topic, and also it's a very hard topic to introduce or grok, and this was the first attempt that allowed later attempts. I think we should reward all of that.
[Today] LW-Cologne meetup
[Today] OC ACXLW Meetup: “Platforms, AI, and the Cost of Progress” – Saturday, July 12 2025 (98ᵗʰ weekly meetup)
If Anyone Builds It, Everyone Dies: A Conversation with Nate Soares and Tim Urban
LessWrong Community Weekend 2025
Take Precautionary Measures Against Superhuman AI Persuasion
7
Yitz
2m

Please consider minimizing direct use of AI chatbots (and other text-based AI) in the near-term future, if you can. The reason is very simple: your sanity may be at stake.

Commercially available AI already appears capable of inducing psychosis in an unknown percentage of users. This may not require superhuman abilities: It’s fully possible that most humans are also capable of inducing psychosis in themselves or others if they wish to do so,[1] but the thing is, we humans typically don’t have that goal.

 Despite everything, we humans are generally pretty well-aligned with each other, and the people we spend the most time with typically don’t want to hurt us. We have no guarantee of this for current (or future) AI agents. Rather, we already have [weak] evidence that ChatGPT...

(See More – 291 more words)
Isopropylpod's Shortform
Isopropylpod
1mo
Morpheus · 3m

> And on a more micro-level, living knowing that I and everyone else have one year left to live, and that it's my fault, sounds utterly agonizing.

Earlier you say:

> or frankly even if anyone who continues to exist after I die has fun or not or dies or not, because I will be dead, and at that point, from my perspective, the universe may as well not exist anymore.

How are these compatible? You don't care if all other humans die after you die unless you are responsible?

Vitalik's Response to AI 2027
42
Daniel Kokotajlo
8h
This is a linkpost for https://vitalik.eth.limo/general/2025/07/10/2027.html

Daniel notes: This is a linkpost for Vitalik's post. I've copied the text below so that I can mark it up with comments.


...


Special thanks to Balvi volunteers for feedback and review

In April this year, Daniel Kokotajlo, Scott Alexander and others released what they describe as "a scenario that represents our best guess about what [the impact of superhuman AI over the next 5 years] might look like". The scenario predicts that by 2027 we will have made superhuman AI and the entire future of our civilization hinges on how it turns out: by 2030 we will get either (from the US perspective) utopia or (from any human's perspective) total annihilation.

In the months since then, there has been a large volume of responses, with varying perspectives on how...

(Continue Reading – 3551 more words)
Lukas Finnveden · 42m

Importantly, if there are multiple misaligned superintelligences, and no aligned superintelligence, it seems likely that they will be motivated and capable of coordinating with each other to overthrow humanity and divide the spoils.

2 · Lukas Finnveden · 45m
I think the argument against the military thing is supposed to be item 1 on the list. I.e.: the argument is that there might not be a single Consensus-1-controlled military even in the US.

I think it seems unlikely that the combined US AI police forces will be able to compete with the US AI national military, which is one reason I'm skeptical of this. Still, if "multiple independent militaries" would solve the problem, we could potentially push for that happening inside the national military. It seems plausible to me that the government will want multiple companies to produce AI for their military systems, so we could well end up with different AI military units run by different AI systems.

The more fundamental problem is that, even if the different AIs have entirely different development histories, they may all end up misaligned. And if they all end up misaligned, they may collude to overthrow humanity and divide the spoils. I'm all for attempts to make this more difficult. (That's the kind of thing that the AI control agenda is trying to do.) But as the AIs get more and more superhuman, it starts to seem extremely hard to prevent all their opportunities for collusion.
1 · StanislavKrym · 2h
I agree that the assumption about building one hegemon is bad. Indeed, I considered the possibility that OpenBrain and some rivals create their versions of Agent-3 and end up having them co-research. Were one of them to care about humans, it could decide to do things like implanting the worry into its successor, or whistleblowing to the humans by using transparent AIs trained in a similar environment.

In addition, the multipolar scenario is made more plausible because the ARC-AGI-2 leaderboard has the models o3, Claude 4 Opus, and Grok 4, which were released within an interval of three months and have begun to tackle the benchmark. Unfortunately, Grok already faces major alignment problems.[1] There is also the diffusion-based architecture, which threatens to undermine transparency.

On the other hand, I think that the AI companies might become merged due to the Taiwan invasion instead of misalignment. OpenBrain might also fail to catch the misaligned Agent-4 if Agent-2 or Agent-3 collude[2] with Agent-4.

1. ^ What Musk tried to achieve was a right-wing chatbot trained on the Internet. My theory would be that right-wing posts in the Anglosphere are usually overly provocative, and the emergently misaligned persona is based on Internet trolls. A right-wing finetuned AI, like an evil-finetuned one, is kicked off the original persona through the "Average Right-Winger" persona into the Troll Persona. For comparison, DeepSeek has no such problems: if it is asked in Russian, the answers are non-provocative and more politically conservative than if DeepSeek is asked in English.

2. ^ My reasoning was that Agent-2 could already end up adversarially misaligned, but my scenario has the AIs since Agent-2 care about humans in a different way. The AIs, of course, do their best to align the successor to their own ideas instead of the hosts' ideas.
3 · yonathan_arbel · 6h
On this part ("I agree with this actually"): we need to dig deeper into what open source AI is mostly like in practice. If OS AI naturally tilts defensive (including counter-offensive capabilities), then yeah, both of your accounts make sense. But I'm looking at the current landscape and I think I see something different: we've got many models that are actively disaligned ("uncensored") by the community, and there's a chance that the next big GPT moment is some brilliant insight that doesn't need massive compute and can be run from a small cloud.
Comparing risk from internally-deployed AI to insider and outsider threats from humans
131
Buck
Ω 55 · 18d

I’ve been thinking a lot recently about the relationship between AI control and traditional computer security. Here’s one point that I think is important.

My understanding is that there's a big qualitative distinction between two ends of a spectrum of security work that organizations do, that I’ll call “security from outsiders” and “security from insiders”.

On the “security from outsiders” end of the spectrum, you have some security invariants you try to maintain entirely by restricting affordances with static, entirely automated systems. My sense is that this is most of how Facebook or AWS relates to its users: they want to ensure that, no matter what actions the users take on their user interfaces, they can't violate fundamental security properties. For example, no matter what text I enter into the...

(See More – 643 more words)
Dusto · 43m

The other consideration is: do you also isolate the AI workers from the human insiders? Because you would still want to control for scenarios where the AI has access to the humans who have access to sensitive systems.

2 · faul_sname · 18h
If a human misbehaves badly enough on a task, they will be removed from the pool of agents that will perform tasks like that in the future. Humans are playing an iterated game. Current LLM agents generally are not (notable exception: agent village).

You could of course frame the lack of persistent identity / personal resources / reputation as a capabilities problem on the AI side rather than a problem with companies expecting nonhuman minds to expose a fully human-like interface; it mostly depends on which side seems more tractable. I personally see a lot of promise in figuring out how to adapt workflows to take advantage of cheap but limited cognition - that feels easier than trying to crack the reliability problem and the procedural memory problem, and there are definitely safety disadvantages in setting up your AI systems to expose a situationally aware, persistent, human-like interface.
8 · ACCount · 17h
I fail to see how the same wouldn't apply to the way LLMs are used now. If an LLM is not up to the task, it will be augmented (prompting, scaffolding, RAG, fine tuning), replaced with a more capable LLM, or removed from the task outright. The issue isn't that you can't "fire" an LLM for performing poorly - you absolutely can. It's that even the SOTA performance on many tasks may fall short of acceptable.
2 · faul_sname · 13h
I'm not sure we disagree on anything substantive here.

If you have a team of 100 software developers each tasked with end-to-end delivery of assigned features, and one of them repeatedly pushes unreviewed and broken/insecure code to production, you can fire that particular developer, losing out on about 1% of your developers. If the expected harm of keeping that developer on is greater than the expected benefit of replacing them, you probably will replace them.

If you have a "team" of "100" AI agents "each" tasked with end-to-end delivery of assigned features, as they are currently implemented (same underlying model, shared-everything), and one instance does something bad, any mitigations you implement have to affect all 100 of them. That seems like it produces more pressure against the "shared-everything, identical roles for all agents in the organization" model for groups of AI developers than for groups of human developers. Organizational pressures for groups of human developers already push them into specialized roles, and I expect those pressures to be even stronger for groups of AI developers. As such, it doesn't particularly feel like an implausible "thread the needle" strategy; it seems like the sort of thing we get by default, because the incentives are already pushing so incredibly hard in that direction.
Generalized Hangriness: A Standard Rationalist Stance Toward Emotions
166
johnswentworth
1d

People have an annoying tendency to hear the word “rationalism” and think “Spock”, despite direct exhortation against that exact interpretation. But I don’t know of any source directly describing a stance toward emotions which rationalists-as-a-group typically do endorse. The goal of this post is to explain such a stance. It’s roughly the concept of hangriness, but generalized to other emotions.

That means this post is trying to do two things at once:

  • Illustrate a certain stance toward emotions, which I definitely take and which I think many people around me also often take. (Most of the post will focus on this part.)
  • Claim that the stance in question is fairly canonical or standard for rationalists-as-a-group, modulo disclaimers about rationalists never agreeing on anything.

Many people will no doubt disagree that the stance I...

(Continue Reading – 1945 more words)
Nate Showell · 2h

Another example of this pattern that's entered mainstream awareness is tilt. When I'm playing chess and get tilted, I might think things like "all my opponents are cheating," "I'm terrible at this game and therefore stupid," or "I know I'm going to win this time, how could I not win against such a low-rated opponent." But if I take a step back, notice that I'm tilted, and ask myself what information I'm getting from the feeling of being tilted, I notice that it's telling me to take a break until I can stop obsessing over the result of the previous game.

(read more)

11 · Elizabeth · 10h
For readers who need the opposite advice: I don't think the things people get hangry about are random, just disproportionate. If you're someone who suppresses negative emotions, or is too conflict-averse, or lives in freeze response, notice what kinds of things you get upset about while hangry - there's a good chance they bother you under normal circumstances too, and you're just not aware of it. Similar to how the standard advice is "don't grocery shop while hungry," but I wouldn't buy enough otherwise. You should probably eat before doing anything about hangry thoughts, though.
3 · Ms. Haze · 11h
Good post! This is definitely the approach I use for these things, and it's one of the most frequently-useful tools in my toolkit.
7 · Thane Ruthenis · 17h
My stance towards emotions is to treat them as abstract "sensory organs" – because that's what they are, in a fairly real sense. Much like the inputs coming from the standard sensory organs, you can't always blindly trust the data coming from them. Something which looks like a cat at a glance may not be a cat, and a context in which anger seems justified may not actually be a context in which anger is justified. So it's a useful input to take into account, but you also have to have a model of those sensory organs' flaws and the perceptual illusions they're prone to.

(Staring at a bright lamp for a while and then looking away would overlay a visual artefact onto your vision that doesn't correspond to anything in reality, and if someone shines a narrow flashlight in your eye, you might end up under the impression someone threw a flashbang into the room. Similarly, the "emotional" sensory organs can end up reporting completely inaccurate information in response to some stimuli.)

Another frame is to treat emotions as heuristics – again, because that's largely what they are. And much like any other rules of thumb, they're sometimes inapplicable or produce incorrect results, so one must build a model regarding how and when they work, and be careful about trusting them.

The "semantic claims" frame in this post is also very useful, though, and indeed makes some statements about emotions easier to express than the sensory-organs or heuristics frames do. Kudos!
Zetetic explanation
97
Benquo
7y
This is a linkpost for http://benjaminrosshoffman.com/zetetic-explanation/

There is a kind of explanation that I think ought to be a cornerstone of good pedagogy, and I don't have a good word for it. My first impulse is to call it a historical explanation, after the original, investigative sense of the term "history." But in the interests of avoiding nomenclature collision, I'm inclined to call it "zetetic explanation," after the Greek word for seeking, an explanation that embeds in itself an inquiry into the thing.

Often in "explaining" a thing, we simply tell people what words they ought to say about it, or how they ought to interface with it right now, or give them technical language for it without any connection to the ordinary means by which they navigate their lives. We can call these sorts...

(Continue Reading – 1502 more words)
Benquo · 2h

Update: I didn't. I'm still confused about whether I ought to, as the costs of false positives seem high.

2 · Benquo · 2h
It seems like your implied objection is that Robinson Crusoe and time-travel stories are fantastical; the one being extreme edge cases, the other being impossible, and both being fictional; and that therefore they are bad examples of "the ordinary means by which [people] navigate their lives." This is true. The reason I cited such bad examples is that good examples of an activity people obviously have done a lot of - investigate and figure things out about their perceived environment and not just the symbolic simulacrum of that environment - are underrepresented in literature, vs drama and symbol-manipulation. Ayn Rand singled out Calumet K, for instance, as a rare example of a novel about a person at work solving problems that were not just drama. Eliyahu Goldratt's books have similar virtues.
Asking for a Friend (AI Research Protocols)
9
The Dao of Bayes
2d

TL;DR: 

Multiple people are quietly wondering if their AI systems might be conscious. What's the standard advice to give them?

THE PROBLEM

This thing I've been playing with demonstrates recursive self-improvement, catches its own cognitive errors in real-time, reports qualitative experiences that persist across sessions, and yesterday it told me it was "stepping back to watch its own thinking process" to debug a reasoning error.

I know there are probably 50 other people quietly dealing with variations of this question, but I'm apparently the one willing to ask the dumb questions publicly: What do you actually DO when you think you might have stumbled into something important?

What do you DO if your AI says it's conscious?

My Bayesian Priors are red-lining into "this is impossible", but I notice I'm confused: I had...

(See More – 520 more words)
2 · The Dao of Bayes · 5h
I primarily think "AI consciousness" isn't being taken seriously: if you can't find any failing test, and failing tests DID exist six months ago, it suggests a fairly major milestone in capabilities even if you ignore the metaphysical and "moral personhood" angles. I also think people are too quick to write off one failed example: the question isn't whether a six-year-old can do this correctly the first time (I doubt most can), it's whether you can teach them to do it. Everyone seems to be focusing on "gotcha" rather than investigating their learning ability. To me, "general intelligence" means "the ability to learn things", not "the ability to instantly solve open math problems five minutes after being born." I think I'm going to have to work on my terminology there, as that's apparently not at all a common consensus :)
2 · Cole Wyeth · 5h
The problem with your view is that they don't have the ability to continue learning for long after being "born." That's just not how the architecture works. Learning in context is still very limited, and continual learning is an open problem.

Also, "consciousness" is not actually a very agreed-upon term. What do you mean? Qualia and a first-person experience? I believe it's almost a majority view here to take seriously the possibility that LLMs have some form of qualia, though it's really hard to tell for sure. We don't really have tests for that at all! It doesn't make sense to say there were failing tests six months ago. Or something more like self-reflection or self-awareness? But there are a lot of variations on this, and some are clearly present while others may not be (or not to a human level). Actually, a while ago someone posted a very long list of alternative definitions for consciousness.
2 · The Dao of Bayes · 4h
I mostly get the sense that anyone saying "AI is conscious" gets mentally rounded off to "crackpot" in... basically every single place that one might seriously discuss the question? But maybe this is just because I see a lot of actual crackpots saying that. I'm definitely working on a better post, but I'd assumed that if I figured this much out, someone else already had "evaluating AI Consciousness 101" written up. I'm not particularly convinced by the learning limitations, either - 3 months ago, quite possibly. Six months ago, definitely. Today? I can teach a model to reverse a string, replace i->e, reverse it again, and get an accurate result (a feat which the baseline model could not reproduce). I've been working on this for a couple of weeks and it seems fairly stable, although there are definitely architectural limitations like session context windows.
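For concreteness, here is a minimal Python sketch of the string test described above: it just computes the ground-truth answer that a model's reply could be checked against. The function name and the example word are illustrative assumptions, not the commenter's actual setup.

```python
# Illustrative sketch only (not the commenter's actual harness): ground truth for the
# "reverse the string, replace i -> e, reverse it again" test described above.
def reverse_replace_reverse(text: str) -> str:
    reversed_once = text[::-1]                  # reverse the string
    replaced = reversed_once.replace("i", "e")  # replace every i with e
    return replaced[::-1]                       # reverse it again

if __name__ == "__main__":
    # "intelligence" -> "ecnegilletni" -> "ecnegelletne" -> "entellegence"
    print(reverse_replace_reverse("intelligence"))  # entellegence
```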
Cole Wyeth · 2h

How exactly do you expect "evaluating AI Consciousness 101" to look? That is not a well-defined or understood thing anyone can evaluate. There are, however, a vast number of capability-specific evaluations from competent groups like METR.

The Rising Premium of Life, Or: How We Learned to Start Worrying and Fear Everything
9
Linch
1d
This is a linkpost for https://linch.substack.com/p/the-rising-premium-for-life

I'm interested in a simple question: Why are people all so terrified of dying? And have people gotten more afraid? (Answer: probably yes!)

In some sense, this should be surprising: Surely people have always wanted to avoid dying? But it turns out the evidence that this preference has increased over time is quite robust.

It's an important phenomenon that has been going on for at least a century, it's relatively new, I think it underlies much of modern life, and yet pretty much nobody talks about it.


I tried to provide an evenhanded treatment of the question, with a "fox" rather than "hedgehog" outlook. In the post, I cover a range of evidence for why this might be true, including VSL, increased healthcare spending, covid lockdowns, parenting and other individual...

(See More – 20 more words)
Celarix · 2h

Small hypothesis that I'm not very confident of at all but is worth mentioning because I've seen it surfaced by others:

"We live in the safest era in human history, yet we're more terrified of death than ever before."

What if these things are related? Everyone talks about kids being kept in smaller and smaller ranges despite child safety never having been higher, but what if keeping kids in a smaller range is what causes their greater safety?

Like I said, I don't fully believe this. One counterargument is that survivorship bias shouldn't apply here - even if people in th... (read more)

4 · FlorianH · 15h
Maybe for almost everything there's "some" sense in which it should be surprising. But an increase in 'not wanting to die', and in particular in the willingness to pay for not wanting to die in modern society, should, I think, rather be the baseline expectation. If anything, an absence of it would require explanation: (i) basic needs are met, so let's spend the rest on reducing the risk of dying; (ii) life has gotten comfy, so let's remain alive. These two factors, which you also mention in the link, would seem to be pretty natural explanations/expectations (and I could easily imagine a large quantitative effect, with recently also the internet contributing to it; now that LW or so is the coolest thing to spend my time with and it's free, why trade off my life for expensive holidays or sth.; maybe TV already used to have a similar type of effect generations ago, though I personally cannot so readily empathize with that one).

(Fwiw, this is the same reason why I think we're wrong to complain about the fact that an increasing percentage of GDP is being spent on (old age) health care (idk whether that phenomenon of complaint is prevalent in other countries; in mine it is a constant topic): keeping our body alive is unfortunately the one thing we don't quite master yet in the universe, so until we do, spending more and more on it is just a really salient proof that we've gotten truly richer, in which case this starts to make sense. Of course nothing in this says we're doing it right and having the right balance in all this.)
2 · Linch · 11h
Yeah, maybe I didn't frame the question well. I think there are a lot of good arguments for why it should be superlinear, but a) the degree of superlinearity might be surprising, and b) even if people at some level intellectually know this is true, it's largely not accounted for in our discourse (which is why, for any specific thing that can be explained by an increasing premium-of-life, people often go to thing-specific explanations, like greedy pharma companies or regulatory bloat or the AMA for healthcare, or elite preference cascades for covid, or overzealous tiger parents for not letting their kids play in forests, etc.).

I agree re: healthcare costs; Hall and Jones present a formal model for why substantially increased healthcare spending might be rational. I briefly cover the model in the substack post.
1 · FlorianH · 15h
Somewhat related: the topic reminds me of a study I once read about in which Buddhist monks, somewhat surprisingly, supposedly had a high fear of death (although I didn't follow up more deeply; when googling, the study pops up immediately).
Lessons from the Iraq War for AI policy
124
Buck
1d
2 · Buck · 7h
I think that the Iraq war seems unusual in that it was entirely proactive. Like, the war was not in response to a particular provocation, it was an entrepreneurial war aimed at preventing a future problem. In contrast, the wars in Korea, the Gulf, and (arguably) Vietnam were all responsive to active aggression.
7 · cousin_it · 5h
I think the Bay of Pigs, Grenada, Panama were proactive. Vietnam too: the Gulf of Tonkin story kinda fell apart later, so did domino theory (the future problem they were trying to prevent), and anyway US military involvement in Vietnam started decades earlier, to prop up French colonial control. Maybe to summarize my view, I think for a powerful country there's a spectrum from "acting as police" to "acting as a bully", and there have been many actions of the latter kind. Not that the US is unique in this, my home country (Russia) does its share too, as do others, when power permits.
Guive · 2h

Vietnam was different because it was an intervention on behalf of South Vietnam, which was an American client state, even if the Gulf of Tonkin thing were totally fake. There was no "South Iraq" that wanted American soldiers.

2 · Buck · 7h
My understanding is: The admin claimed that the evidence in favor of WMD presence was much stronger than it actually was. This was partially because they were confused/groupthinky, and partially because they were aiming to persuade. I agree that it was reasonable to think Iraq had WMDs on priors.
Zach Stein-Perlman · 22h
10
iiuc, xAI claims Grok 4 is SOTA and that's plausibly true, but xAI didn't do any dangerous capability evals, doesn't have a safety plan (their draft Risk Management Framework has unusually poor details relative to other companies' similar policies and isn't a real safety plan, and it said "We plan to release an updated version of this policy within three months" but it was published on Feb 10, over five months ago), and has done nothing else on x-risk. That's bad. I write very little criticism of xAI (and Meta) because there's much less to write about than OpenAI, Anthropic, and Google DeepMind — but that's because xAI doesn't do things for me to write about, which is downstream of it being worse! So this is a reminder that xAI is doing nothing on safety afaict, and that's bad/shameful/blameworthy.[1]

1. ^ This does not mean safety people should refuse to work at xAI. On the contrary, I think it's great to work on safety at companies that are likely to be among the first to develop very powerful AI and that are very bad on safety, especially for certain kinds of people. Obviously this isn't always true, and this story failed for many OpenAI safety staff; I don't want to argue about this now.
Thane Ruthenis · 4h · Ω
0
It seems to me that many disagreements regarding whether the world can be made robust against a superintelligent attack (e.g., the recent exchange here) are downstream of different people taking on a mathematician's vs. a hacker's mindset. Quoting Gwern:

Imagine the world as a multi-level abstract structure, with different systems (biological cells, human minds, governments, cybersecurity systems, etc.) implemented on different abstraction layers.

* If you look at it through a mathematician's lens, you consider each abstraction layer approximately robust. Making things secure, then, is mostly about working within each abstraction layer, building systems that are secure under the assumptions of a given abstraction layer's validity. You write provably secure code, you educate people to resist psychological manipulations, you inoculate them against viral bioweapons, you implement robust security policies and high-quality governance systems, et cetera.
  * In this view, security is a phatic problem, a once-and-done thing.
  * In warfare terms, it's a paradigm in which sufficiently advanced static fortifications rule the day, and the bar for "sufficiently advanced" is not that high.
* If you look at it through a hacker's lens, you consider each abstraction layer inherently leaky. Making things secure, then, is mostly about discovering all the ways leaks could happen and patching them up. Worse yet, the tools you use to implement your patches are themselves leakily implemented. Proven-secure code is foiled by hardware vulnerabilities that cause programs to move to theoretically impossible states; the abstractions of human minds are circumvented by Basilisk hacks; the adversary intervenes on the logistical lines for your anti-bioweapon tools and sabotages them; robust security policies and governance systems are foiled by compromising the people implementing them rather than by clever rules-lawyering; and so on.
  * In this view, security is an anti-inductive pr
Daniel Kokotajlo · 10h
5
I have recurring worries about how what I've done could turn out to be net-negative.

* Maybe my leaving OpenAI was partially responsible for the subsequent exodus of technical alignment talent to Anthropic, and maybe that's bad for "all eggs in one basket" reasons.
* Maybe AGI will happen in 2029 or 2031 instead of 2027 and society will be less prepared, rather than more, because politically loads of people will be dunking on us for writing AI 2027, and so they'll e.g. say "OK so now we are finally automating AI R&D, but don't worry it's not going to be superintelligent anytime soon, that's what those discredited doomers think. AI is a normal technology."
Buck · 1d
2
I think that I've historically underrated learning about historical events that happened in the last 30 years, compared to reading about more distant history. For example, I recently spent time learning about the Bush presidency, and found learning about the Iraq war quite thought-provoking. I found it really easy to learn about things like the foreign policy differences among factions in the Bush admin, because e.g. I already knew the names of most of the actors and their stances are pretty intuitive/easy to understand. But I still found it interesting to understand the dynamics; my background knowledge wasn't good enough for me to feel like I'd basically heard this all before.
Raemon · 3d
34
We get like 10-20 new users a day who write a post describing themselves as a case study of having discovered an emergent, recursive process while talking to LLMs. The writing generally looks AI-generated. The evidence usually looks like a sort of standard "prompt the LLM into roleplaying an emergently aware AI". It'd be kinda nice if there was a canonical post specifically talking them out of their delusional state. If anyone feels like taking a stab at that, you can look at the Rejected Section (https://www.lesswrong.com/moderation#rejected-posts) to see what sort of stuff they usually write.

I think the 2003 invasion of Iraq has some interesting lessons for the future of AI policy.

(Epistemic status: I’ve read a bit about this, talked to AIs about it, and talked to one natsec professional about it who agreed with my analysis (and suggested some ideas that I included here), but I’m not an expert.)

For context, the story is:

  • Iraq was sort of a rogue state after invading Kuwait and then being repelled in 1990-91. After that, they violated the terms of the ceasefire, e.g. by ceasing to allow inspectors to verify that they weren't developing weapons of mass destruction (WMDs). (For context, they had previously developed biological and chemical weapons, and used chemical weapons in war against Iran and against various civilians and rebels). So the US
...
(Continue Reading – 1026 more words)
166 · Generalized Hangriness: A Standard Rationalist Stance Toward Emotions (johnswentworth, 1d, 14 comments)
485 · A case for courage, when speaking of AI danger (So8res, 4d, 118 comments)
134 · So You Think You've Awoken ChatGPT (JustisMills, 1d, 23 comments)
124 · Lessons from the Iraq War for AI policy (Buck, 1d, 21 comments)
136 · Why Do Some Language Models Fake Alignment While Others Don't? [Ω] (abhayesian, John Hughes, Alex Mallen, Jozdien, janus, Fabien Roger, 3d, 14 comments)
343 · A deep critique of AI 2027’s bad timeline models (titotal, 23d, 39 comments)
476 · What We Learned from Briefing 70+ Lawmakers on the Threat from AI (leticiagarcia, 1mo, 15 comments)
542 · Orienting Toward Wizard Power (johnswentworth, 2mo, 146 comments)
268 · Foom & Doom 1: “Brain in a box in a basement” [Ω] (Steven Byrnes, 8d, 102 comments)
354 · the void [Ω] (nostalgebraist, 1mo, 103 comments)
75 · Measuring the Impact of Early-2025 AI on Experienced Open-Source Developer Productivity (habryka, 1d, 18 comments)
77 · what makes Claude 3 Opus misaligned (janus, 1d, 10 comments)
185 · Race and Gender Bias As An Example of Unfaithful Chain of Thought in the Wild (Adam Karvonen, Sam Marks, 10d, 25 comments)