The Failed Strategy of Artificial Intelligence Doomers

Ben Pace

135 The Failed Strategy of Artificial Intelligence Doomers

31st Jan 2025

6 min read

135

I like reading outsider accounts of things I'm involved in / things I care about. This essay is a serious attempt to look at and critique the big picture of AI x-risk reduction efforts over the last ~decade. While I strongly disagree with many parts of it, I cannot easily recall another outsider essay that's better, so I encourage folks to engage with this critique and also look for any clear improvements to future AI x-risk reduction strategies that this essay suggests.

Here's the opening ~20% of the article, the rest is at the link.

In recent decades, a growing coalition has emerged to oppose the development of artificial intelligence technology, for fear that the imminent development of smarter-than-human machines could doom humanity to extinction. The now-influential form of these ideas began as debates among academics and internet denizens, which eventually took form—especially within the Rationalist and Effective Altruist movements—and grew in intellectual influence over time, along the way collecting legible endorsements from authoritative scientists like Stephen Hawking and Geoffrey Hinton.
Ironically, by spreading the belief that superintelligent AI is achievable and supremely powerful, these “AI Doomers,” as they came to be called, inspired the creation of OpenAI and other leading artificial intelligence labs whose technology they argue will destroy us all. Despite this, they have continued nearly the same advocacy strategy, and are now in the process of persuading Western governments that superintelligent AI is achievable and supremely powerful. To this end, they have created organized and well-funded movements to lobby for regulation, and their members are staffing key positions in the U.S. and British governments.
Their basic argument is that more intelligent beings can outcompete less intelligent beings, just as humans outcompeted mastodons or saber-toothed tigers or neanderthals. Computers are already ahead of humans in some narrow areas, and we are on track to create a superintelligent artificial general intelligence (AGI) which can think as broadly and creatively in any domain as the smartest humans. “Artificial general intelligence” is not a technical term, and is used differently by different groups to mean everything from “an effectively omniscient computer which can act independently, invent unthinkably powerful new technologies, and outwit the combined brainpower of humanity” to “software which can substitute for most white-collar workers” to “chatbots which usually don’t hallucinate.”
AI Doomers are concerned with the former scenario, where computer systems outreason, outcompete, and doom humanity to extinction. The AI Doomers are only one of several factions that oppose AI and seek to cripple it via weaponized regulation. There are also factions concerned about “misinformation” and “algorithmic bias,” which in practice means they think chatbots must be censored to prevent them from saying anything politically inconvenient. Hollywood unions oppose generative AI for the same reason that the longshoremen’s union opposes automating American ports and insists on requiring as much inefficient human labor as possible. Many moralists seek to limit “AI slop” for the same reasons that moralists opposed previous new media like video games, television, comic books, and novels—and I can at least empathize with this last group’s motives, as I wasted much of my teenage years reading indistinguishable novels in exactly the way that 19th century moralists warned against. In any case, the AI Doomers vary in their attitudes towards these factions. Some AI Doomers denounce them as Luddites, some favor alliances of convenience, and many stand in between.
Most members of the “AI Doomer” coalition initially called themselves by the name of “AI safety” advocates. However, this name was soon co-opted by these other factions with concerns smaller than human extinction. The AI Doomer coalition has far more intellectual authority than AI’s other opponents, with the most sophisticated arguments and endorsements from socially-recognized scientific and intellectual elites, so these other coalitions continually try to appropriate and wield the intellectual authority gathered by the AI Doomer coalition. Rather than risk being misunderstood, or fighting a public battle over the name, the AI Doomer coalition abandoned the name “AI safety” and rebranded itself to “AI alignment.” Once again, this name was co-opted by outsiders and abandoned by its original membership. Eliezer Yudkowsky coined the term “AI Notkilleveryoneism” in an attempt to establish a name that could not be co-opted, but unsurprisingly it failed to catch on among those it was intended to describe.
Today, the coalition’s members do not agree on any name for themselves. “AI Doomers,” the only widely understood name for them, was coined by their rhetorical opponents and is considered somewhat offensive by many of those it refers to, although some have adopted it themselves for lack of a better alternative. While I regret being rude, this essay will refer to them as “AI Doomers” in the absence of any other clear, short name.
Whatever name they go by, the AI Doomers believe the day computers take over is not far off, perhaps as soon as three to five years from now, and probably not longer than a few decades. When it happens, the superintelligence will achieve whatever goals have been programmed into it. If those goals are aligned exactly to human values, then it can build a flourishing world beyond our most optimistic hopes. But such goal alignment does not happen by default, and will be extremely difficult to achieve, if its creators even bother to try. If the computer’s goals are unaligned, as is far more likely, then it will eliminate humanity in the course of remaking the world as its programming demands. This is a rough sketch, and the argument is described more fully in works like Eliezer Yudkowsky’s essays and Nick Bostrom’s Superintelligence.
This argument relies on several premises: that superintelligent artificial general intelligence is philosophically possible, and practical to build; that a superintelligence would be more or less all-powerful from a mere human perspective; that superintelligence would be “unfriendly” to humanity by default; that superintelligence can be “aligned” to human values by a very difficult engineering program; that superintelligence can be built by current research and development methods; and that recent chatbot-style AI technologies are a major step forward on the path to superintelligence. Whether those premises are true has been debated extensively, and I don’t have anything useful to add to that discussion which I haven’t said before. My own opinion is that these various premises range from “pretty likely but not proven” to “very unlikely but not disproven.”
Even assuming all of this, the political strategy of the AI Doomer coalition is hopelessly confused and cannot possibly work. They seek to establish onerous regulations on for-profit AI companies in order to slow down AI research—or forcibly halt research entirely, euphemized as “Pause AI,” although most of the coalition sees the latter policy as desirable but impractical to achieve. They imagine that slowing or halting development will necessarily lead to “prioritizing a lot of care over moving at maximal speed” and wiser decisions about technology being made. This is false, and frankly very silly, and it’s always left extremely vague because the proponents of this view cannot articulate any mechanism or reason why going slower would result in more “care” and better decisions, with the sole exception of Yudkowsky’s plan to wait indefinitely for unrelated breakthroughs in human intelligence enhancement.
But more immediately than that, if AI Doomer lobbyists and activists like the Center for AI Safety, the Institute for AI Policy and Strategy, Americans for Responsible Innovation, Palisade Research, the Safe AI Forum, Pause AI, and many similar organizations succeed in convincing the U.S. government that AI is the key to the future of all humanity and is too dangerous to be left to private companies, the U.S. government will not simply regulate AI to a halt. Instead, the U.S. government will do what it has done every time it’s been convinced of the importance of a powerful new technology in the past hundred years: it will drive research and development for military purposes. This is the same mistake the AI Doomers made a decade ago, when they convinced software entrepreneurs that AI is the key to the future and so inspired them to make the greatest breakthroughs in AI of my lifetime. The AI Doomers make these mistakes because their worldview includes many assumptions, sometimes articulated and sometimes tacit, which don’t hold up to scrutiny.

Continue reading here.

Criticisms of The Rationalist MovementExistential riskPostmortems & RetrospectivesSocial & Cultural DynamicsAI

Frontpage

135

Mentioned in

52≤10-year Timelines Remain Unlikely Despite DeepSeek and o3

11Where Would Good Forecasts Most Help AI Governance Efforts?

The Failed Strategy of Artificial Intelligence Doomers

7the gears to ascension

New Comment

78 comments, sorted by

top scoring

Click to highlight new comments since: Today at 9:50 AM

Some comments are truncated due to high volume. (⌘F to expand all)Change truncation settings

[-]jimrandomh5mo7949

The article seems to assume that the primary motivation for wanting to slow down AI is to buy time for institutional progress. Which seems incorrect as an interpretation of the motivation. Most people that I hear talk about buying time are talking about buying time for technical progress in alignment. Technical progress, unlike institution-building, tends to be cumulative at all timescales, which makes it much more strategically relevant.

[-]Ben Pace5mo263

For what it's worth, I have grown pessimistic about our ability to solve the open technical problems even given 100 years of work on them. I think it possible but not probable in most plausible scenarios.

Correspondingly the importance I assign to increasing the intelligence of humans has drastically increased.

[-]tailcalled5mo159

Correspondingly the importance I assign to increasing the intelligence of humans has drastically increased.

I feel like human intelligence enhancement would increase capabilities development faster than alignment development, maybe unless you've got a lot of discrimination in favor of only increasing the intelligence of those involved with alignment.

3Leopard5mo

Maybe if they all have IQ 200+, they automatically realize that and rather work on alignment than on capabilities? Or come up with a pivotal act. With Eliezer going [public](https://x.com/tsarnick/status/1882927003508359242) with the IQ enhancement motion he at least must think so? (because if done publicly it'll initiate intelligence enhancement race between US, China and other countries; and that'd normally lead to AI capabilities speed-run unless the amplified people are automatically wiser than that)

1M. Y. Zuo5mo

Well as the first few pararagphs of the text suggests, the median ‘AI Safety’ advocate over time has been barely sentient, relative to other motivated groups, when it comes to preventing certain labels from being co-opted by those groups…. so it seems unlikely they will become so many standard deviations above average in some other aspect at any point in the future. Because the baseline will also change in the future.

[-]aysja5mo120

I have grown pessimistic about our ability to solve the open technical problems even given 100 years of work on them.

Why?

[-]Ben Pace5mo161

I'm not particularly resolute on this question. But I get this sense when I look at (a) the best agent foundations work that's happened over ~10 years of work on the matter, and (b) the work output of scaling up the number of people working on 'alignment' by ~100x.

For the first, trying to get a better understand of the basic concepts like logical induction and corrigibility and low-impact and ontological updates, while I feel like there's been progress (in timeless decision theory taking a clear step forward in figuring out how think about decision-makers as algorithms; in logical induction as moving forward on how to think about logical uncertainty; notably in the Embedded Agency sequence outlining many basic confusions; and in various writings like Radical Probabilism and Geometric Rationality in finding the breaking edges of expected utility maximization) I don't feel like the work done over the last 10 years is on track to be a clear ~10% of the work needed.

I'm not confident it makes sense to try to count it linearly. But I don't know that there's enough edges here or new results to feel good about, given 10x as much time to think about it, a new paradigm / set of concept... (read more)

[-]Lucius Bushnaq5mo*4017

I don't share the feeling that not enough of relevance has happened over the last ten years for us to seem on track for solving it in a hundred years, if the world's technology^[1] were magically frozen in time.

Some more insights from the past ten years that look to me like they're plausibly nascent steps in building up a science of intelligence and maybe later, alignment:

We understood some of the basics of general pattern matching: How it is possible for embedded minds that can't be running actual Solomonoff induction to still have some ability to extrapolate from old data to new data. This used to be a big open problem in embedded agency, at least to me, and I think it is largely solved now. Admittedly a lot of the core work here actually happened more than ten years ago, but people in ml or our community didn't know about it. [1,2]
Natural latents. [1,2,3]
Some basic observations and theories about the internal structure of the algorithms neural networks learn, and how they learn them. Yes, our networks may be a very small corner of mind space, but one example is way better than no examples! There's a lot on this one, so the following is just a very small and biased selection.

... (read more)

9Davidmanheim5mo

I hate to be insulting to a group of people I like and respect, but "the best agent foundations work that's happened over ~10 years of work" was done by a very small group of people who, despite being very smart, certainly smarter than myself, aren't academic superstars or geniuses (Edit to add: on a level that is arguably sufficient, as I laid out in my response below.) And you agree about this. The fact that they managed to make significant progress is fantastic, but substantial progress on deep technical problems is typically due to (ETA: only-few-in-a-generation level) geniuses, large groups of researchers tackling the problem, or usually both. And yes, most work on the topic won't actually address the key problem, just like most work in academia does little or nothing to advance the field. But progress happens anyways, because intentionally or accidentally, progress on problems is often cumulative, and as long as a few people understand the problem that matters, someone usually actually notices when a serious advance occurs. I am not saying that more people working on the progress and more attention would definitely crack the problems in the field this decade, but I certainly am saying that humanity as a whole hasn't managed even what I'd consider a half-assed semi-serious attempt.

8TsviBT5mo

IDK if this is relevant to much, but anyway, given the public record, saying that Scott Garrabrant isn't a genius is just incorrect. Sam Eisenstat is also a genius. Also Jessica Taylor I think. (Pace other members of AF such as myself.)

7Davidmanheim5mo

Apologies - when I said genius, I had a very high bar in mind, no more than a half dozen people alive today, who each have single-handedly created or materially advanced an entire field. And I certainly hold Scott in very high esteem, and while I don't know Sam or Jessica personally, I expect they are within throwing distance - but I don't think any of them meet this insanely high bar. And Scott's views on this, at least from ca. 2015, was a large part of what informed my thinking about this; I can't tell the difference between him and Terry Tao when speaking with them, but he can, and he said there is clearly a qualitative difference there. Similarly for other people clearly above my league, including a friend who worked with Thurston at Cornell back in 2003-5. (It's very plausible that Scott Aaronson is in this bucket as well, albeit in a different areas, though I can't tell personally, and have not heard people say this directly - but he's not actually working on the key problems, and per him, he hasn't really tried to work on agent foundations. Unfortunately.) So to be clear, I think Scott is a genius, but not one of the level that is needed to single-handedly advance the field to the point where the problem might be solved this decade, if it is solvable. Yes, he's brilliant, and yes, he has unarguably done a large amount of the most valuable work in the area in the past decade, albeit mostly more foundational that what is needed to solve the problem. So if we had another dozen people of his caliber at each of a dozen universities working on this, that would be at least similar in magnitude to what we have seen in fields that have made significant progress in a decade - though even then, not all fields like hat see progress. But the Tao / Thurston level of genius, usually in addition to the above-mentioned 100+ top people working on the problem, is what has given us rapid progress in the past in fields where such progress was possible. This may not be one of t

6TsviBT5mo

Oh ok lol. Ok on a quick read I didn't see too much in this comment to disagree with. (One possible point of disagreement is that I think you plausibly couldn't gather any set of people alive today and solve the technical problem; plausibly you need many, like many hundreds, of people you call geniuses. Obviously "hundreds" is made up, but I mean to say that the problem, "come to understand minds--the most subtle/complex thing ever--at a pretty deep+comprehensive level", is IMO extremely difficult, like it's harder than anything humanity has done so far by a lot, not just an ordinary big science project. Possibly contra Soares, IDK.) (Another disagreement would be I don't actually think logical induction is that valuable for the AGI alignment problem, to the point where random philosophy is on par in terms of value to alignment, though I expect most people to disagree with this. It's just a genius technical insight in general.)

2Davidmanheim5mo

I admitted that it's possible the problem is practically unsolvable, or worse; you could have put the entire world on Russell and Whitehead's goal of systematizing math, and you might have gotten to Gödel faster, but you'd probably just waste more time. And on Scott's contributions, I think they are solving or contributing towards solving parts of the problems that were posited initially as critical to alignment, and I haven't seen anyone do more. (With the possible exception of Paul Christiano, who hasn't been focusing on research for solving alignment as much recently.) I agree that the work doesn't don't do much other than establish better foundations, but that's kind-of the point. (And it's not just Logical induction - there's his collaboration on Embedded Agency, and his work on finite factored sets.) But the fact that the work done to establish the base for the work is more philosophical and doesn't align AGI seems like it is moving the goalposts, even if I agree it's true.

6Eli Tyre5mo

Can you operationalize the standard you're using for "genius" here? Do you mean "IQ > 150"?

7TsviBT5mo

Of course not. I mean, any reasonable standard? Garrabrant induction, bro. "Produces deep novel (ETA: important difficult) insight"

2Eli Tyre5mo

Have the others you listed produced insights on that level? What did you observe that leads you to call them geniuses, "by any reasonable standard"?

4TsviBT5mo

Jessica I'm less sure about. Sam, from large quantities of insights in many conversations. If you want something more legible, I'm what, >300 ELO points better than you at math; Sam's >150 ELO points better than me at math if I'm trained up, now probably more like >250 or something. Not by David's standard though, lol.

2Mateusz Bagiński5mo

Sam: https://www.lesswrong.com/posts/CvKnhXTu9BPcdKE4W/an-untrollable-mathematician-illustrated

[-]Seth Herd5mo107

I feel a bit sad that the alignment community is so focused on intelligence enhancement. The chance of getting enough time for that seems so low that it's accepting a low chance of survival.

What has convinced you that the technical problems are unsolvable? I've been trying to track the arguments on both sides rather closely, and the discussion just seems unfinished. My shortform on cruxes of disagreement on alignment difficulty still is mostly my current summary of the state of disagreements.

It seems like we have very little idea how technically difficult alignment will be. The simplicia/doomimir debates sum up the logic very nicely, but the distribution of expert opinions seems more telling: people who think about alignment don't know to what extent techniques for aligning LLMs will generalize to transformative AI, AGI, or ASI.

There's a lot of pessimism about the people and organizations that will likely be in charge of building and aligning our first AGIs. I share this pessimism. But it seems quite plausible to me that those people and orgs will take the whole thing slightly more seriously by the time we get there, and actual technical alignment will turn out to be easy eno... (read more)

3Raphael Roche5mo

Don't you think that articles like "Alignment Faking in Large Language Models" by Anthropic show that models can internalize the values present in their training data very deeply, to the point of deploying various strategies to defend them, in a way that is truly similar to that of a highly moral human? After all, many humans would be capable of working for a pro-animal welfare company and then switching to the opposite without questioning it too much, as long as they are paid. Granted, this does not solve the problem of an AI trained on data embedding undesirable values, which we could then lose control over. But at the very least, isn't it a staggering breakthrough to have found a way to instill values into a machine so deeply and in a way similar to how humans acquire them? Not long ago, this might have seemed like pure science fiction and utterly impossible. There are still many challenges regarding AI safety, but isn't it somewhat extreme to be more pessimistic about the issue today than in the past? I read Superintelligence by Bostrom when it was released, and I must say I was more pessimistic after reading it than I am today, even though I remain concerned. But I am not an expert in the field—perhaps my perspective is naïve.

8RHollerith5mo

Eliezer thinks (as do I) that technical progress in alignment is hopeless without first improving the pool of prospective human alignment researchers (e.g., via human cognitive augmentation).

7aysja5mo

Technical progress also has the advantage of being the sort of thing which could make a superintelligence safe, whereas I expect very little of this to come from institutional competency alone.

6aphyer5mo

Buying time for technical progress in alignment...to be made where, and by who?

[-]jimrandomh5mo113

Any of the many nonprofits, academic research groups, or alignment teams within AI labs. You don't have to bet on a specific research group to decide that it's worth betting on the ecosystem as a whole.

There's also a sizeable contingent that thinks none of the current work is promising, and that therefore buying a little time is value mainly insofar as it opens the possibility of buying a lot of time. Under this perspective, that still bottoms out in technical research progress eventually, even if, in the most pessimistic case, that progress has to route through future researchers who are cognitively enhanced.

4Vaniver5mo

I think you need both? That is--I think you need both technical progress in alignment, and agreements and surveillance and enforcement such that people don't accidentally (or deliberately) create rogue AIs that cause lots of problems. I think historically many people imagined "we'll make a generally intelligent system and ask it to figure out a way to defend the Earth" in a way that I think seems less plausible to me now. It seems more like we need to have systems in place already playing defense, which ramp up faster than the systems playing offense.

2Roman Leventov5mo

https://gradual-disempowerment.ai/ is mostly about institutional progress, not narrow technical progress.

[-]habryka5mo*7262

I feel like intelligence enhancement being pretty solidly in the near-term technological horizon provides strong argument for future governance being much better. There are also maybe 3-5 other technologies that seem likely to be achieved in the next 30 years bar AGI that would all hugely improve future AGI governance.

And then a lot of the post seems to make really quite bad arguments against forecasting AI timelines and other technologies, doing so with... I really don't know, a rejection of bayesianism? A random invocation of an asymmetric burden of proof? If anyone learned anything useful from its section on timelines or technological forecasting, please tell me, since it really is among the worst things I have heard Ben Landau Taylor write, who I respect a lot. The stuff as written really makes no sense. I am personally on the longer end of timelines, but none of my reasoning looks anything like that.

Seriously, what are the technological forecasts in this essay:

While there is no firm ground for any prediction as to how long it will take before any technological breakthrough [to substantial intelligence enhancement], if ever, it seems more likely that such a regime w

... (read more)

4Vaniver5mo

I think the position Ben (the author) has on timelines is really not that different from Eliezer's; consider pieces like this one, which is not just about the perils of biological anchors. I think the piece spends less time than I would like on what to do in a position of uncertainty--like, if the core problem is that we are approaching a cliff of uncertain distance, how should we proceed?--but I think it's not particularly asymmetric. [And--there's something I like about realism in plans? If people are putting heroic efforts into a plan that Will Not Work, I am on the side of the person on the sidelines trying to save them their effort, or direct them towards a plan that has a chance of working. If the core uncertainty is whether or not we can get human intelligence advancement in 25 years--I'm on your side of thinking it's plausible--then it seems worth diverting what attention we can from other things towards making that happen, and being loud about doing that.]

[-]Nora_Ammann5mo6317

I found this article ~very poor. Much of the rhetorical moves adopted in the pieces seem largely optimised for making it easy to stay on the "high horse". Talking about a singular AI doomer movement being one of them. Having the stance that AGI is not near and thus there is nothing to worry about is another. Whether or not that's true, it certainly makes it easy to point your finger at folks who are worried and say 'look what silly theater'.

I think it's somewhat interesting to ask whether there should be more coherence across safety efforts, and at the margins, the answer might be yes. But I'm also confused about the social model that suggests that there could be something like a singular safety plan (instead, I think we live in a world where increasingly more people are waking up to the implications of AI progress, and of course there will be diverse and to some extent non-coherent reactions to this), OR that a singular coherent safety plan would be desirable given the complexity and amount of uncertainty invovled in the challenge.

[-]ozziegooen5mo2410

I like reading outsider accounts of things I'm involved in / things I care about.

Just for context for some not aware - The author, Ben Landau-Taylor, has been in the rationalist-extended community for some time now. This post is written on Palladium Magazine, which I believe basically is part of Samo Burja's setup. I think both used to be around Leverage Research and some other rationality/EA orgs.

Ben and Samo have been working on behalf of Palladium and similar for a while now.

My quick read is that this article is analogous to similar takes they've written about/discussed before, which is not too much of a surprise.

I disagree with a lot of their intuitions, but at the same time, I'm happy to have more voices discuss some of these topics.

All this to say, while these people aren't exactly part of the scene now, they're much closer to it than what many might imagine as "outsider accounts."

5Ben_LandauTaylor4mo

Yes. I was properly a member of LW from 2012 to ~2016, and did a lot of community organizing at the time. I’m still friends with a bunch of people here. I could only write this historical account because I saw it develop firsthand and actively participated in its early stages.

[-]Vaniver5mo*158

Instead, the U.S. government will do what it has done every time it’s been convinced of the importance of a powerful new technology in the past hundred years: it will drive research and development for military purposes.

I think this is my biggest disagreement with the piece. I think this is the belief I most wish 10-years-ago-us didn't have, so that we would try something else, which might have worked better than what we got.

Or--in shopping the message around to Silicon Valley types, thinking more about the ways that Silicon Valley is the child of the US military-industrial complex, and will overestimate their ability to control what they create (or lack of desire to!). Like, I think many more 'smart nerds' than military-types believe that human replacement is good.

1JesperO4mo

Why do you believe that US government/military would not be convinced to invest more in AGI/ASI development from being convinced of the potential power in AI?

2Vaniver4mo

The short version is they're more used to adversarial thinking and security mindset, and don't have a culture of "fake it until you make it" or "move fast and break things". I don't think it's obvious that it goes that way, but I think it's not obvious that it goes the other way.

[-]StefanHex5mo1513

I’ve just read the article, and found it indeed very thought provoking, and I will be thinking more about it in the days to come.

One thing though I kept thinking: Why doesn’t the article mention AI Safety research much?

In the passage

The only policy that AI Doomers mostly agree on is that AI development should be slowed down somehow, in order to “buy time.”

I was thinking: surely most people would agree on policies like “Do more research into AI alignment” / “Spend more money on AI Notkilleveryoneism research”?

In general the article frames the policy to “buy time” as to wait for more competent governments or humans, while I find it plausible that progress in AI alignment research could outweigh that effect.

—

I suppose the article is primarily concerned with AGI and ASI, and in that matter I see much less research progress than in more prosaic fields.

That being said, I believe that research into questions like “When do Chatbots scheme?”, “Do models have internal goals?”, “How can we understand the computation inside a neural network?” will make us less likely to die in the next decades.

Then, current rationalist / EA policy goals (including but lot limited to pauses and slow downs of capabilities research) could have a positive impact via the “do more (selective) research” path as well.

6Davidmanheim5mo

Because almost all of current AI safety research can't make future agentic ASI that isn't already aligned with human values safe, as everyone who has looked at the problem seems to agree. And the Doomers certainly have been clear about this, even as most of the funding goes to prosaic alignment.

[-]Knight Lee5mo127

I think the government can speed up alignment more than the government can speed up capabilities, assuming it starts to care much more about both. Why?

AI safety spending is only $0.1 billion while AI capabilities spending is $200 billion. AI safety spending can easily increase by many orders of magnitude, but AI capabilities spending cannot since it already rivals the US military budget.

Also, would you still agree with the "Statement on AI Inconsistency," or disagree with it too?

Statement on AI Inconsistency (v1.0us):
1: ASI threatens the US (and NATO) as much as all military threats combined. Why does the US spend $800 billion/year on its military but less than $0.1 billion/year on AI alignment/safety?
2: ASI being equally dangerous isn't an extreme opinion: the median superforecaster sees a 2.1% chance of an AI catastrophe (killing 1 in 10 people), the median AI expert sees 5%-12%, other experts see 5%, and the general public sees 5%. To justify 8000 times less spending, you must be 99.999% sure of no AI catastrophe, and thus 99.95% sure that you won't realize you were wrong and the majority of experts were right (if you studied the disagreement further).
3: “But military spending i

... (read more)

[-]kman5mo*103

I think I mostly agree with the critique of "pause and do what, exactly?", and appreciate that he acknowledged Yudkowsky as having a concrete plan here. I have many gripes, though.

Whatever name they go by, the AI Doomers believe the day computers take over is not far off, perhaps as soon as three to five years from now, and probably not longer than a few decades. When it happens, the superintelligence will achieve whatever goals have been programmed into it. If those goals are aligned exactly to human values, then it can build a flourishing world beyond our most optimistic hopes. But such goal alignment does not happen by default, and will be extremely difficult to achieve, if its creators even bother to try. If the computer’s goals are unaligned, as is far more likely, then it will eliminate humanity in the course of remaking the world as its programming demands. This is a rough sketch, and the argument is described more fully in works like Eliezer Yudkowsky’s essays and Nick Bostrom’s Superintelligence.
This argument relies on several premises: that superintelligent artificial general intelligence is philosophically possible, and practical to build; that a superintelligence would

... (read more)

[-]samuelshadrach5mo105

One of my opinions on this stuff is that Yudkowsky does not understand politics at all very deep level, and Yudkowskys writings are one the of the main attractors in this space, so lesswrong systematically attracts people who are bad at understanding politics (but may be good at some STEM subject).

[-]Lukas_Gloor5mo2810

[Edit: I wrote my whole reply thinking that you were talking about "organizational politics." Skimming the OP again, I realize you probably meant politics politics. :) Anyway, I guess I'm leaving this up because it also touches on the track record question.]

I thought Eliezer was quite prescient on some of this stuff. For instance, I remember this 2017 dialogue (so less than 2y after OpenAI was founded), which on the surface talks about drones, but if you read the whole post, it's clear that it's meant as an analogy to building AGI:

AMBER:  The thing is, I am a little worried that the head of the project, Mr. Topaz, isn’t concerned enough about the possibility of somebody fooling the drones into giving out money when they shouldn’t. I mean, I’ve tried to raise that concern, but he says that of course we’re not going to program the drones to give out money to just anyone. Can you maybe give him a few tips? For when it comes time to start thinking about security, I mean.
CORAL:  Oh. Oh, my dear, sweet summer child, I’m sorry. There’s nothing I can do for you.
AMBER:  Huh? But you haven’t even looked at our beautiful business model!
CORAL:  I t

... (read more)

5samuelshadrach5mo

I agree Yudkowsky is not incompetent at understanding politics. I’m saying he’s not exceptionally good at it. Basically, he’s average. Just like you and me (until proven otherwise). I didn’t read the entire post, I only skimmed it, but my understanding is this post is Yudkowsky yet again claiming alignment is difficult and that there are some secret insights inside Yudkowsky’s head as to why alignment is hard that can’t be shared in public. I remember reading Yudkowsky versus Christiano debates some years back and they had this same theme of inexplicable insights inside Yudkowkys head. The reasoning about politics in the post you just linked mostly assumes there exist some inexplicable but true insights about alignment difficulty inside Yudkowskys head.

1flandry395mo

I really liked your quote and remarks. So much so, that I made an edited version of them as a new post here: http://mflb.com/ai_alignment_1/d_250207_insufficient_paranoia_gld.html

6Ben Pace5mo

Can I double-click on what "does not understand politics at [a] very deep level" means? Can someone explain what they have in mind? I think Eliezer has probably better models than most of what our political institutions are capable of, and probably isn't very skilled at personally politicking. I'm not sure what other people have in mind.

3samuelshadrach5mo

Sorry for delay in reply. I’m not sure if the two are separable. Let’s say you believe in “great man” theory of history (I.e. few people disproportionately shape history, and not institutions, market forces etc). Then your ability to predict what other great men could do automatically means you may have some of the powers of a great man yourself. Also yes I mean he isn’t exceptionally skilled at either of the two. My bet is there are people who can make significantly better predictions than him, if only they also understood technical details of AI.

[-]Liron5mo86

This article is just saying "doomers are failing to prevent doom for various reasons, and also they might be wrong that doom is coming soon". But we're probably not wrong, and not being doomers isn't a better strategy. So it's a lame article IMO.

[-]Mitchell_Porter5mo83

I can't bring myself to read it properly. The author has an ax to grind, he wants interplanetary civilization and technological progress for humanity, and it's inconvenient to that vision if progress in one form of technology (AI) has the natural consequence of replacing humanity, or at the very least removing it from the driver's seat. So he simply declares "There is No Reason to Think Superintelligence is Coming Soon", and the one doomer strategy he does approve of - the enhancement of human biological intelligence - happens to be one that once again inv... (read more)

[-]the gears to ascension5mo74

He appears to be arguing against a thing, while simultaneously criticizing people; but I appreciate that he seems to do it in ways that are not purely negative, also mentioning times things have gone relatively well (specifically, updating on evidence that folks here aren't uniquely correct), even if it's not enough to make the rest of his points not a criticism.

I entirely agree with his criticism of the strategy he's criticizing. I do think there are more obviously tenable approaches than the "just build it yourself lol" approach or "just don't let anyone... (read more)

[-]romeostevensit5mo71

Notkilleveryonism, why not Omnicidal AI? As in we oppose OAI.

2Milan W5mo

Yes, "we are against Omnicidal AI" is better marketing than "we are for AI Notkilleveryoneism".

[-]Erich_Grunewald5mo61

I think this article far overstates the extent to which these AI policy orgs (maybe with the exception of MIRI? but I don’t think so) are working towards an AI pause, or see the goal of policy/regulation as slowing AI development. (I mean policy orgs, not advocacy orgs.) I see as much more common policy objectives: creating transparency around AI development, directing R&D towards safety research, laying groundwork for international agreements, slowing Chinese AI development, etc. — things that (is the hope) are useful on their own, not because of any effect on timelines.

[-]Nathan Helm-Burger5mo63

But more immediately than that, if AI Doomer lobbyists and activists ... succeed in convincing the U.S. government that AI is the key to the future of all humanity and is too dangerous to be left to private companies, the U.S. government will not simply regulate AI to a halt. Instead, the U.S. government will do what it has done every time it’s been convinced of the importance of a powerful new technology in the past hundred years: it will drive research and development for military purposes

I said exactly this in the comments on Max Tegmark's post...

"If... (read more)

[-]yams5mo50

Question for Ben:

Are you inviting us to engage with the object level argument, or are you drawing attention to the existence of this argument from a not-obviously-unreasonable-source as a phenomenon we are responsible for (and asking us to update on that basis)?

On my read, he’s not saying anything new (concerns around military application are why ‘we’ mostly didn’t start going to the government until ~2-3 years ago), but that he’s saying it, while knowing enough to paint a reasonable-even-to-me picture of How This Thing Is Going, is the real tragedy.

2Ben Pace5mo

The former, but the latter is a valid response too. Someone doing a good job of painting an overall picture is a good opportunity to reflect on the overall picture and what changes to make, or what counter-arguments to present to this account.

[-]Ben_LandauTaylor4mo40

I've responded at length to the top comments here back on my blog.

[-]WillPetillo5mo41

Before jumping into critique, the good:
- Kudos to Ben Pace for seeking out and actively engaging with contrary viewpoints
- The outline of the x-risk argument and history of the AI safety movement seem generally factually accurate

The author of the article makes quite a few claims about the details of PauseAI's proposal, its political implications, the motivations of its members and leaders...all without actually joining the public Discord server, participating in the open Q&A new member welcome meetings (I know this because I host them), or even showing... (read more)

[-]Ben Pace5mo*4-2

I wrote that this "is the best sociological account of the AI x-risk reduction efforts of the last ~decade that I've seen." The line has some disagree reacts inline; I expect this is primarily an expression that the disagree-ers have a low quality assessment of the article, but I would be curious to see links to any other articles or posts that attempt something similar to this one, in order to compare whether they do better/worse/different. I actually can't easily think of any (which is why I felt it was not that bold to say this was the best).

Edit: I've expanded the opening paragraph, to not confuse my comment for me agreeing with the object level assessment of the article..

2Ben Pace5mo

FWIW in my mind I was comparing this to things like Glen Weyl's Why I Am Not a Technocrat, and thought this was much better. (Related: Scott Alexander's response, Weyl's counter-response).

[-]Remmelt5mo30

Of the recent wave of AI companies, the earliest one, DeepMind, relied on the Rationalists for its early funding. The first investor, Peter Thiel, was a donor to Eliezer Yudkowsky’s Singularity Institute for Artificial Intelligence (SIAI, but now MIRI, the Machine Intelligence Research Institute) who met DeepMind’s founder at an SIAI event. Jaan Tallinn, the most important Rationalist donor, was also a critical early investor…
…In 2017, the Open Philanthropy Project directed $30 million to OpenAI…

Good overview of how through AI Safety funders ended up... (read more)

[-]Maloew5mo20

Instead, the U.S. government will do what it has done every time it’s been convinced of the importance of a powerful new technology in the past hundred years: it will drive research and development for military purposes.

I wonder if there is an actual path to alignment-pilling the US government by framing it as a race to solve alignment? That would get them to make military projects focused on aligning AI as quickly as possible, rather than building a hostile god. It also seems like a fairly defensible position politically, with everything being a struggle ... (read more)

[-]tangerine5mo21

This is indeed an interesting sociological breakdown of the “movement”, for lack of a better word.

I think the injection of the author’s beliefs about whether or not short timelines are correct distracting from the central point. For example, the author states the following.

there is no good argument for when [AGI] might be built.

This is a bad argument against worrying about short timelines, bordering on intellectual dishonesty. Building anti-asteroid defenses is a good idea even if you don’t know that one is going to hit us within the next year.

The argument... (read more)

[-]JesperO4mo10

Ben just posted a reply to comments on his Palladium article, including the comments here on LessWrong.

[-]flandry395mo10

There are a lot of issues with the article cited above. Due to the need for more specific text formatting, I wrote up my notes, comments, and objections here:

http://mflb.com/ai_alignment_1/d_250206_asi_policies_gld.html

[-]bhauth5mo10

Yes, this is part of why I didn't post AI stuff in the past, and instead just tried to connect with people privately. I might not have accomplished much, but at least I didn't help OpenAI happen or shift the public perception of AI safety towards "fedora-wearing overweight neckbeards".

[-]Cole Wyeth5mo1-1

He’s right that arguments for short timelines are essentially vibes-based but he completely ignores the value of technical A.I. safety research, which is pretty much the central justification for our case.

5Noosphere895mo

I think the arguments for short timelines are definitely weaker than their proponents usually assume, but they aren't totally vibes based, and while not so probable as to dominate the probability mass, are probable enough to be action guiding: https://www.lesswrong.com/posts/LCNdGLGpq89oRQBih/bayesianism-for-humans-probable-enough I do predict that we will probably have at least 1 more paradigm shift before the endgame, but I'm not so confident in it as to dismiss simple scaling.

[-]TsviBT5mo104

the arguments for short timelines are definitely weaker than their proponents usually assume, but they aren't totally vibes based

Each person with short timelines can repeat sentences that were generated by a legitimate reason to expect short timelines, but many of them did not generate any of those sentences themselves as the result of trying to figure out when AGI would come; their repeating those sentences is downstream of their timelines. In that sense, for many such people, short timelines actually are totally vibes based.

4Eli Tyre5mo

I dispute this characterization. It's normal and appropriate for people's views to update in response to the arguments produced by others. Sure, sometimes people most parrot other people's views, without either developing them independently or even doing evaluatory checks to see if those views seem correct. But most of the time, I think people are doing those checks? Speaking for myself, most of my views on timelines are downstream of ideas that I didn't generate myself. But I did think about those ideas, and evaluate if they seemed true.

[-]TsviBT5mo100

I think people are doing those checks?

No. You can tell because they can't have an interesting conversation about it, because they don't have surrounding mental content (such as analyses of examples that stand up to interrogation, or open questions, or cruxes that aren't stupid). (This is in contrast to several people who can have an interesting conversation about, even if I think they're wrong and making mistakes and so on.)

But I did think about those ideas, and evaluate if they seemed true.

Of course I can't tell from this sentence, but I'm pretty skeptical both of you in particular and of other people in the broad reference class, that most of them have done this in a manner that really does greatly attenuate the dangers of deference.

2Eli Tyre5mo

(I endorse personal call outs like this one.) Why? Forecasting the future is hard, and I expect surprises that deviate from my model of how things will go. But o1 and o3 seem like pretty blatant evidence that reduced my uncertainty a lot. On pretty simple heuristics, it looks like earth now knows how to make a science and engineering superintelligence: by scaling reasoning modes in a self-play-ish regime. I would take a bet with you about what we expect to see in the next 5 years. But more than that, what kind of epistemology do you think I should be doing that I'm not?

5Nick_Tarleton5mo

To be more object-level than Tsvi: o1/o3/R1/R1-Zero seem to me like evidence that "scaling reasoning models in a self-play-ish regime" can reach superhuman performance on some class of tasks, with properties like {short horizons, cheap objective verifiability, at most shallow conceptual innovation needed} or maybe some subset thereof. This is important! But, for reasons similar to this part of Tsvi's post, it's a lot less apparent to me that it can get to superintelligence at all science and engineering tasks.

2TsviBT5mo

I can't tell what you mean by much of this (e.g. idk what you mean by "pretty simple heuristics" or "science + engineering SI" or "self-play-ish regime"). (Not especially asking you to elaborate.) Most of my thoughts are here, including the comments: https://www.lesswrong.com/posts/sTDfraZab47KiRMmT/views-on-when-agi-comes-and-on-strategy-to-reduce Not really into formal betting, but what are a couple Pareto[impressive, you're confident we'll see within 5 years] things? Come on, you know. Actually doubt, and then think it through. I mean, I don't know. Maybe you really did truly doubt a bunch. Maybe you could argue me from 5% omnicide in next ten years to 50%. Go ahead. I'm speaking from informed priors and impressions.

2Rafael Harth5mo

I think this is true but also that "most people's reasons for believing X are vibes-based" is true for almost any X that is not trivially verifiable. And also that this way of forming beliefs works reasonably well in many cases. This doesn't contradict anything you're saying but feels worth adding, like I don't think AI timelines are an unusual topic in that regard.

[-]TsviBT5mo193

Broadly true, I think.

almost any X that is not trivially verifiable

I'd probably quibble a lot with this.

E.g. there are many activities that many people engage in frequently--eating, walking around, reading, etc etc. Knowledge and skill related to those activities is usually not vibes-based, or only half vibes-based, or something, even if not trivially verifiable. For example, after a few times accidentally growing mold on some wet clothes or under a sink, very many people learn not to leave areas wet.

E.g. anyone who studies math seriously must learn to verify many very non-trivial things themselves. (There will also be many things they will believe partly based on vibes.)

I don't think AI timelines are an unusual topic in that regard.

In that regard, technically, yes, but it's not very comparable. It's unusual in that it's a crucial question that affects very many people's decisions. (IIRC, EVERY SINGLE ONE of the >5 EA / LW / X-derisking adjacent funder people that I've talked to about human intelligence enhancement says "eh, doesn't matter, timelines short".) And it's in an especially uncertain field, where consensus should much less strongly be expected to be correct. A... (read more)

4Noosphere895mo

I definitely agree with this, but in their defense, this is to be expected, especially in fast growing fields. Model building is hard, and specialization generally beats trying to deeply understand something in general, so it's not that surprising that many people won't understand why, and this will be the case regardless of the truth value of timelines claims.

[-]Siebe5mo1-4

The AI Doomers are only one of several factions that oppose AI and seek to cripple it via weaponized regulation.

Bad faith

There are also factions concerned about “misinformation” and “algorithmic bias,” which in practice means they think chatbots must be censored to prevent them from saying anything politically inconvenient.

Bad faith

AI Doomer coalition abandoned the name “AI safety” and rebranded itself to “AI alignment.”

Seems wrong

3habryka5mo

(Why do you believe this? I think this is a reasonable gloss of what happened around 2015-2016. I was part of many of those conversations, as I was also part of many of the conversations in which me and others gave up on "AI Alignment" as a thing that could meaningfully describe efforts around existential risk reduction)

1Siebe5mo

Seems to me the name AI safety is currently still widely used, no? As it covers much more than just alignment strategies, by including also stuff like control and governance

2habryka5mo

That's a pretty recent thing! Agree that it has become more used recently (in the last 1-2 years) for practicalreasons.

Moderation Log

Curated and popular this week