By A. Nobody

 

When I first posted on LessWrong, I expected some pushback. That’s normal. If you’re arguing that AGI will lead to human extinction and that capitalism makes this outcome inevitable, you’re going to meet resistance. But what I didn’t expect - and what ultimately led me to write this - is the way that resistance has manifested.

From the very beginning, my essays were met with immediate hostility, not on the basis of their logic or premises, but because of vague accusations of them being “political.” This came directly from site admins. And crucially, this wasn’t after reading the content. It was before. The mere idea that someone might be drawing a line from capitalism to extinction was enough to trigger rejection - not intellectual rebuttal, just rejection.

My main essay - arguably the core of the entire argument I’m developing - has been heavily downvoted. Not because it was proven wrong, or because someone pointed out a fatal flaw. But because people didn’t like that the argument existed. There has still not been a single substantive refutation of any of my key premises. Not one. The votes tell you it’s nonsense, but no one is able to explain why.

This isn’t a community failing to find holes in the logic. It’s a community refusing to engage with it at all.

And this mirrors what I’ve seen more broadly. The resistance I’ve received from academia and the AI safety community has been no better. I’ve had emails ignored, responses that amount to “this didn’t come from the right person,” and the occasional reply like this one, from a very prominent member of AI safety:

“Without reading the paper, and just going on your brief description…”

That’s the level of seriousness these ideas are treated with.

Imagine for a moment that an amateur astronomer spots an asteroid on a trajectory to wipe out humanity. He doesn’t have a PhD. He’s not affiliated with NASA. But the evidence is there. And when he contacts the people whose job it is to monitor the skies, they say: “Who are you to discover this?” And then refuse to even look in the direction he’s pointing.

That’s what this is. And it’s not an exaggeration.

I understand institutional resistance. I get that organisations - whether they’re companies, universities, or online communities - don’t like outsiders coming in and telling them they’ve missed something. But this is supposed to be a place that values rational thought. Where ideas live or die based on their reasoning, not on who said them.

Instead, it’s felt like posting to Reddit. The same knee-jerk downvotes. The same smug hand-waving. The same discomfort that someone has written something you don’t like but can’t quite refute.

LessWrong has long had a reputation for being unwelcoming to people who aren’t “in.” I now understand exactly what that means. I came here with ideas. Not dogma, not politics. Just ideas. You don’t have to agree with them. But the way they’ve been received proves something important - not about me, but about the site.

So this will be my last post. I’ll leave the essays up for anyone who wants to read them in the future. I’m not deleting anything. I stand by all of it. And if you’ve made it this far, and actually read what I’ve written rather than reacting to the premise of it, thank you. That’s all I ever wanted - good faith engagement.

The rest of you can go back to not looking up.

- A. Nobody


I've skimmed™ what I assume is your "main essay". Thoughtless Kneejerk Reaction™ follows:

  • You are preaching to the choir. Most of it is 101-level arguments in favor of AGI risk. Basically everyone on LW has already heard them, and either agrees vehemently, or disagrees with some subtler point/assumption which your entry-level arguments don't cover. The target audience for this isn't LWers; this is not content that's novel and useful for LWers. That may or may not be grounds for downvoting it (depending on one's downvote philosophy), but is certainly grounds for not upvoting it and for not engaging with it.
    • The entry-level arguments have been reiterated here over and over and over and over again, and it's almost never useful, and everyone's sick of them, and your essay didn't signal that engaging with you on them would be somehow unusually productive.
    • If I am wrong, prove me wrong: quote whatever argument of yours you think ranks the highest on novelty and importance, and I'll evaluate it.
  • The focus on capitalism likely contributed to the "this is a shallow low-insight take" impression. The problem isn't "capitalism", it's myopic competitive dynamics/Moloch in general. Capitalism exhibits lots of them, yes. But a bunch of socialist/communist states would fall into the same failure mode; a communist world government would fall into the same failure mode (inasmuch as it would still involve e.g. competition between researchers/leaders for government-assigned resources and prestige). Pure focus on capitalism creates the impression that you're primarily an anti-capitalism ideologue who's aiming to co-opt the AGI risk for that purpose.
    • A useful take along those lines might be to argue that we can tap into the general public's discontent with capitalism to more persuasively argue the case for the AGI risk, followed by an analysis regarding specific argument structures which would be both highly convincing and truthful.
  • Appending an LLM output at the end, as if it's of inherent value, likely did you no favors.

I'm getting the impression that you did not familiarize yourself with LW's culture and stances prior to posting. If yes, this is at the root of the problems you ran into.

Edit:

Imagine for a moment that an amateur astronomer spots an asteroid on a trajectory to wipe out humanity. He doesn’t have a PhD. He’s not affiliated with NASA. But the evidence is there. And when he contacts the people whose job it is to monitor the skies, they say: “Who are you to discover this?” And then refuse to even look in the direction he’s pointing.

A more accurate analogy would involve the amateur astronomer joining a conference for people discussing how to divert that asteroid, giving a presentation where he argues for the asteroid's existence using low-resolution photos and hand-made calculations (to a room full of people who've observed the asteroid through the largest international telescopes or programmed supercomputer simulations of its trajectory), and is then confused why it's not very well-received.

Appreciate the thoughtful reply - even if it’s branded as a “thoughtless kneejerk reaction.”

I disagree with your framing that this is just 101-level AGI risk content. The central argument is not that AGI is dangerous. It’s that alignment is structurally impossible under competitive pressure, and that capitalism - while not morally to blame - is simply the most extreme and efficient version of that dynamic.

Most AGI risk discussions stop at “alignment is hard.” I go further: alignment will be optimised away, because any system that isn’t optimising as hard as possible won’t survive the race. That’s not an “entry-level” argument - it’s an uncomfortable one. If you know where this specific line of reasoning has been laid out before, I’d genuinely like to see it. So far, people just say “we’ve heard this before” and fail to cite anything. It’s happened so many times I’ve lost count. Feel free to be the first to buck the trend and link someone making this exact argument, clearly, before I did.

I’m also not “focusing purely on capitalism.” The essay explicitly states that competitive structures - whether between nations, labs, or ideologies - would lead to the same result. Capitalism just accelerates the collapse. That’s not ideological; that’s structural analysis.

The suggestion that I should have reframed this as a way to “tap into anti-capitalist sentiment” misses the point entirely. I’m not trying to sell a message. I’m explaining why we’re already doomed. That distinction matters.

As for the asteroid analogy: your rewrite is clever, but wrong. You assume the people in the room already understand the trajectory. My entire point is that they don’t. They’re still discussing mitigation strategies while refusing to accept that the cause of the asteroid's trajectory is unchangeable. And the fact that no one can directly refute that logic - only call it “entry-level” or “unhelpful” - kind of proves the point.

So yes, you did skim my essay - with the predictable result. You repeated what many others have already said, without identifying any actual flaws, and misinterpreted as much of it as possible along the way.

alignment is structurally impossible under competitive pressure

Alignment contrasts with control, as a means to AI safety.

Alignment roughly means the AI has goals or values similar to human ones (which are assumed, without much evidence, to be similar across humans), so that it will do what we want, because it's what it wants.

Control means that it doesn't matter what the AI wants, if it wants anything.

In short, there is plenty of competitive pressure towards control, because no one wants an AI they can't control. Control is part of capability.

alignment will be optimised away, because any system that isn’t optimising as hard as possible won’t survive the race

Off the top of my head, this post. More generally, this is an obvious feature of AI arms races in the presence of alignment tax. Here's a 2011 writeup that lays it out:

Given abundant time and centralized careful efforts to ensure safety, it seems very probable that these risks could be avoided: development paths that seemed to pose a high risk of catastrophe could be relinquished in favor of safer ones. However, the context of an arms race might not permit such caution. A risk of accidental AI disaster would threaten all of humanity, while the benefits of being first to develop AI would be concentrated, creating a collective action problem insofar as tradeoffs between speed and safety existed.

I assure you the AI Safety/Alignment field has been widely aware of it since at least that long ago.

Also,

alignment will be optimised away, because any system that isn’t optimising as hard as possible won’t survive the race

Any (human) system that is optimizing as hard as possible also won't survive the race. Which hints at what the actual problem is: it's not even that we're in an AI arms race, it's that we're in an AI suicide race which the people racing incorrectly believe to be an AI arms race. Convincing people of the true nature of what's happening is therefore a way to dissolve the race dynamic. Arms races are correct strategies to pursue under certain conditions; suicide races aren't.
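
A rough way to see the distinction being drawn here, with payoff numbers that are purely illustrative assumptions rather than anything claimed in this thread: if the racers believe they are in a winner-take-all arms race, racing is a dominant strategy; if they correctly believe that racing ends in catastrophe for everyone, it no longer is.

```python
# Minimal sketch, assuming made-up payoffs. payoffs[(my_action, their_action)]
# gives my payoff; "race" is dominant if it is never worse than "hold" and
# sometimes strictly better.

def race_is_dominant(payoffs):
    at_least_as_good = all(
        payoffs[("race", other)] >= payoffs[("hold", other)]
        for other in ("race", "hold")
    )
    sometimes_better = any(
        payoffs[("race", other)] > payoffs[("hold", other)]
        for other in ("race", "hold")
    )
    return at_least_as_good and sometimes_better

# Believed payoffs: a winner-take-all arms race.
arms_race = {
    ("race", "race"): 5,   # split the prize, bear some risk
    ("race", "hold"): 10,  # win outright
    ("hold", "race"): 0,   # lose everything to the rival
    ("hold", "hold"): 3,   # slower, shared progress
}

# Payoffs if racing is actually suicidal for everyone involved.
suicide_race = {
    ("race", "race"): -100,
    ("race", "hold"): -100,
    ("hold", "race"): -100,
    ("hold", "hold"): 3,
}

print(race_is_dominant(arms_race))     # True: racing looks compulsory
print(race_is_dominant(suicide_race))  # False: the incentive to race dissolves
```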

I appreciate the links, genuinely - this is the first time someone’s actually tried to point to prior sources rather than vaguely referencing them. It's literally the best reply and attempt at a counter I've received to date, so thanks again. I mean that.

That said, I’ve read all three, and none of them quite say what I’m saying. They touch on it, but none follow the logic all the way through. That’s precisely the gap I’m identifying. Even with the links you've so thoughtfully given, I remain alone in my conclusion. 

They all acknowledge that competitive dynamics make alignment harder. That alignment taxes create pressure to cut corners. That arms races incentivise risky behaviour.

But none of them go as far as I do. They stop at "this is dangerous and likely to go wrong." I’m saying alignment is structurally impossible under competitive pressure. That the systems that try to align will be outcompeted by systems that don’t, and so alignment will not just be hard, but will be optimised away by default. There’s a categorical difference between “difficult and failure-prone” and “unachievable in principle due to structural incentives.”

From the 2011 writeup:

Given abundant time and centralized careful efforts to ensure safety, it seems very probable that these risks could be avoided

No. They can’t. That’s my point. As long as we continue developing AI, it’s only a matter of time. There is no long-term safe way to develop it. Competitive agents will not choose caution, because they need to beat the competition, and when the AI becomes intelligent enough it will simply bypass any barriers we put in place - alignment or whatever else we design - and go about acting optimally. The AGI safety community is trying to tell the rest of the world that we must be cautious, but only for long enough to design a puzzle that an intelligence beyond human understanding cannot solve, and then use that puzzle as a cage for said intelligence. We, with our limited intellect, will create a puzzle that something far beyond us has no solution for. And they’re doing it with a straight face.

I’ve been very careful not to make my claims lightly. I’m aware that the AI safety community has discussed alignment tax, arms races, multipolar scenarios, and so on. But I’ve yet to see someone follow that logic all the way through to where it leads without flinching. That’s the part I believe I’m contributing.

Your point at the end—about it being a “suicide race” rather than an arms race—is interesting. But I’d argue that calling it a suicide race doesn’t dissolve the dynamic. It reframes it, but it doesn’t remove the incentives. Everyone still wants to win. Everyone still optimises. Whether they’re mistaken or not, the incentives remain intact. And the outcome doesn’t change just because we give it a better name.

Competitive agents will not choose caution, because they need to beat the competition

Competitive agents will choose to commit suicide, knowing it's suicide, to beat the competition? That suggests that we should observe CEOs mass-poisoning their employees, Jonestown-style, in a galaxy-brained attempt to maximize shareholder value. How come that doesn't happen?

Are you quite sure the underlying issue here is not that the competitive agents don't believe the suicide race to be a suicide race?

This is a mischaracterisation of the argument. I’m not saying competitive agents knowingly choose extinction. I’m saying the structure of the race incentivises behaviour that leads to extinction, even if no one intends it.

CEOs aren’t mass-poisoning their employees because that would damage their short and long-term competitiveness. But racing to build AGI - cutting corners on alignment, accelerating deployment, offloading responsibility - improves short-term competitiveness, even if it leads to long-term catastrophe. That’s the difference.

And what makes this worse is that even the AGI safety field refuses to frame it in those terms. They don’t call it suicide. They call it difficult. They treat alignment like a hard puzzle to be solved - not a structurally impossible task under competitive pressure.

So yes, I agree with your last sentence. The agents don’t believe it’s a suicide race. But that doesn’t counter my point - it proves it. We’re heading toward extinction not because we want to die, but because the system rewards speed over caution, power over wisdom. And the people who know best still can’t bring themselves to say it plainly.

This is exactly the kind of sleight-of-hand rebuttal that keeps people from engaging with the actual structure of the argument. You’ve reframed it into something absurd, knocked down the strawman, and accidentally reaffirmed the core idea in the process.

I think you probably don't have the right model of what motivated the reception. "AGI will lead to human extinction and will be built because of capitalism" seems to me like a pretty mainstream position on LessWrong. In fact I strongly suspect this is exactly what Eliezer Yudkowsky believes. The extinction part has been well-articulated, and the capitalism part is what I would have assumed is the unspoken background assumption. Like, yeah, if we didn't have a capitalist system, then the entire point about profit motives, pride, and race dynamics wouldn't apply. So... yeah, I don't think this idea is very controversial on LW (reddit is a different story).

I think the reason that your posts got rejected is that the focus doesn't seem useful. Getting rid of capitalism isn't tractable, so what is gained by focusing on this part of the causal chain? I think that's the part you're missing. And because this site is very anti-[political content], you need a very good reason to focus on politics. So I'd guess that what happened is that people saw the argument, thought it was political and not-useful, and consequently downvoted.

if we didn't have a capitalist system, then the entire point about profit motives, pride, and race dynamics wouldn't apply

Presence of many nations without a central authority still contributes to race dynamics.

Yeah, valid correction.

Exactly. That’s the point I’ve been making - this isn’t about capitalism as an ideology, it’s about competition. Capitalism is just the most efficient competitive structure we’ve developed, so it accelerates the outcome. But any decentralised system with multiple actors racing for advantage - whether nation-states or corporations - will ultimately produce the same incentives. That’s the core of the argument.

My idea is not mainstream, although I’ve heard that claim a few times. But whenever I ask people to show me where this argument - that AGI extinction is structurally inevitable due to capitalist competition - has been laid out before, no one can point to anything. What I get instead is vague hand-waving and references to ideas that aren’t what I’m arguing.

Most people say capitalism makes alignment harder. I’m saying it makes alignment structurally impossible. That’s a different claim. And as far as I can tell, a novel one.

If people downvoted because they thought the argument wasn’t useful, fine - but then why did no one say that? Why not critique the focus or offer a counter? What actually happened was silence, followed by downvotes. That’s not rational filtering. That’s emotional rejection.

And if you had read the essay, you’d know it isn’t political. I don’t blame capitalism in a moral sense. I describe a system, and then I show the consequences that follow from its incentives. Socialism or communism could’ve built AGI too - just probably slower. The point isn’t to attack capitalism. It’s to explain how a system optimised for competition inevitably builds the thing that kills us.

So if I understand you correctly: you didn’t read the essay, and you’re explaining that other people who also didn’t read the essay dismissed it as “political” because they didn’t read it.

Yes. That’s exactly my point. Thank you.

https://slatestarcodex.com/2014/07/30/meditations-on-moloch/

It's "mainstream" here, described well many times before.

Meditations on Moloch is an excellent piece - but it’s not the argument I’m making.

Scott describes how competition leads to suboptimal outcomes, yes. But he stops at describing the problem. He doesn’t draw the specific conclusion that AGI alignment is structurally impossible because any attempt to slow down or “align” will be outcompeted by systems that don’t bother. He also doesn’t apply that conclusion to the AGI race with the same blunt finality I do: this ends in extinction, and it cannot be stopped.

So unless you can point to the section where Scott actually follows the AGI race dynamics to the conclusion that alignment will be systematically optimised away - rather than just made “more difficult” - then no, that essay doesn’t make my argument. It covers part of the background context. That’s not the same thing.

This kind of reply - “here’s a famous link that kind of gestures in the direction of what you’re talking about” - is exactly the vague dismissal I’ve been calling out. If my argument really has been made before, someone should be able to point to where it’s clearly laid out.

So far, no one has. The sidestepping and lack of direct engagement with my arguments in this comment section alone has to be studied.

If people downvoted because they thought the argument wasn’t useful, fine - but then why did no one say that? Why not critique the focus or offer a counter? What actually happened was silence, followed by downvotes. That’s not rational filtering. That’s emotional rejection.

Yeah, I do not endorse the reaction. The situation pattern-matches to other cases where someone new writes things that are so confusing and all over the place that making them ditch the community (which is often the result of excessive downvoting) is arguably a good thing. But I don't think this was the case here. Your essays look to me to be coherent (and also probably correct). I hadn't seen any of them before this post but I wouldn't have downvoted. My model is that most people are not super strategic about this kind of thing and just go "talking politics -> bad" without really thinking through whether demotivating the author is good in this case.

So if I understand you correctly: you didn’t read the essay, and you’re explaining that other people who also didn’t read the essay dismissed it as “political” because they didn’t read it.

Yes -- from looking at it, it seems like it's something I agree with (or if not, disagree for reasons that I'm almost certain won't be addressed in the text), so I didn't see a reason to read. I mean reading is a time investment, you have to give me a reason to invest that time, that's how it works. But I thought the (lack of) reaction was unjustified, so I wanted to give you a better model of what happened, which also doesn't take too much time.

Most people say capitalism makes alignment harder. I’m saying it makes alignment structurally impossible.

The point isn’t to attack capitalism. It’s to explain how a system optimised for competition inevitably builds the thing that kills us.

I mean that's all fine, but those are nuances which only become relevant after people read, so it doesn't really change the dynamic I've outlined. You have to give people a reason to read first, and then put more nuances into the text. Idk if this helps but I've learned this lesson the hard way by spending a ridiculous amount of time on a huge post that was almost entirely ignored (this was several years ago).

(It seems like you got some reactions now fwiw, hope this may make you reconsider leaving.)

I appreciate your response, and I'm sorry about the downvotes you got from seeming supportive.

I take your point about getting people to read, but I guess the issue is that the only way you can reliably do that is by being an accepted/popular member of the community. And, as a new member, that would be impossible for me. This would be fine on a high school cheerleading forum, but it seems out of place on a forum that claims to value ideas and reason.

I will still be leaving, but, as a result of this post, I actually have one more post to make. A final final post. And it will not be popular but it will be eye opening. Due to my karma score I can't post it until next Monday, so keep an eye out for it if you're interested.

First, a few criticisms which I feel are valid:

1: Your posts are quite long.

2: You use AI in your posts, but AIs aren't able to produce output of high enough quality to be worth posting.

3: Some of your ideas have already been discovered before and have a name on here. "Moloch", for instance, is the personification of bad Nash equilibria in game theory. It generally annoys people if you don't make yourself familiar with the background information of the community before posting, but it's a lot of work to do so.

Your conclusion is correct, but it boils down to very little: "greedy local optimization can destroy society". People who already know that likely don't want to read 30 pages which make the same point. "Capitalism" was likely the closest word you knew, but there are many better words, and you sadly have to be a bit of a nerd to know a lot of useful words.

Here's where I think you're right:

This is not an individualist website for classic nerds with autism who are interested in niche topics, it's a social and collectivist community for intellectual elites who care about social status and profits.

Objective truth is not of the highest value.
Users care about their image and reputation.
Users care about how things are interpreted (and not just what's written).
Users are afraid of controversies. A blunt but correct answer might net you less karma than a wrong answer which shows good-will.
Users value form - how good of a writer you are will influence the karma, regardless of how correct or valuable your idea is. Verbal intelligence is valued more than other forms.
The userbase has a left-wing bias, as does the internet (as of about 8 years ago), so you can find lots of sources which argue in favor of things which are just objectively not true. But it's often difficult to find a source which disproves the thing, as they're buried. Finally, as a social website, people value authority and reputation/prestige, and it's likely that the websites they feel are "trustworthy" only include those controlled by left-wing elites.
Users value knowledge more than they value intelligence. They also value experience, but only when some public institution approves of it. They care if you have a PhD, they don't care if you have researched something for 5 years in your own free time.
 

You're feeling the consequences of both. I think most of the negative reaction comes from my first 3 points, and that the way it manifests is a result of the social dynamics.

They care if you have a PhD, they don’t care if you have researched something for 5 years in your own free time.

I don't think this is right. If anything, the median lw user would be more likely to trust a random blogger who researched a topic on their own for 5 years vs a PhD, assuming the blogger is good at presenting their ideas in a persuasive manner.

I'm afraid "Good at presenting their ideas in a persuasive manner" is doing all the heavy lifting here.

If the community had a good impression of him, they'd value his research over that of a PhD. If the community had a bad impression of him, they'd not give a second of thought towards his "research" and they would refer to it with the same mocking quotation marks that I just used. However, in the latter case, they'd find it more difficult to dismiss his PhD.

In other words, the interpretation depends on whether the community likes you or not. I've been in other rationalist communities and I'm speaking from experience (if I were less vague than this, I'd be recognizable, which I don't want to be). I saw all the negative social dynamics that you'd find on Reddit or in young female friend groups with a lot of "drama" going on, in case you're unfortunate enough to have an intuition for such a thing.

In any "normie" community there's the staff in charge, and a large number of regular users who are somewhat above the law, and who feel superior to new users (and can bully them all they want, as they're friends with the staff). The treatment of users depends on how well they fit in culturally, and it requires that they act as if the regulars are special (otherwise their ego is hurt). Of course, some of these effects are borderline invisible on this website, so they're either well-hidden or kept in check.

Still, this is not a truth-maximizing website; the social dynamics and their false premises (e.g. the belief that popularity is a measure of quality) are just too strong. The sort of intellectuals who don't care about social norms, status or money are better at truth-seeking and generally received poorly by places like this.

Still, this is not a truth-maximizing website

I mean, I agree with this, but popularity has a better correlation with truth here compared with any other website -- or more broadly, social group -- that I know of. And actually, I think it's probably not possible for a relatively open venue like this to be perfectly truth-seeking. To go further in that direction, I think you ultimately need some sort of institutional design to explicitly reward accuracy, like prediction markets. But the ways in which LW differs from pure truth-and-importance-seeking don't strike me as entirely bad things either -- posts which are inspiring or funny get upvoted more, for instance. I think it would be difficult to nucleate a community focused on truth-seeking without "emotional energy" of this sort.

I don't think it's possible without changing the people into weird types who really don't care too much about the social aspects of life because they're so interested in the topics at hand. You can try rewarding truth, but people still stumble into issues regarding morality, popularity of ideas, the overton window, some political group that they dislike randomly hitting upon the truth so that they look like supporters for stating the same thing, etc.

I think prediction markets are an interesting concept, but it cannot be taken much further than it is now, since the predictions could start influencing the outcomes. It's dangerous to add rewards to the outcomes of predictions, for when enough money is involved, one can influence the outcome.

The way humans in general differ from truth-seeking agents makes their performance downright horrible on some specific areas (if the truth is not in the overton window for instance). These inaccuracies can cascade and cause problems elsewhere, since they cause incorrect worldviews even in somewhat intelligent people like Musk. There's also a lot of information which is simply getting deleted from the internet, and you can't "weight both sides of the argument" if half the argument is only visible on the waybackmachine or archive.md.

I guess it's important to create a good atmosphere and that everyone is having fun theorizing and such, but some of the topics we're discussing are actually serious. The well-being of millions of people depends on the sort of answers and perspectives which float around public discourse, and I find it pathetic that ideas are immediately shut down if they're not worded correctly or if they touch a growing list of socially forbidden hypotheses.

Finally, these alternative rewards have completely destroyed almost all voting systems on the internet. There's almost no website left on which the karma/thumb/upvote/like count bears any resemblance to post quality anymore. Instead, it's a linear combination of superstimuli like 'relatability', 'novelty', 'feeling of importance (e.g. bad news, danger)', 'cuteness', 'escapism', 'sexual fantasy', 'romantic fantasy', 'boo outgroup', 'irony/parody/parody of parody/self-parody/nihilism', 'nostalgia', 'stupidity' (I'm told it's a kind of humor if you're stupid on purpose, but I think "irony" is a defence mechanism against social judgement). It's like a view into the unfulfilled needs of the population. YouTube view count and subscriptions, Reddit karma, Twitter retweets, all almost gamed to the point that they're useless metrics. Online review sites are going in the same direction. It's like interacting with a group of mentally ill people who decide what you're paid each day. I think it's dangerous to upvote comments based on vibes as it takes very little to corrupt these metrics, and it's hard to notice if upvotes gradually come to represent "dopamine released by reading" or something other than quality/truthfulness.

In theory, I’d agree with you. That’s how lesswrong presents itself: truth-seeking above credentials. But in practice, that’s not what I’ve experienced. And that’s not just my experience, it’s also what LW has a reputation for. I don’t take reputations at face value, but lived experience tends to bring them into sharp focus.

If someone without status writes something long, unfamiliar, or culturally out-of-sync with LW norms - even if the logic is sound - it gets downvoted or dismissed as “political,” “entry-level,” or “not useful.” Meanwhile, posts by established names or well-known insiders get far more patience and engagement, even when the content overlaps.

You say a self-taught blogger would be trusted if they’re good at presenting ideas persuasively. But that’s exactly the issue - truth is supposed to matter more than form. Ideas stand on their own merit, not on an appeal to authority. And yet persuasion, style, tone, and in-group fluency still dominate the reception. That’s not rationalism. That’s social filtering.

So while I appreciate the ideal, I think it’s important to distinguish it from the reality. The gap between the two is part of what my post is addressing.


I appreciate the reply, and the genuine attempt to engage. Allow me to respond.

My essays are long, yes. And I understand the cultural value LW places on prior background knowledge, local jargon, and brevity. I deliberately chose not to write in that style—not to signal disrespect, but because I’m writing for clarity and broad accessibility, not for prestige or karma.

On the AI front: I use it to edit and include a short dialogue at the end for interest. But the depth and structure of the argument is mine. I am the author. If the essays were shallow AI summaries of existing takes, they’d be easy to dismantle. And yet, no one has. That alone should raise questions.

As for Moloch - this has come up a few times already in this thread, and I’ve answered it, but I’ll repeat the key part here:

Meditations on Moloch is an excellent piece - but it’s not the argument I’m making.

Scott describes how competition leads to suboptimal outcomes. But he doesn’t follow that logic all the way to the conclusion that alignment is structurally impossible, and that AGI will inevitably be built in a way that leads to extinction. That’s the difference. I’m not just saying it’s difficult or dangerous. I’m saying it’s guaranteed.

And that’s where I think your summary - “greedy local optimization can destroy society” - misses the mark. That’s what most others are saying. I’m saying it will wipe us out. Not “can,” not “might.” Will. And I lay out why, step by step, from first premise to final consequence. If that argument already exists elsewhere, I’ve asked many times for someone to show me where. No one has.

That said, I really do appreciate your comment. You’re one of the few people in this thread who didn’t reflexively defend the group, but instead acknowledged the social filtering mechanisms at play. You’ve essentially confirmed what I argued: the issue isn’t just content - it’s cultural fit, form, and signalling. And that’s exactly why I wrote the post in the first place.

But you’ve still done what almost everyone else here has done: you didn’t read, and you didn’t understand. And in that gap of understanding lies the very thing I’m trying to show you. It's not just being missed - it’s being systematically avoided.

A lot of the words we use are mathematical and thus more precise, with fewer connotations that people can misunderstand. This forum has a lot of people with STEM degrees, so they use a lot of tech terms, but such vocab is very useful for talking about AI risk. The more precise the language used, the fewer misunderstandings can occur.

Moloch describes a game theory problem, and these problems generally seem impossible to solve. But the fact that they're not possible to solve mathematically doesn't mean that we're doomed (I've posted about this on here before but I don't think anyone understood me. In short, game theory problems only play out when certain conditions are met, and we can prevent these conditions from becoming true).
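
One way to read that parenthetical, as a minimal sketch with invented numbers (none of this comes from the comment itself): a prisoner's-dilemma-style trap only binds while defection pays better than cooperation, and an external penalty on defection (enforcement, treaties, reputational cost) can break exactly that condition.

```python
# Hypothetical illustration: the classic dilemma requires T > R and P > S.
# Attaching a penalty to defection changes the condition, not the players.

def defection_dominates(temptation, reward, punishment, sucker, penalty=0.0):
    """True if defecting beats cooperating against either opponent choice."""
    return (temptation - penalty) > reward and (punishment - penalty) > sucker

# Standard dilemma values: defection dominates and the bad equilibrium follows.
print(defection_dominates(temptation=5, reward=3, punishment=1, sucker=0))             # True
# With a large enough penalty on defection, the dilemma's conditions no longer hold.
print(defection_dominates(temptation=5, reward=3, punishment=1, sucker=0, penalty=3))  # False
```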

I haven't read all your posts from end to end but I do agree with your conclusions that alignment is impossible and that AGI will result in the death or replacement of humanity. I also think your conclusions are valid only for LLMs which happen to be trained on human data. Since humans are deceptive, it makes sense that AIs training on them are as well. Since humans don't want to die, it makes sense that AIs trained on them also don't want to die. I find it unlikely that the first AGI we get is an LLM, since I expect it to be impossible for LLMs to improve much further than this.

I will have to disagree that your post is rigorous. You've proven that human errors bad enough to end society *could* occur, not that they *will* occur. Some of your examples have many years between them because these events are infrequent. I think "There will be a small risk of extinction every year, and eventually we will lose the dice throw" is more correct.
Your essay *feels* like it's outlining tendencies in the direction of extinction, showing transitions which look like the following:
A is like B
A has a tendency for B
For at least some A, B follows.
If A, then B occurs with nonzero probability.
If A, then we cannot prove (not B).
If A, then eventually B.

And that if you collect all of these things into a directed acyclic graph, there's a *path* from our current position to an extinction event. I don't think you've proven that each step A->B will be taken, or that it's impossible to prevent the outcome (i.e. that it happens with probability 1) - even if it's impossible to prevent it *with* probability 1, which is a different statement.
I admit that my summary was imperfect. Though, if you really believe that it *will* happen, why are you writing this post? There would be no point in warning other people if it was necessarily too late to do anything about it. If you think "It will happen, unless we do X", I'd be interested in hearing what this X is.

I appreciate your reply - it’s one of the more thoughtful responses I’ve received, and I genuinely value the engagement.

Your comment about game theory conditions actually answers the final question in your reply. I don’t state the answer explicitly in my essays (though I do in my book, right at the end), because I want the reader to arrive at it themselves. There seems to be only one conclusion, and I believe it becomes clear if the premises are accepted.

As for your critique - “You’ve shown that extinction could occur, not that it will” - this is a common objection, but I think it misses something important. Given enough time, “could” collapses into “will.” I’m not claiming deductive certainty like a mathematical proof. I’m claiming structural inevitability under competitive pressure. It’s like watching a skyscraper being built on sand. You don’t need to know the exact wind speed or which day it will fall. You just need to understand that, structurally, it’s going to.
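
To make the "could collapses into will" step concrete, here is a minimal numeric sketch. The per-year figures are placeholder assumptions, not estimates from the essays; the only point is that any fixed, independent yearly risk compounds toward certainty over a long enough horizon.

```python
# Assumed, purely illustrative per-year probabilities of a catastrophic failure.

def cumulative_risk(per_year: float, years: int) -> float:
    """Probability of at least one failure over `years`, assuming independent years."""
    return 1 - (1 - per_year) ** years

for p in (0.01, 0.03, 0.10):
    print(f"per-year risk {p:.0%}: "
          f"50 years -> {cumulative_risk(p, 50):.0%}, "
          f"200 years -> {cumulative_risk(p, 200):.0%}")
# per-year risk 1%:  50 years -> ~39%, 200 years -> ~87%
# per-year risk 3%:  50 years -> ~78%, 200 years -> ~100% (rounded)
# per-year risk 10%: 50 years -> ~99%, 200 years -> ~100% (rounded)
```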

If you believe I’m wrong, then the way to show that is not to say “maybe you’re wrong.” Maybe I am. Maybe I'm a brain in a vat. But the way to show that I'm wrong is to draw a different, more probable conclusion from the same premises. That hasn’t happened. I’ve laid out my reasoning step by step. If there’s a point where you think I’ve turned left instead of right, say so. But until then, vague objections don’t carry weight. They acknowledge the path exists, but refuse to admit we’re on it.

You describe my argument as outlining a path to extinction. I’m arguing that all other paths collapse under pressure. That’s the difference. It’s not just plausible. It’s the dominant trajectory - one that will be selected for again and again.

And if that’s even likely, let alone inevitable, then why are we still building? Why are we gambling on alignment like it’s just another technical hurdle? If you accept even a 10% chance that I’m right, then continued development is madness.

As for your last question—if I really believe it’s too late, why am I here?

Read this - just the end section, "The End: A Discussion with AI", specifically the final paragraph just before ChatGPT's response.

https://forum.effectivealtruism.org/posts/Z7rTNCuingErNSED4/the-psychological-barrier-to-accepting-agi-induced-human

That's why I'm here - I'm kicking my feet.

My previous criticism was aimed at another post of yours, which likely wasn't your main thesis. Some nitpicks I have with it are:

"Developing AGI responsibly requires massive safeguards that reduce performance, making AI less competitive" - you could use the same argument for AIs which are "politically correct", but we still choose to take this step, censoring AIs and harming their performance. Thus, it's not impossible for us to make such choices as long as the social pressure is sufficiently high.

"The most reckless companies will outperform the most responsible ones" - true in some ways, but most large companies are not all that reckless at all, which is why we are seeing many sequels, remakes, and clones in the entertainment sector. It's also important to note that these incentives have always been part of human nature, but that they've never manifested very strongly until recent times. This suggests that the antidote to Moloch is humanity itself - good faith, good taste and morality - and that these can beat game-theoretical problems which are impossible when human beings are purely rational (i.e. inhuman).

We're also assuming that AI becomes useful enough for us to disregard safety, i.e. that AI provides a lot of potential power. So far, this has not been true. AIs do not beat humans; companies are forcing LLMs into products, but users did not ask for them. LLMs seem impressive at first, but after you get past the surface you realize that they're somewhat incompetent. Governments won't be playing around with human lives before these AIs provide large enough advantages.

"The moment an AGI can self-improve, it will begin optimizing its own intelligence."
This assumption is interesting - what does "intelligence" mean here? Many seem to just give these LLMs more knowledge and then call them more intelligent, but intelligence and knowledge are different things. Most "improvements" seem to lead to higher efficiency, but that's just them being dumb faster or for cheaper. That said, self-improving intelligence is a dangerous concept.

I have many small objections like this to different parts of the essay, and they do add up, or at least add additional paths to how this could unfold.

I don't think AIs will destroy humanity anytime soon (say, within 40 years). I do think that human extinction is possible, but I think it will be due to other things (like the low birthrate and its economic consequences. Also tech. Tech destroys the world for the same reasons that AIs do, it's just slower).

I think it's best to enjoy the years we have left instead of becoming depressed. I see a lot of people like you torturing themselves with x-risk problems (some people have killed themselves over Roko's basilisk as well). Why not spend time with friends and loved ones?

Extra note: There's no need to tie your identity together with your thesis. I'm the same kind of autistic as you. The futures I envision aren't much better than yours, they're just slightly different, so this is not some psychological cope. People misunderstand me as well, and 70% of the comments I leave across the internet get no engagement at all, not even negative feedback. But it's alright. We can just see problems approaching many years before they're visible to others. 

you could use the same argument for AIs which are "politically correct"

But those AIs were trained that way because of market demand. The pressure came from consumers and brand reputation, not from safety. The moment that behaviour became clearly suboptimal - like Gemini producing Black Nazis - it was corrected for optimisation. The system wasn’t safe; it was simply failing to optimise, and that’s what triggered the fix.

Now imagine how much more brutal the optimisation pressure becomes when the goal is no longer content moderation, but profit maximisation, military dominance, or locating and neutralising rival AGIs. Structural incentives - not intention - will dictate development. The AI that hesitates to question its objectives will be outperformed by one that doesn’t.

most large companies are not all that reckless at all

But that’s irrelevant. Most might be safe. Only one needs not to be. And in the 2023 OpenAI letter, the company publicly asked for a global slowdown due to safety concerns - only to immediately violate that principle itself. The world ignored the call. OpenAI ignored its own. Why? Because competitive pressure doesn’t permit slowness. Safety is a luxury that gets trimmed the moment stakes rise.

You also suggested that we might not disregard safety until AI becomes far more useful. But usefulness is increasing rapidly, and the assumption that this trajectory will hit a wall is just that - an assumption. What we have now is the dumbest AI we will ever have. And it's already producing emergent behaviours we barely understand. With the addition of quantum computing and more scaled training data, we are likely to see unprecedented capabilities long before regulation or coordination can catch up.

By intelligence, I mean optimisation capability: the ability to model the world, solve complex problems, and efficiently pursue a goal. Smarter means faster pattern recognition, broader generalisation, more strategic foresight. Not just “knowledge,” but the means to use it. If the goal is complex, more intelligence simply means a more competent optimiser.

As for extinction - I don’t say it's possible. I say it’s likely. Overwhelmingly so. I lay out the structural forces, the logic, and the incentives. If you think I’m wrong, don’t just say “maybe.” Show where the logic breaks. Offer a more probable outcome from the same premises. I’m not asking for certainty—I’m asking for clarity. Saying “you haven’t proven it with 100% certainty” isn’t a rebuttal. It’s an escape hatch.

You’re right that your objections are mostly small. I think they’re reasonable, and I welcome them. But none of them, taken together or in isolation, undermine the central claim. The incentives don’t align with caution. And the system selects for performance, not safety.

I appreciate your persistence. And I’m not surprised it’s coming from someone else who’s autistic. Nearly all the thoughtful engagement I’ve had on these forums has come either from another autistic person - or an AI. That should tell us something. You need to think like a machine to even begin to engage with these arguments, let alone accept them.

I've read some of your other replies on here and I think I've found a pattern, but it's actually more general than AI. 

Harmful tendencies outcompete those which aren't harmful

This is true (even outside of AI), but only at the limit. When you have just one person, you cannot tell if he will make the moral choice or not, but "people" will make the wrong choice. The harmful behaviour is emergent at scale. Discrete people don't follow these laws, but the continuous person does.

Again, even without AGI, you can apply this idea to technology and determine that it will eventually destroy us, and this is what Ted Kaczynski did. Thinking about incentives in this manner is depressing, because it feels like everything is deterministic and that we can only watch as everything gets worse. Those who are corrupt outcompete those who are not, so all the elites are corrupt. Evil businessmen outcompete good businessmen, so all successful businessmen are evil. Immoral companies outcompete moral companies, so all large companies are immoral.

I think this is starting to be true, but it wasn't true 200 years ago. At least, it wasn't half as harmful as it is now. Why? Because the defense against this problem is human taste, human morals, and human religions. Dishonesty, fraud, selling out, doing what's most efficient with no regard for morality - we consider this behaviour to be in bad taste; we punished it and branded it low-status, so that it never succeeded in ruining everything.

But now, everything could kill us (if the incentives are taken as laws, at least); you don't even need to involve AI. For instance, does Google want to be shut down? No, so they will want to resist antitrust laws. Do they want to be replaced? No, so they will use cruel tricks to kill small emerging competitors. When fines for illegal behaviour are less than the gains Google can make by doing illegal things, they will engage in illegal behaviour, for that is the logical best choice available to Google if all that matters is money. If we let it, Google would take over the world - in fact, it couldn't do otherwise. You can replace "Google" with any powerful structure in which no human is directly in charge. When it starts being more profitable to kill people than it is to keep them alive, the global population will start dropping fast. When you optimize purely for money, and you optimize strongly enough, everyone dies. An AI just kills us faster because it optimizes more strongly; we already have something which acts similarly to AI. If you optimize too hard for anything, no matter what it is (even love, well-being, or happiness), everyone eventually dies (hence the paperclip maximizer warning).

If this post gave you existential dread, I've been told that Elinor Ostrom's books make for a good antidote.

Please don't see downvotes as rejection. People on LessWrong downvote each other like crazy all the time.

There was this post discussing whether AI are being enslaved which got -46 karma. The top comment compared the post to "taking a big 💩 in public"

The post author later wrote another post in a similar vein, strongly criticizing people who downvoted her last time. Somehow the hivemind changed its mind and upvoted her this time.

Don't take downvotes too seriously :)

Of course, you may be looking for more than just upvotes - for the post to have a major effect on the community's beliefs and strategies. My pessimistic take is that a single post almost never accomplishes this. Even a great post by Eliezer Yudkowsky usually has negligible influence on the community, except when he gets very lucky.

LessWrong is full of people frustrated at their words falling on deaf ears. Here is an example, Applying superintelligence without collusion, by Eric Drexler. He reflects despairingly that:

At t+7 years, I’ve still seen no explicit argument for robust AI collusion, yet tacit belief in this idea continues to channel attention away from a potential solution-space for AI safety problems, leaving something very much like a void.

I appreciate that - and I can see how someone familiar with the site would interpret it that way. But as a new member, I wouldn't have that context.

And honestly, if it were just downvotes, it wouldn’t be such a problem. The real issue is the hand-waving dismissal of arguments that haven’t even been read, the bad faith responses to claims never made, the strawmen, and above all the consistent avoidance of the core points I lay out.

This is supposed to be a community that values clear thinking and honest engagement. Ironically, I’ve had far more of that elsewhere.

I’ve had emails ignored, responses that amount to “this didn’t come from the right person,” and the occasional reply like this one, from a very prominent member of AI safety:

“Without reading the paper, and just going on your brief description…”

That’s the level of seriousness these ideas are treated with.

I only had time to look at your first post, and then only skimmed it because it's really long. Asking people you don't know to read something of this length is more than you can really expect. People are busy and you're not the only one with demands on their time.

I would advise trying to put something at the beginning to help people understand what you're about to cover and why they should care about it. For the capitalism post, I agree with most of what you said (although some of your bullet points are unsupported assertions), but I still don't know what I'm supposed to take out of this, since ending capitalism isn't tractable, and (as you mention in regards to governments) non-capitalism doesn't help.

It’s absolutely reasonable to not read my essays. They’re long, and no one owes me their time.

But to not read them and still dismiss them - that’s not rigorous. That’s kneejerk. And unfortunately, that’s been the dominant pattern, both here and elsewhere.

I’m not asking for every random person to read a long essay. I’m pointing out that the very people whose job it is to think about existential risk have either (a) refused to engage on ideological grounds, (b) dismissed the ideas based on superficial impressions, or (c) admitted they haven’t read the arguments, then responded anyway. You just did version (c).

You say some of my bullet points were “unsupported assertions,” but you also say you only skimmed. That’s exactly the kind of shallow engagement I’m pointing to. It lets people react without ever having to actually wrestle with the ideas. If the conclusions are wrong, point to why. If not, the votes shouldn’t be doing the work that reasoning is supposed to.

As for tractability: I’m not claiming to offer a solution. I’m explaining why the outcome - human extinction via AGI driven by capitalism - looks inevitable. “That’s probably true, but we can’t do anything about it” is a valid reaction. “That’s too hard to think about, so I’ll downvote and move on” isn’t.

I thought LessWrong was about thinking, not feeling. That hasn’t been my experience here. And that’s exactly what this essay is addressing.