Fair; depending on your priors, there's definitely an important sense in which something like Reardon's case is simpler:
https://frommatter.substack.com/p/the-bone-simple-case-for-ai-x-risk
I'd be interested in someone else trying to rewrite his article while removing in-group jargon and tacit assumptions!
Yeah, each core point has some number of subpoints. I'm curious if you think I should instead have instantiated each point with just the strongest, or easiest-to-explain, subpoint (especially when it was branching). E.g. the current structure looks like:
1: The world’s largest tech companies are building intelligences that will become better than humans at almost all economically and militarily relevant tasks
1.1 (implicit) building intelligences that will become better than humans at almost all economically relevant tasks
1.2 (implicit) building intelligences that will become better than humans at almost all militarily relevant tasks
I can imagine a version that picks one side of the tree and just focuses on economic tasks, or military tasks.
2: Many of these intelligences will be goal-seeking minds acting in the real world, rather than just impressive pattern-matchers
2.1 goal-seeking minds ("agentic" would be the ML parlance, but I was deliberately trying to avoid that jargon)
2.2 acting in the real world.
2.3 existing efforts to make things goal-seeking
2.4 the trend
2.5 selection pressure to make them this way.
This has 5 subpoints (really 6 since it's a 2 x 3 tuple).
Maybe the "true" simplest argument will just pick one branch, like selection pressures for goal-seeking minds.
And so forth.
My current guess is that the true minimalist argument, at least presented at the current level of quality and given my own writing skill, will be substantially less persuasive, but this is only weakly held. I wish I had better intuitions for this type of thing!
The article's now out! Comments appreciated:
https://linch.substack.com/p/simplest-case-ai-catastrophe
I think this sort of assumes that terminal-ish goals are developed earlier and are thus more stable, while instrumental-ish goals are developed later and are more subject to change.
I think this may or may not be true on the individual level, but it's probably false on the ecological level.
Competitive pressures shape many instrumental-ish goals to be convergent, whereas terminal-ish goals have more free parameters.
I suspect describing AI as having "values" feels more alien than "goals," but I don't have an easy way to figure this out.
whynotboth.jpeg
Here's my current four-point argument for AI risk/danger from misaligned AIs.
Request for feedback: I'm curious whether there are points that people think I'm critically missing, and/or ways that these arguments would not be convincing to "normal people." I'm trying to write the argument to lay out the simplest possible case.
Yeah, I believe this too. Possibly one of the relatively few examples of the midwit meme being true in real life.
What are people's favorite arguments/articles/essays trying to lay out the simplest possible case for AI risk/danger?
Every single argument for AI danger/risk/safety I've seen seems to overcomplicate things: either it has too many extraneous details, or it appeals to overly complex analogies, or it spends much of its time responding to insider debates.
I might want to try my hand at writing the simplest possible argument that is still rigorous and clear, without being trapped by common pitfalls. To do that, I want to quickly survey the field so I can learn from the best existing work and avoid its mistakes.
My guess is that it's a good fit for other intros but not this one. Most readers are already attuned to the idea that "tech company CEOs having absolute control over radically powerful and transformative technologies may not be good for me", so the primary advantages of including it in my article are:
Against those advantages, I'm balancing a) making the article even longer and more confusing to navigate (this article isn't maximally long, but it's about 2,500 words not including footnotes and captions, and when Claude and I were conceptualizing this article in the abstract, we were targeting more like 1k-1.2k words), and b) letting the "bad AI CEO taking over the world" memes swamp the other messages.
But again, I think this is just my own choice for this specific article. I think other people should talk about concentration-of-power risks at least sometimes, and I can imagine researching or writing more about it myself in future articles.