Half-rationalists: people who pick up enough of the memes to be really dangerous, but not enough to realise that what they're doing might be foolish. For example, building an AI without adding the friendliness features.
Not everything is about AI and existential risk. We already have a section in the sequences about how knowing about cognitive biases can hurt you. It seems unlikely that anyone is going to get the knowledge base to build an AGI from simply being exposed to a few memes here. If there's one thing we've learned from AI research in the last fifty years, it's that strong AI is really, really hard.
Rationalists with bad goals: someone could rationally set about trying to destroy humanity, just for the lulz.
This seems extremely unlikely. Some humans like making things bad for other people, but those people don't generally want to destroy the world, because the world is where their toys are. Moreover, destroying humanity takes a lot of effort. Outside of building a bad AGI, getting hold of a large nuclear arsenal, engineering a deadly virus, or making very nasty nanotech, humans don't have many options, and all of those are very tough. And people doing scientific research are generally (although certainly not always) people who aren't getting much recognition and are doing it because they want to learn and to help humanity. The people likely to even want to cause large-scale destruction have little overlap with the people who have the capability. The only possible exceptions might be some religious and nationalist fanatics in some countries, but that's not a problem of rationality, and even they can't trigger existential risk events.
Dangerous information discovered: e.g. the rationalist community develops a Theory of Everything that reveals a recipe for a physics disaster (e.g. a cheap way to turn the Earth into a black hole). A non-rationalist decides to exploit this.
This isn't a rationalist-community worry; it's a general worry. As technology improves, individuals have more destructive power. That problem is completely disconnected from rationalists. Even if improved rationality did lead to massive technological leaps, it is rarely the general theories themselves that immediately yield nasty weapons, but rather sophisticated and fairly complicated applications of them, along with a lot of engineering. In 1939 the basic theory behind atomic weapons already existed, but it still took years of massive, secret engineering effort to build one.
It seems unlikely that anyone is going to get the knowledge base to build an AGI from simply being exposed to a few memes here
Agreed; I was remarking on the danger of being exposed to a few memes on the Uber Less Wrong that we seek to become: memes which we may have designed to be very accessible and enticing to lay readers.
Those people don't generally want to destroy the world, because the world is where their toys are
With a population of seven billion, it's hard not to commit a weak version of the typical mind fallacy and assume you can assume any...
In which I worry that the Less Wrong project might go horribly right. This post belongs to my Altruist Support sequence.
Every project needs a risk assessment.
There's a feeling, bubbling just under the surface here at Less Wrong, that we're merely playing at rationality. It's rationality kindergarten. The problem has been expressed in various ways.
And people are starting to look at fixing it. I'm not worried that their attempts - and mine - will fail. Even if they do, at least we'd have fun and learn something.
I'm worried that they will succeed.
What would such a Super Less Wrong community do? Its members would self-improve to the point where they had a good chance of succeeding at most things they put their mind to. They would recruit new rationalists and then optimize that recruitment process, until the community got big. They would develop methods for rapidly generating, classifying and evaluating ideas, so that the only ideas that got tried would be the best that anyone had come up with so far. The group would structure itself so that people's basic social drives - such as their desire for status - worked in the interests of the group rather than against it.
It would be pretty formidable.
What would the products of such a community be? There would probably be a self-help book that works. There would be a practical guide to setting up effective communities. There would be an intuitive, practical guide to human behavior. There would be books, seminars and classes on how to really achieve your goals - and only the materials which actually got results would be kept. There would be a bunch of stuff on the Dark Arts too, no doubt. Possibly some AI research.
That's a whole lot of material that we wouldn't want to get into the hands of the wrong people.
Dangers include:

- Half-rationalists: people who pick up enough of the memes to be really dangerous, but not enough to realise that what they're doing might be foolish. For example, building an AI without adding the friendliness features.
- Rationalists with bad goals: someone could rationally set about trying to destroy humanity, just for the lulz.
- Dangerous information discovered: e.g. the rationalist community develops a Theory of Everything that reveals a recipe for a physics disaster (e.g. a cheap way to turn the Earth into a black hole). A non-rationalist decides to exploit this.
If this is a problem we should take seriously, what are some possible strategies for dealing with it?
In the post title, I have suggested an analogy with AI takeoff. That's not entirely fair; there is probably an upper bound on how effective a community of humans can be, at least until brain implants come along. We're probably talking two orders of magnitude rather than ten. But given that humanity already has technology with slight existential-threat implications (nuclear weapons, rudimentary AI research), I would be worried about a movement that aims to make all of humanity more effective at everything it does.