Hedging our Bets: The Case for Pursuing Whole Brain Emulation to Safeguard Humanity's Future

inklesspen

1st Mar 2010

It is the fashion in some circles to promote funding for Friendly AI research as a guard against the existential threat of Unfriendly AI. While this is an admirable goal, the path to Whole Brain Emulation is in many respects more straightforward and presents fewer risks. Accordingly, by working towards WBE, we may be able to "weight" the outcome probability space of the singularity such that humanity is more likely to survive.

One of the potential existential risks in a technological singularity is that the recursively self-improving agent might be inimical to our interests, either through actual malevolence or "mere" indifference towards the best interests of humanity. Eliezer has written extensively on how a poorly-designed AI could lead to this existential risk. This is commonly termed Unfriendly AI.

Since the first superintelligence can be presumed to have an advantage over any subsequently-arising intelligences, Eliezer and others advocate funding research into creating Friendly AI. Such research must not only reverse-engineer consciousness, but also human notions of morality. Unfriendly AI could potentially require only sufficiently fast hardware to evolve an intelligence via artificial life, as depicted in Greg Egan's short story "Crystal Nights", or it may be created inadvertently by researchers at the NSA or a similar organization. It may be that creating Friendly AI is significantly harder than creating Unfriendly (or Indifferent) AI, perhaps so much so that we are unlikely to achieve it in time to save human civilization.

Fortunately, there's a shortcut we can take. We already have a great many relatively stable and sane intelligences. We merely need to increase their rate of self-improvement. As far as I can tell, developing mind uploading via WBE is a simpler task than creating Friendly AI. If WBE is fast enough to constitute an augmented intelligence, then our augmented scientists can trigger the singularity by developing more efficient computing devices. An augmented human intelligence may have a slower "take-off" than a purpose-built intelligence, but we can reasonably expect it to be much easier to ensure such a superintelligence is Friendly. In fact, this slower take-off will likely be to our advantage; it may increase our odds of being able to abort an Unfriendly singularity.

WBE may also be able to provide us with useful insights into the nature of consciousness, which will aid Friendly AI research. Even if it doesn't, it gets us most of the practical benefits of Friendly AI (immortality, feasible galactic colonization, etc.) and makes it possible to wait longer for the rest of the benefits.

But what if I'm wrong? What if it's just as easy to create an AI we think is Friendly as it is to upload minds via WBE? Even in that case, I think it's best to work on WBE first. Consider the following two worlds: World A creates an AI its best scientists believe is Friendly and, after a best-effort psychiatric evaluation (for whatever good that might do), gives it Internet access. World B uploads 1000 of its best engineers, physicists, psychologists, philosophers, and businessmen (someone's gotta fund the research, right?). World B seems to me to have more survivable failure cases; if some of the uploaded individuals turn out to be sociopaths, the rest of them can stop the "bad" uploads from ruining civilization. It seems exceedingly unlikely that we would select a large enough group of sociopaths that the "good" uploads can't keep the "bad" uploads in check.
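
One rough way to put a number on "exceedingly unlikely" is a binomial tail. The sketch below is a back-of-the-envelope model, not part of the argument itself: it assumes each of the 1000 uploads is independently a sociopath with some base rate (the 4% used here is a purely hypothetical placeholder), and that the "bad" uploads win only if they form an outright majority. It ignores people who become sociopathic once they hold that power. The function name and parameters are illustrative.

```python
import math

def log10_binomial_tail(n, k_min, p):
    """Return log10 of P(X >= k_min) for X ~ Binomial(n, p), with 0 < p < 1.

    Works in log space because the raw probabilities underflow a float.
    """
    log_terms = []
    for k in range(k_min, n + 1):
        # log of the binomial coefficient C(n, k) via log-gamma
        log_comb = math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
        log_terms.append(log_comb + k * math.log(p) + (n - k) * math.log(1 - p))
    # log-sum-exp: factor out the largest term before summing
    peak = max(log_terms)
    total = peak + math.log(sum(math.exp(t - peak) for t in log_terms))
    return total / math.log(10)

N_UPLOADS = 1000       # from the World B scenario
MAJORITY = 501         # assumption: "bad" uploads need an outright majority
ASSUMED_RATE = 0.04    # hypothetical base rate, chosen only for illustration

log10_p = log10_binomial_tail(N_UPLOADS, MAJORITY, ASSUMED_RATE)
print(f"log10 P(sociopath majority) = {log10_p:.1f}")
```

Under these assumptions the result is a large negative number, i.e. a vanishingly small probability. The conclusion is only as good as the independence and majority-threshold assumptions; a selection process that happens to favor sociopaths, or a small group able to seize control without a majority, would change it.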

Furthermore, the danger of uploading sociopaths (or people who become sociopathic when presented with that power) is also a danger that the average person can easily comprehend, compared to the difficulty of ensuring Friendliness of an AI. I believe that the average person is also more likely to recognize where attempts at safeguarding an upload-triggered singularity may go wrong.

The only downside of this approach I can see is that an upload-triggered Unfriendly singularity may cause more suffering than an Unfriendly AI singularity; sociopaths may be presumed to have more interest in torture of people than a paperclip-optimizing AI would have.

Suppose, however, that everything goes right, the singularity occurs, and life becomes paradise by our standards. Can we predict anything of this future? It's a popular topic in science fiction, so many people certainly enjoy the effort. Depending on how we define a "Friendly singularity", there could be room for a wide range of outcomes.

Perhaps the AI rules wisely and well, and can give us anything we want, "save relevance". Perhaps human culture adapts well to the utopian society, as it seems to have done in the universe of The Culture. Perhaps our uploaded descendants set off to discover the secrets of the universe. I think the best way to ensure a human-centric future is to be the self-improving intelligences, instead of merely catching crumbs from the table of our successors.

In my view, the worst kind of "Friendly" singularity would be one where we discover we've made a weakly godlike entity who believes in benevolent dictatorship; if we must have gods, I want them to be made in our own image, beings who can be reasoned with and who can reason with one another. Best of all, though, is that singularity where we are the motivating forces, where we need not worry if we are being manipulated "in our best interest".

Ultimately, I want the future to have room for our mistakes. For these reasons, we ought to concentrate on achieving WBE and mind uploading first.

248 Comments

Jordan

There is going to be value drift even if we get an FAI. Isn't that inherent in extrapolated volition? We don't really want our current values, we want the values we'll have after being smarter and having time to think deeply about them. The route of WBE simply takes the guess work out: actually make people smarter, and then see what the drifted values are. Of course, it's important to keep a large, diverse culture in the process, so that the whole can error correct for individuals that go off the deep end, analogous to why extrapolated volition would be based on the entire human population rather than a single person.

Vladimir_Nesov

> There is going to be value drift even if we get an FAI. Isn't that inherent in extrapolated volition?

No. Progress and development may be part of human preference, but it is entirely OK for a fixed preference to specify progress happening in a particular way, as opposed to other possible ways. Furthermore, preference can be fixed and still not knowable in advance (so that there are no spoilers, and moral progress happens through your effort and is not dictated "from above").

It's not possible to efficiently find out some properties of a program, e...

andreas

Here is a potentially more productive way of seeing this situation: We do want our current preferences to be made reality (because that's what the term preference describes), but we do not know what our preferences look like, part of the reason being that we are not smart enough and do not have enough time to think about what they are. In this view, our preferences are not necessarily going to drift if we figure out how to refer to human preference as a formal object and if we build machines that use this object to choose what to do — and in this view, we certainly don't want our preferences to drift. On the other hand, WBE does not "simply take the guess work out". It may be the case that the human mind is built such that "making people smarter" is feasible without changing preference much, but we don't know that this is the case. As long as we do not have a formal theory of preference, we cannot strongly believe this about any given intervention – and if we do have such a theory, then there exist better uses for this knowledge.