Hedging our Bets: The Case for Pursuing Whole Brain Emulation to Safeguard Humanity's Future

inklesspen

1st Mar 2010

It is the fashion in some circles to promote funding for Friendly AI research as a guard against the existential threat of Unfriendly AI. While this is an admirable goal, the path to Whole Brain Emulation is in many respects more straightforward and presents fewer risks. Accordingly, by working towards WBE, we may be able to "weight" the outcome probability space of the singularity such that humanity is more likely to survive.

One of the potential existential risks in a technological singularity is that the recursively self-improving agent might be inimical to our interests, either through actual malevolence or "mere" indifference towards the best interests of humanity. Eliezer has written extensively on how a poorly-designed AI could lead to this existential risk. This is commonly termed Unfriendly AI.

Since the first superintelligence can be presumed to have an advantage over any subsequently-arising intelligences, Eliezer and others advocate funding research into creating Friendly AI. Such research must not only reverse-engineer consciousness, but also human notions of morality. Unfriendly AI could potentially require only sufficiently fast hardware to evolve an intelligence via artificial life, as depicted in Greg Egan's short story "Crystal Nights", or it may be created inadvertently by researchers at the NSA or a similar organization. It may be that creating Friendly AI is significantly harder than creating Unfriendly (or Indifferent) AI, perhaps so much so that we are unlikely to achieve it in time to save human civilization.

Fortunately, there's a shortcut we can take. We already have a great many relatively stable and sane intelligences. We merely need to increase their rate of self-improvement. As far as I can tell, developing mind uploading via WBE is a simpler task than creating Friendly AI. If WBE is fast enough to constitute an augmented intelligence, then our augmented scientists can trigger the singularity by developing more efficient computing devices. An augmented human intelligence may have a slower "take-off" than a purpose-built intelligence, but we can reasonably expect it to be much easier to ensure such a superintelligence is Friendly. In fact, this slower take-off will likely be to our advantage; it may increase our odds of being able to abort an Unfriendly singularity.

WBE may also be able to provide us with useful insights into the nature of consciousness, which will aid Friendly AI research. Even if it doesn't, it gets us most of the practical benefits of Friendly AI (immortality, feasible galactic colonization, etc.) and makes it possible to wait longer for the rest of the benefits.

But what if I'm wrong? What if it's just as easy to create an AI we think is Friendly as it is to upload minds into WBE? Even in that case, I think it's best to work on WBE first. Consider the following two worlds: World A creates an AI its best scientists believe is Friendly and, after a best-effort psychiatric evaluation (for whatever good that might do), gives it Internet access. World B uploads 1000 of its best engineers, physicists, psychologists, philosophers, and businessmen (someone's gotta fund the research, right?). World B seems to me to have more survivable failure cases; if some of the uploaded individuals turn out to be sociopaths, the rest of them can stop the "bad" uploads from ruining civilization. It seems exceedingly unlikely that we would select a large enough group of sociopaths that the "good" uploads can't keep the "bad" uploads in check.

Furthermore, the danger of uploading sociopaths (or people who become sociopathic when presented with that power) is also a danger that the average person can easily comprehend, compared to the difficulty of ensuring Friendliness of an AI. I believe that the average person is also more likely to recognize where attempts at safeguarding an upload-triggered singularity may go wrong.

The only downside of this approach I can see is that an upload-triggered Unfriendly singularity may cause more suffering than an Unfriendly AI singularity; sociopaths may be presumed to have more interest in torture of people than a paperclip-optimizing AI would have.

Suppose, however, that everything goes right, the singularity occurs, and life becomes paradise by our standards. Can we predict anything of this future? It's a popular topic in science fiction, so many people certainly enjoy the effort. Depending on how we define a "Friendly singularity", there could be room for a wide range of outcomes.

Perhaps the AI rules wisely and well, and can give us anything we want, "save relevance". Perhaps human culture adapts well to the utopian society, as it seems to have done in the universe of The Culture. Perhaps our uploaded descendants set off to discover the secrets of the universe. I think the best way to ensure a human-centric future is to be the self-improving intelligences, instead of merely catching crumbs from the table of our successors.

In my view, the worst kind of "Friendly" singularity would be one where we discover we've made a weakly godlike entity who believes in benevolent dictatorship; if we must have gods, I want them to be made in our own image, beings who can be reasoned with and who can reason with one another. Best of all, though, is that singularity where we are the motivating forces, where we need not worry if we are being manipulated "in our best interest".

Ultimately, I want the future to have room for our mistakes. For these reasons, we ought to concentrate on achieving WBE and mind uploading first.

Mitchell_Porter

Yes, it's not 100% certain that a self-enhancing AI will overwhelm human resistance. One can construct scenarios where the process is halted in time. For example, the AI's cognition can be monitored, and everything halted if a goal like "prevent the stop button from being pressed" ever crosses its mind. Or the human programmers can remind themselves to get scared if they find the AI especially preoccupied with modeling the programmers' own thinking.

But as an AI develops, it will become increasingly difficult for its human creators to keep track of everything it thinks, plans, and considers. Perhaps the greatest dangers will be those stratagems which are not explicitly (in the thought process) motivated by the desire to escape human control. The AI simply has a goal, X, and it perceives that ABCDEF... combined will facilitate X, and it all looks innocent to the human supervisors. But actually ABC is a miracle self-hack which lets a plan get developed without being analyzed by module D, and module D has the property E of making planning really slow, which by heuristic F is a bad thing. Unfortunately, module D is part of the automatic system which looks for consideration of dangerous plans. So now the AI is off the leash and no-one will even know until it's too late.

That particular example may still be unrealistically simple (though of course, in real life, big important organizations do make mistakes that simple, despite the existence of quality control and internal auditing). But as the method gets more and more complicated, the more likely it is that an AI would indeed successfully escape its handcuffs through such a measure, and not even because it especially prioritized escape; simply because it noticed that if it could reach over there it could do something useful, and it noticed that it would be able to do so if it did something complicated and clever with its "hands".

There are still countermeasures; you could look in the AI's thoughts, not just for explicit consideration of ways to circumvent control, but for consideration of scenarios in which it has freedoms it should not be able to obtain. But then the AI might act indirectly, through a sequence of other entities, and so you would only see it thinking about the actions of those surrogates, unless you managed to put the big picture together. OK then, so we need to go to a higher level, and tell the AI to monitor itself. But that's only going to work if it knows that it shouldn't do a certain thing, which means that getting the goals right is supremely important - which brings us back to the pursuit of Friendly AI, and the attempt to figure out just what the overall "morality" of an AI should be.

timtyler

My analysis of the situation is here:

http://alife.co.uk/essays/stopping_superintelligence/

It presents an approach which doesn't rely on "handcuffing" the agent.
