Rolf_Nelson2 comments on Pascal's Mugging: Tiny Probabilities of Vast Utilities - Less Wrong

39 Post author: Eliezer_Yudkowsky 19 October 2007 11:37PM

Comment author: Rolf_Nelson2 20 October 2007 05:21:59PM 3 points

Re: "However clever your algorithm, at that level, something's bound to confuse it. Gimme FAI with checks and balances every time."

I agree that a mature Friendly Artificial Intelligence should defer to something like humanity's volition.

However, before it can figure out what humanity's volition is and how to accomplish it, an FAI first needs to:

1. self-improve into trans-human intelligence while retaining humanity's core goals
2. avoid UnFriendly Behavior (for example, murdering people to free up their resources) in the process of doing step (1)

If the AI falls prey to a paradox early on in the process of self-improvement, the FAI has failed and has to be shut down or patched.

Why is that a problem? Because if the AI falls prey to a paradox later on in the process of self-improvement, when it can outsmart human beings, the result could be catastrophic. (As Eliezer keeps pointing out: a rational AI might not *agree* to be patched, just as Gandhi would not *agree* to have his brain modified into becoming a psychopath, and Hitler would not *agree* to have his brain modified to become an egalitarian. All else being equal, rational agents will try to block any actions that would prevent them from accomplishing their current goals.)

So you want to create an elegant (to the point, ideally, of being "provably correct") structure that doesn't need patches or hacks. If you have to constantly patch or hack early on in the process, that increases the chances that you've missed something fundamental, and that the AI will fail later on, when it's too late to patch.