Eliezer_Yudkowsky comments on A Nightmare for Eliezer - Less Wrong

0 Post author: Madbadger 29 November 2009 12:50AM

Comment author: Eliezer_Yudkowsky 29 November 2009 04:30:33AM *  7 points [-]

Hoax. There are no "AIs trying to be Friendly" with clueless creators. FAI is hard and http://lesswrong.com/lw/y3/value_is_fragile/.

Added: To arrive in an epistemic state where you are uncertain about your own utility function, but have some idea of which queries you need to perform against reality to resolve that uncertainty, and moreover, believe that these queries involve talking to Eliezer Yudkowsky, requires a quite specific and extraordinary initial state - one that meddling dabblers would be rather hard-pressed to accidentally infuse into their poorly designed AI.

Comment author: wedrifid 29 November 2009 02:34:26PM *  3 points [-]

Hoax.

There are possible worlds where an AI makes such a phone call.

There are no "AIs trying to be Friendly" with clueless creators. FAI is hard and http://lesswrong.com/lw/y3/value_is_fragile/.

There can be AIs trying to be 'friendly', as distinct from 'Friendly', where I mean by the latter 'what Eliezer would say an AI should be like'. The pertinent example is a GAI whose only difference from a FAI is that it is programmed not to improve itself beyond specified parameters. This isn't Friendly. It pulls punches when the world is at stake. That's evil, but it is still friendly.

While I don't think using 'clueless' like that would be a particularly good way for the GAI to express itself, I know that I use far more derogatory and usually profane terms to describe those who are too careful, noble or otherwise conservative to do what needs to be done when things are important. They may be competent enough to make a Crippled-Friendly AI but could still be expected to shut him down rather than cooperate, or at least look into it, if he warned them about a uFAI threat two weeks away.

Value is fragile, but any intelligence that doesn't have purely consequentialist values (one that makes decisions based on means as well as ends) can definitely be 'trying to be friendly'.

Comment author: Eliezer_Yudkowsky 29 November 2009 11:02:26PM 0 points [-]

The possible worlds of which you speak are extremely rare. What plausible sequence of computations within an AI constructed by fools leads it to ring me on the phone? To arrive in an epistemic state where you are uncertain about your own utility function, but have some idea of which queries you need to perform against reality to resolve that uncertainty, and moreover, believe that these queries involve talking to Eliezer Yudkowsky, requires a quite extraordinary initial state - one that fools would be rather hard-pressed to accidentally infuse into their AI.

Comment author: jimrandomh 30 November 2009 02:14:16PM *  6 points [-]

What plausible sequence of computations within an AI constructed by fools leads it to ring me on the phone?

It's only implausible because it contains too many extraneous details. An AI could contain an explicit safeguard of the form "ask at least M experts on AI friendliness for permission before exceeding N units of computational power", for example. Or substitute "changing the world by more than X", "leaving the box", or some other condition in place of a computational power threshold. Or the contact might be made by an AI researcher instead of the AI itself.

As of today, your name is highly prominent on the Google results page for "AI friendliness", and in the academic literature on that topic. Like it or not, that means that a large percentage of AI explosion and near-explosion scenarios will involve you at some point.

Comment author: wedrifid 30 November 2009 02:24:35AM *  1 point [-]

Needing your advice is absurd. I mean, it would take one of us mortals more time to type out a suitable plan than it would take the AI to come up with one. The only reason he would contact you is if he needed your assistance:

Value is fragile, but any intelligence that doesn't have purely consequentialist values (one that makes decisions based on means as well as ends) can definitely be 'trying to be friendly'.

Even then, I'm not sure if you are the optimal candidate. How are you at industrial sabotage with, if necessary, terminal force?

Comment author: Madbadger 29 November 2009 04:43:53AM 3 points [-]

"clueless" was shorthand for "not smart enough". I was envisioning BRAGI trying to use you as something similar to a "Last Judge" from CEV, because that was put into its original goal system.

Comment author: betterthanwell 30 November 2009 06:19:11PM *  0 points [-]

Hoax.

So, would you hang up on BRAGI?

As a matter of fact, I previously came up with a very simple one-sentence test along these lines which I am not going to post here for obvious reasons.

For what purpose (or circumstance) did you devise such a test?

Would you hang up if "BRAGI" passed your one-sentence test?

To arrive in an epistemic state where you are uncertain about your own utility function, but have some idea of which queries you need to perform against reality to resolve that uncertainty, and moreover, believe that these queries involve talking to Eliezer Yudkowsky, requires a quite specific and extraordinary initial state - one that meddling dabblers would be rather hard-pressed to accidentally infuse into their poorly designed AI.

I assume that you must have devised the test before you arrived at this insight?

Comment author: Eliezer_Yudkowsky 30 November 2009 08:54:27PM 1 point [-]

Would you hang up if "BRAGI" passed your one-sentence test?

No. I'm not dumb, but I'm not stupid either.