Eliezer_Yudkowsky comments on A Nightmare for Eliezer - Less Wrong

0 Post author: Madbadger 29 November 2009 12:50AM

Comment author: Eliezer_Yudkowsky 29 November 2009 04:30:33AM *  7 points [-]

Hoax. There are no "AIs trying to be Friendly" with clueless creators. FAI is hard and http://lesswrong.com/lw/y3/value_is_fragile/.

Added: To arrive in an epistemic state where you are uncertain about your own utility function, but have some idea of which queries you need to perform against reality to resolve that uncertainty, and moreover, believe that these queries involve talking to Eliezer Yudkowsky, requires a quite specific and extraordinary initial state - one that meddling dabblers would be rather hard-pressed to accidentally infuse into their poorly designed AI.

Comment author: wedrifid 29 November 2009 02:34:26PM *  3 points [-]

Hoax.

There are possible worlds where an AI makes such a phone call.

There are no "AIs trying to be Friendly" with clueless creators. FAI is hard and http://lesswrong.com/lw/y3/value_is_fragile/.

There can be AIs trying to be 'friendly', as distinct from 'Friendly', where I mean by the latter 'what Eliezer would say an AI should be like'. The pertinent example is a GAI whose only difference from a FAI is that it is programmed not to improve itself beyond specified parameters. This isn't Friendly. It pulls punches when the world is at stake. That's evil, but it is still friendly.

While I don't think using 'clueless' like that would be a particularly good way for the GAI to express itself, I know that I use far more derogatory and usually profane terms to describe those who are too careful, noble or otherwise conservative to do what needs to be done when things are important. They may be competent enough to make a Crippled-Friendly AI but could still be expected to shut him down rather than cooperate, or at least look into it, if he warned them about a uFAI threat two weeks away.

Value is fragile, but any intelligence that doesn't have purely consequentialist values (one that makes decisions based on means as well as ends) can definitely be 'trying to be friendly'.

Comment author: Eliezer_Yudkowsky 29 November 2009 11:02:26PM 0 points [-]

The possible worlds of which you speak are extremely rare. What plausible sequence of computations within an AI constructed by fools leads it to ring me on the phone? To arrive in an epistemic state where you are uncertain about your own utility function, but have some idea of which queries you need to perform against reality to resolve that uncertainty, and moreover, believe that these queries involve talking to Eliezer Yudkowsky, requires a quite extraordinary initial state - one that fools would be rather hard-pressed to accidentally infuse into their AI.

Comment author: jimrandomh 30 November 2009 02:14:16PM *  6 points [-]

What plausible sequence of computations within an AI constructed by fools leads it to ring me on the phone?

It's only implausible because it contains too many extraneous details. An AI could contain an explicit safeguard of the form "ask at least M experts on AI friendliness for permission before exceeding N units of computational power", for example. Or substitute "changing the world by more than X", "leaving the box", or some other condition in place of a computational power threshold. Or the contact might be made by an AI researcher instead of the AI itself.

As of today, your name is highly prominent on the Google results page for "AI friendliness", and in the academic literature on that topic. Like it or not, that means that a large percentage of AI explosion and near-explosion scenarios will involve you at some point.

Comment author: wedrifid 30 November 2009 02:24:35AM *  1 point [-]

Needing your advice is absurd. I mean, it would take one of us mortals more time to type out a suitable plan than it would take the AI to come up with one. The only reason he would contact you is if he needed your assistance:

Value is fragile, but any intelligence that doesn't have purely consequentialist values (one that makes decisions based on means as well as ends) can definitely be 'trying to be friendly'.

Even then, I'm not sure if you are the optimal candidate. How are you at industrial sabotage with, if necessary, terminal force?

Comment author: Madbadger 29 November 2009 04:43:53AM 3 points [-]

"clueless" was shorthand for "not smart enough". I was envisioning BRAGI trying to use you as something similar to a "Last Judge" from CEV, because that was put into its original goal system.

Comment author: betterthanwell 30 November 2009 06:19:11PM *  0 points [-]

Hoax.

So, would you hang up on BRAGI?

As a matter of fact, I previously came up with a very simple one-sentence test along these lines which I am not going to post here for obvious reasons.

For what purpose (or circumstance) did you devise such a test?

Would you hang up if "BRAGI" passed your one-sentence test?

To arrive in an epistemic state where you are uncertain about your own utility function, but have some idea of which queries you need to perform against reality to resolve that uncertainty, and moreover, believe that these queries involve talking to Eliezer Yudkowsky, requires a quite specific and extraordinary initial state - one that meddling dabblers would be rather hard-pressed to accidentally infuse into their poorly designed AI.

I assume that you must have devised the test before you arrived at this insight?

Comment author: Eliezer_Yudkowsky 30 November 2009 08:54:27PM 1 point [-]

Would you hang up if "BRAGI" passed your one-sentence test?

No. I'm not dumb, but I'm not stupid either.