Eliezer_Yudkowsky comments on What's special about a fantastic outcome? Suggestions wanted. - Less Wrong

Post author: Stuart_Armstrong 11 November 2014 11:04AM

Comments (19)

Comment author: Eliezer_Yudkowsky 12 November 2014 06:16:00AM 2 points

I presume the answer you're looking for isn't "fun theory", but I can't tell from OP whether you're looking for distinguishers from our perspective or from an AI's perspective.

Comment author: Stuart_Armstrong 13 November 2014 06:39:10PM 2 points

I'm looking for something generic that is easy to measure. At a crude level, if the only options were "paperclipper" vs FAI, then we could distinguish those worlds by counting steel content.

So, basically: some more or less objective measure under which good outcomes occur at a higher proportion than the baseline.

Comment author: Eliezer_Yudkowsky 15 November 2014 04:21:10AM 2 points

Merely higher proportion, and we're not worried about the criterion being reverse-engineered? Give a memory expert a large prime number to memorize and talk about outcomes where it's possible to factor a large composite number that has that prime as a factor. Happy outcomes will have that memory expert still be around in some form.

EDIT: No, I take that back because quantum. Some repaired version of the general idea might still work, though.
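Setting aside the quantum caveat, the proposed check is simple modular arithmetic: the expert memorizes a secret prime p, only a composite N with p as a factor is ever made public, and a later claimant proves the memory survived by exhibiting a nontrivial factor of N. A minimal sketch (the specific primes below are illustrative stand-ins, not from the thread, and "large" here would really mean cryptographically large):

```python
# Sketch of the memorized-prime check described above.
# The secret prime p is known only to the memory expert; only the
# composite n = p * q is published. Producing a nontrivial factor
# of n later demonstrates the memorized prime survived.
# (Small illustrative primes; a real version would use huge ones.)

p = 1000003          # secret prime the expert memorizes (illustrative)
q = 1000033          # second factor used to build the public composite
n = p * q            # published composite; factoring it is the test

def passes_check(claimed_factor: int) -> bool:
    """True iff the claimed factor nontrivially divides n."""
    return 1 < claimed_factor < n and n % claimed_factor == 0

assert passes_check(p)       # the surviving expert can answer
assert not passes_check(7)   # a blind guesser almost certainly cannot
```

With genuinely large primes, passing the check without the memorized p would require factoring N, which is the hard step the criterion leans on.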

Comment author: Stuart_Armstrong 16 November 2014 08:21:45AM 1 point

"we're not worried about the criterion being reverse-engineered?"

I'm trying to think of ways that might prevent reverse-engineering...

Comment author: Leonhart 13 November 2014 09:52:36PM 1 point

Smiles, laughter, hugging, the humming or whistling of melodies in a major key, skipping, high-fiving and/or brofisting, loud utterance of "Huzzah" or "Best thing EVER!!!", airborne nanoparticles of cake, streamers, balloons, accordion music? On the assumption that the AI was not explicitly asked to produce these things, of course.

Comment author: Gurkenglas 12 December 2014 06:40:26PM 0 points

If you're planning to simulate an AI in a universe in a box and examine whether the produced universe is good via some process that doesn't allow the AI to talk to you, the AI is just going to figure out that it's being simulated and pretend to be an FAI so that you'll let it loose on the real world. (Note that an AI that pretends to be an FAI maximizes not friendliness but apparent friendliness, so this is no pathway to FAI.)

To a first approximation, having a box that contains an AI anywhere in it output even a few bits of information tends to select those bits that maximize the AI's real-world utility.

(If you have mathematical proofs that no paperclipper can figure out that it's in your box, it's just going to maximize a mix of its apparent-friendliness score and the number of paperclips, whether it is let loose inside the box or in the real world. This costs it little compared to maximizing either paperclips or apparent friendliness alone, because of the tails thing.)