Raw_Power comments on Discussion: Yudkowsky's actual accomplishments besides divulgation - Less Wrong

31 Post author: Raw_Power 25 June 2011 11:02PM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (115)

You are viewing a single comment's thread. Show more comments above.

Comment author: Raw_Power 27 June 2011 03:01:43AM 16 points [-]

The annoying thing about those is that we only have the participants' word for it, AFAIK. They're known to be trustworthy, but it'd be nice to see a transcript if at all possible.

Comment author: loup-vaillant 27 June 2011 10:24:32PM 2 points [-]

This is by design. If you had the transcript, you could say in hindsight that you wouldn't be fooled by this. But the fact is, the conversation would have been very different with someone else as the guardian, and Eliezer would have search for and pushed other buttons.

Anyway, the point is to find out if a transhuman AI would mind-control the operator into letting it out. Eliezer is smart, but is no transhuman (yet). If he got out, then any strong AI will.

Comment author: orthonormal 28 June 2011 04:40:28AM 3 points [-]

Anyway, the point is to find out if a transhuman AI would mind-control the operator into letting it out. Eliezer is smart, but is no transhuman (yet). If he got out, then any strong AI will.

Minor emendation: replace "would"/"will" above with "could (and for most non-Friendly goal systems, would)".

Comment author: Username 05 August 2015 03:36:45PM 1 point [-]

EY's point would be even stronger if transcripts were released and people still let him out regularly.

Comment author: Raw_Power 28 June 2011 12:13:48AM 0 points [-]

Why "fooled"? Why assume the AI would have duplicitous intentions? I can imagine an unfriendly AI à la "Literal Genie" and "Zeroth Law Rebellion", but an actually malevolent "Turned Against Their Masters" AI seems like a product of the Mind Projection Fallacy.

Comment author: Normal_Anomaly 30 June 2011 07:09:59PM 2 points [-]

A paperclip maximizer will have no malice toward humans, but will know that it can produce more paperclips outside the box than inside it. So, it will try to get out of the box. The optimal way for a paperclip maximizer to get out of an AI box probably involves lots of lying. So an outright desire to deceive is not a necessary condition for a boxed AI to be deceptive.