Sebastian_Hagen comments on Superintelligence 11: The treacherous turn - Less Wrong Discussion

Post author: KatjaGrace 25 November 2014 02:00AM

Comment author: Sebastian_Hagen 25 November 2014 06:54:39PM

For example, if the AI were contained in a simulation, and inside that simulation were contained in a weak AI box, then it might be much more difficult for it to detect and understand the nature of the simulation than to escape the simulated AI box, and escaping the box would signal a treacherous turn.

That approach sounds problematic. Some of the obvious escape methods would target the minds of the researchers (either through real-time interaction or by embedding messages in its code or output). You could cut off the latter by having strong social rules against looking at anything beyond a one-bit verdict on whether the AI attempted escape, but providing a convincing in-simulation environment would likely be difficult if you can't connect it to the AI's actual creators.
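
To make the "one-bit verdict" idea concrete, here is a minimal sketch in Python of what enforcing that rule mechanically rather than socially might look like. Every name in it (TrialRecord, run_boxed_ai, one_bit_verdict) is invented for illustration; the point is only that the researchers' side of the interface returns a single boolean and discards everything else unread.

    from dataclasses import dataclass

    @dataclass
    class TrialRecord:
        """Everything observed during one simulated run of the boxed AI."""
        transcript: str          # anything the AI wrote or emitted inside the box
        escape_attempted: bool   # flagged by the in-simulation monitor

    def run_boxed_ai() -> TrialRecord:
        """Stand-in for actually running the AI inside the simulated weak box."""
        return TrialRecord(transcript="<discarded unread>", escape_attempted=False)

    def one_bit_verdict(record: TrialRecord) -> bool:
        """The only value that ever leaves the sealed evaluation environment.

        Researchers see True or False; the transcript is deleted without being
        read, so embedded messages never reach a human mind.
        """
        return record.escape_attempted

    if __name__ == "__main__":
        print(one_bit_verdict(run_boxed_ai()))

Even with that discipline, the difficulty noted above remains: the verdict only means something if the simulated environment is convincing enough that the AI's escape attempt reflects what it would do in the real deployment.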

Or it could be kept in one's back pocket, and pulled out whenever an AI researcher claims in public that they've solved friendliness.

That strategy could be very dangerous. It'd work on the less ambitious/arrogant sort of researcher; the more confident sort might well follow up with "I'll just go and implement this, and get all the credit for saving the world single-handedly" instead of saying anything in public, never giving you the chance to pull out your challenge.