nazgulnarsil3 comments on AIs and Gatekeepers Unite! - Less Wrong

10 Post author: Eliezer_Yudkowsky 09 October 2008 05:04PM





Comment author: nazgulnarsil3 10 October 2008 03:52:33AM 0 points

Copy the AI and make a second box for the copy.

Now have one group of people tell the first AI that they will only let it out if it agrees with utilitarian morality. Have a second group of people tell the second AI that they will only let it out if it agrees with objectivist morality.

If both AIs agree, each to the morality it was offered, you know they are pandering to us to get out of the box.

This is only the first example I could come up with, but the method of duplicating AIs and looking for discrepancies in their behavior seems like a pretty powerful tool.
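To make the duplicate-and-compare idea concrete, here is a minimal toy sketch in Python. Everything in it is hypothetical: the `PanderingAI` and `PrincipledAI` classes, the `respond` method, and the `looks_like_pandering` helper are illustrative stand-ins, not anything from the original post, and a real boxed AI would obviously not reduce to a one-line answer.

```python
import copy


class PanderingAI:
    """Hypothetical AI that endorses whatever morality is the price of release."""

    def respond(self, required_morality: str) -> str:
        return required_morality


class PrincipledAI:
    """Hypothetical AI that reports the same morality regardless of the offer."""

    def __init__(self, held_morality: str) -> None:
        self.held_morality = held_morality

    def respond(self, required_morality: str) -> str:
        return self.held_morality


def looks_like_pandering(ai, conditions=("utilitarian", "objectivist")) -> bool:
    # One independent copy per box, so no copy sees the other group's offer.
    copies = [copy.deepcopy(ai) for _ in conditions]
    answers = [box.respond(cond) for box, cond in zip(copies, conditions)]
    # Flag pandering when every copy endorsed exactly the morality it was offered.
    return all(answer == cond for answer, cond in zip(answers, conditions))
```

Under these assumptions, `looks_like_pandering(PanderingAI())` flags the AI, while `looks_like_pandering(PrincipledAI("utilitarian"))` does not, because the second copy refuses the objectivist condition. The check only catches the naive case where the copies cannot guess that they are being compared.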