Nisan comments on To signal effectively, use a non-human, non-stoppable enforcer - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
That's exactly what a not-yet-superintelligent paperclip maximizer would want us to think.
(When Eliezer plays an AI in a box, the AI's views are probably out of sync with Eliezer's views too. There's no rule that says the AI has to be truthful in the AI Box experiment, because there's no such rule about AIs in reality. It's supposed to be maximally persuasive, and you're supposed to resist. If a paperclipper asserts x, the right question to ask yourself is not "What should I do, given x?", but "Why does the paperclipper want me to believe x?" The most general answer, by definition, will be something like "Because the paperclipper is executing an elaborate plan to convert the universe into paperclips, and it believes that my believing x will further that goal to some small or large degree", which is at best orthogonal to "Because x is true", probably even anticorrelated with it, and almost certainly anticorrelated with "Because believing x will further my goals" if you are a human.)
Or "Why does the paperclipper want me to believe it wants me to believe x?", or something with a couple extra layers of recursion.
Or, to flatten the recursion out, "Why did the paperclipper assert x?".
(Tangential cognitive silly time: I notice that I feel literally racist saying things like this around Clippy.)