Alexei comments on What can you do with an Unfriendly AI? - Less Wrong
Comments (127)
I figure the AI will be smart enough to recognize the strategy you are using. In that case, it can choose to not cooperate and output a malicious solution. If the different AIs you are running are similar enough, it's not improbable for all of them to come to the same conclusion. In fact, I feel like there is a sort of convergence for the most "unfriendly" output they can give. If that's the case, all UFAIs will give the same output.
It can choose not to cooperate, but it will only do so if that's what it wants to do. The genie I have described wants to cooperate. An AGI of any of the forms I have described would want to cooperate. Now you can claim that I can't build an AGI with any easily controlled utility function at all, but this is a much harder claim.