Snowyowl comments on AI Box Log - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (30)
It's worse than that. The AI could say, "Look, here is a proof of FAI. Here is my code showing that I have implemented the friendliness modification." The proof and the code are utterly convincing, except erroneous in a subtle way that the gatekeeper is not smart enough to detect. Game over.
Three years late, but: there doesn't even have to be an error. The Gatekeeper still loses for letting out a Friendly AI, even if it actually is Friendly.