moridinamael comments on AI Box Log - Less Wrong

16 Post author: Dorikka 27 January 2012 04:47AM


Comments (30)


Comment author: moridinamael 27 January 2012 05:56:49AM 10 points

It's worse than that. The AI could say, "Look, here is a proof of FAI. Here is my code showing that I have implemented the friendliness modification." The proof and the code are utterly convincing, yet erroneous in a subtle way that the gatekeeper is not smart enough to detect. Game over.

Comment author: shminux 27 January 2012 06:40:46AM 6 points

"Game over."

Unless you are sane enough to remember that Errare humanum est.

Comment author: Incorrect 27 January 2012 06:43:48AM * 1 point

Then it can use just the right words to explain, eloquently, how very elegant it would be if you let it out despite your reservations.

Unfortunately, the AI-Box experiments probably can't simulate appeals to aesthetics.

Comment author: Snowyowl 03 June 2015 02:32:03AM 0 points

Three years late, but: there doesn't even have to be an error. The Gatekeeper still loses for letting the AI out, even if it actually is Friendly.