scav comments on Cryptographic Boxes for Unfriendly AI - Less Wrong

Post author: paulfchristiano 18 December 2010 08:28AM


Comments (155)


Comment author: shokwave 20 December 2010 02:36:26PM · 6 points

> no matter how super-intelligent and unfriendly, an AI would be unable to produce some kind of mind-destroying grimoire.

Consider that humans can make, and have made, such grimoires; they call them bibles. All it takes is a nonrational but sufficiently appealing idea, and an imperfect rationalist falls to it. If there's a true hole in the textbook's information, such that it produces unfriendly AI instead of friendly AI, and the AI who wrote the textbook handwaved that hole away, how confident are you that you would spot the best hand-waving ever written?

Comment author: scav 20 December 2010 03:35:37PM · 2 points

Not confident at all. In fact I have seen no evidence for the possibility, even in principle, of provably friendly AI. And if there were such evidence, I wouldn't be able to understand it well enough to evaluate it.

In fact I wouldn't trust such a textbook even written by human experts whose motives I trusted. The problem isn't proving the theorems, it's choosing the axioms.