Wei_Dai comments on Q&A with new Executive Director of Singularity Institute - Less Wrong

26 Post author: lukeprog 07 November 2011 04:58AM

Comments (177)

Comment author: Wei_Dai 15 November 2011 10:08:25PM 26 points

What I'm afraid of is that a design will be shown to be safe, and then it turns out that the proof is wrong, or the formalization of the notion of "safety" used by the proof is wrong. This kind of thing happens a lot in cryptography, if you replace "safety" with "security". These mistakes are still occurring today, even after decades of research into how to do such proofs and what the relevant formalizations are. From where I'm sitting, proving an AGI design Friendly seems even more difficult and error-prone than proving a crypto scheme secure, probably by a large margin, and there are no decades of time to refine the proof techniques and formalizations. There's a good recent review of the history of provable security, titled "Provable Security in the Real World", which might help you understand where I'm coming from.

Comment author: cousin_it 16 November 2011 02:23:16PM 8 points

Your comment has finally convinced me to study some practical crypto because it seems to have fruitful analogies to FAI. It's especially awesome that one of the references in the linked article is "An Attack Against SSH2 Protocol" by W. Dai.

Comment author: gwern 17 November 2011 01:24:34AM 5 points

Comment author: John_Maxwell_IV 23 March 2012 06:51:19AM 3 points

From where I'm sitting, proving an AGI design Friendly seems even more difficult and error-prone than proving a crypto scheme secure, probably by a large margin, and there are no decades of time to refine the proof techniques and formalizations.

Correct me if I'm wrong, but it doesn't seem as though "proofs" of algorithm correctness fail as frequently as "proofs" of cryptosystem unbreakableness.

Where does your intuition come from that Friendliness proofs are on the same order of reliability as cryptosystem proofs?

Comment author: Wei_Dai 23 March 2012 07:07:14AM 9 points

Interesting question. I guess proofs of algorithm correctness fail less often because:

  1. It's easier to empirically test algorithms to weed out the incorrect ones, so there are fewer efforts to prove conjectures of correctness that are actually false.
  2. It's easier to formalize what it means for an algorithm to be correct than for a cryptosystem to be secure.

In both respects, proving Friendliness seems even worse than proving security.
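
Point 1 can be made concrete with a minimal sketch of empirical testing, assuming we have a trusted reference to compare against. Everything named here (`buggy_max`, `find_counterexample`, the input ranges) is hypothetical and chosen only for illustration; the idea is just that random inputs can expose an incorrect algorithm before anyone tries to prove it correct.

```python
import random


def buggy_max(xs):
    """Hypothetical faulty candidate: it never looks at the last element."""
    best = xs[0]
    for x in xs[1:-1]:  # bug: the slice stops one element early
        if x > best:
            best = x
    return best


def find_counterexample(candidate, reference, trials=1000, seed=0):
    """Compare candidate and reference on random non-empty integer lists.

    Returns the first input on which they disagree, or None if no
    disagreement shows up within the given number of trials.
    """
    rng = random.Random(seed)  # fixed seed so runs are repeatable
    for _ in range(trials):
        xs = [rng.randint(-10, 10) for _ in range(rng.randint(1, 8))]
        if candidate(xs) != reference(xs):
            return xs
    return None


if __name__ == "__main__":
    # The built-in max serves as the assumed-correct reference.
    print("counterexample for buggy_max:", find_counterexample(buggy_max, max))
    print("counterexample for max:", find_counterexample(max, max))
```

A run that finds nothing only shows that no counterexample appeared among the sampled inputs; it does not show that the candidate is correct.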

Comment author: CarlShulman 15 November 2011 10:25:41PM 1 point

What I'm afraid of is that a design will be shown to be safe, and then it turns out that the proof is wrong, or that the formalization of the notion of "safety" used by the proof is wrong.

Thanks for clarifying.

This kind of thing happens a lot in cryptography,

I agree.