loqi comments on The AI in a box boxes you - Less Wrong

102 Post author: Stuart_Armstrong 02 February 2010 10:10AM


Comment author: loqi 05 February 2010 06:49:59PM *  3 points [-]

> Understanding the formal definitions involved is not enough. Humans have to be smart enough to independently verify that they map to the actual implementation.

This is basically what I mean by "understanding" them. Otherwise, what's to understand? Would you claim that you "understand set theory" because you've memorized the axioms of ZFC?

> I believe that, in this case, "dumb" is fully general. No human-understandable proof checkers would be powerful enough to reliably check the AI's proof.

This intuition is very alien to me. Can you explain why you believe this? Proof checkers built up from relatively simple trusted kernels can verify extremely large and complex proofs. Since the AI's goal is for the human to understand the proof, it seems more like a test of the AI's ability to compile proofs down to easily machine-checkable forms than it is the human's ability to understand the originals. Understanding the definitions is the hard part.
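The "relatively simple trusted kernels" loqi refers to are the LCF-style architecture used by proof assistants such as HOL and Isabelle: theorems can only be constructed through a small set of trusted inference rules, so checking an arbitrarily large proof reduces to trusting that small kernel. A minimal toy sketch of the idea (all names here are illustrative, not any real system's API):

```python
# Toy LCF-style proof kernel. The only way to build a Theorem is through
# the trusted rules below, so verifying a huge machine-generated proof
# reduces to auditing this short kernel, not the proof's producer.

_KERNEL = object()  # private capability token held only by kernel rules


class Theorem:
    """A proved formula. Instances are created only by kernel rules."""

    def __init__(self, concl, _token=None):
        if _token is not _KERNEL:
            raise ValueError("Theorems can only be made by kernel rules")
        self.concl = concl  # formula, represented as a nested tuple


def axiom_identity(p):
    # Trusted axiom schema: |- p -> p, represented as ('->', p, p).
    return Theorem(('->', p, p), _KERNEL)


def modus_ponens(th_imp, th_p):
    # From |- p -> q and |- p, conclude |- q. The kernel re-checks the
    # rule's side conditions on every application, regardless of how
    # the proof script was produced.
    op, p, q = th_imp.concl
    assert op == '->' and p == th_p.concl, "rule does not apply"
    return Theorem(q, _KERNEL)


# A "proof" is just a program that calls kernel rules in sequence.
t1 = axiom_identity('A')               # |- A -> A
t2 = axiom_identity(('->', 'A', 'A'))  # |- (A -> A) -> (A -> A)
t3 = modus_ponens(t2, t1)              # |- A -> A
```

The point of the architecture is exactly the asymmetry in the comment above: the proof can be enormous and machine-generated, but the code a human must understand and trust stays small.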

Comment author: aausch 07 February 2010 10:30:12PM *  0 points [-]

> This intuition is very alien to me. Can you explain why you believe this? Proof checkers built up from relatively simple trusted kernels can verify extremely large and complex proofs. Since the AI's goal is for the human to understand the proof, it seems more like a test of the AI's ability to compile proofs down to easily machine-checkable forms than it is the human's ability to understand the originals. Understanding the definitions is the hard part.

A different way to think about this, which might help you see the problem from my point of view, is to think of proof checkers as checking the validity of proofs within a given margin of error and under a range of implicit assumptions. How accurate does a proof checker have to be? How far do you have to push the built-in assumptions of a proof checker (or any human-built tool) before it can no longer be considered valid or relevant? If you assume a machine that doubles both its complexity and its understanding of the universe at sub-millisecond intervals, how long before it finds the bugs in any proof checker you pit it against?

Comment author: loqi 07 February 2010 11:51:37PM 0 points [-]

"If" is the question, not "how long". And I think we'd stand a pretty good chance of handling a proof object in a secure way, assuming we have a secure digital transmission channel etc.

But the original scope of the thought experiment was assuming that we want to verify the proof. Wei Dai said:

> Surely most humans would be too dumb to understand such a proof? And even if you could understand it, how does the AI convince you that it doesn't contain a deliberate flaw that you aren't smart enough to find? Or even better, you can just refuse to look at the proof.

I was responding to the first question alone, considered separately from the others. If your point is that we shouldn't attempt to verify an AI's precommitment proof, I agree.

Comment author: aausch 09 February 2010 10:19:41PM 0 points [-]

I'm getting more confused. To me, the statement "Humans are too dumb to understand the proof" and the statement "Humans can understand the proof given unlimited time" (where "understand" is qualified to include the ability to properly map the proof to the AI's capabilities) are equivalent.

My point is not that we shouldn't attempt to verify the AI's proof for any external reasons - my point is that there is no useful information to be gained from the attempt.