orthonormal comments on An example of self-fulfilling spurious proofs in UDT - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (39)
Excellent! I wonder how far the malicious behavior can be extended.
Here the problem is that A directly uses a valid but malicious inference module Q. If A were built to enumerate proofs by length and act on the first one of the form "A()==a implies U()==u, and A()!=a implies U()<=u", can U be rewritten to guarantee that a particular spurious proof comes up first?
Or if A uses a halting oracle to check through statements of that form, can U be written (incorporating that oracle) so that the halting oracle fails to return for the genuine counterfactual and only returns "True" for a particular spurious one?
These are reasonable questions. I don't know the answers. Would be cool if someone else figured them out ;-)
My next attempt :) Below, "PA" means Peano Arithmetic.
Proposition: A() returns 1 (if PA is consistent).
Proof:
Let X = "there exist a!=1 and u, such that: A()==a => U()==u and A()!=a => U()<=u".
Let S = "there exists u, such that: A()==1 => U()==u and A()!=1 => U()<u".
Let T = "A()==1".
Let ProvableU(P) = the provability formula of U (PA+Consistent(PA), with proofs of length < N).
Let ProvableA(P) = the provability formula of A (PA).
Let "U⊦ P" mean P is a theorem of U's proof system.
Then:
QED
Sorry for not noticing your comment earlier. It would be nice to have a way to subscribe to comment threads.
I don't understand why the definitions of X and S imply that S => ~X. Does that rely on specific features of your A and U? For example, if we take a different A and U so that A()==2 and U()==0, then X is true for a=2 and u=0, and S is true for u=1.
Use "Subscribe to RSS Feed" on the right.
Here's another interesting followup question. My implementation of Q relies on enumerating all proofs in order, so running Q requires exponential time. Is there a "Henkin-style" implementation of Q that constructs the self-referential proof quickly and directly, maybe using the steps of Löb's theorem or something? That's how I first tried to construct Q after reading your comments, and failed, but it might still be possible.
Maybe something like this:
where IsValidProof(Statement, Proof) is Goedel's Bew function, Proof1 is a formalization of the argument that "if Q() returns a valid proof of X, then A() will return 1, and X will be true", and Proof2 is a formalization of the argument that "a valid proof of A=>B, followed by a valid proof of A, followed by B, is a valid proof".
The proof of the statement S: "A()==1 implies U()==5, and A()!=1 implies U()<=5" exists, so it will be found by both A() and U(). Any proof of another statement of the same form would imply "there was no proof of length <N of statement S", which would be false given high enough N, so it couldn't be found.
I can't parse your source code for U, does it return a value in every case?
To write code in comments, prefix lines with four spaces, like this:
Thanks, edited.
Why is S provable? As far as I can see, you can't easily prove it by Löb's theorem, because "if S is provable then S" is not apparent by inspection of A and U, because A could find a different proof earlier.
I thought it is similar to your proof here.
In Lobian cooperation, the agents search for proofs of only one statement, never stopping early because they found a proof of something else. Your implementation of A doesn't seem to work like that. Or did I misunderstand and your version of A only looks for proofs where A()==1?
I see what you're trying to do, but can you explain why A would return 1?
The intuition is that the proof of "A()==1 implies U()==5, and A()!=1 implies U()<=5", if it exists, would not depend on N, whereas any proof of "A()==2 implies U()==10..." would have to be longer than N, so by making N large enough...
The setup should make "if S has a proof of length < N, then S" apparent by inspection, answering your earlier objection: if S is (<N)-provable, then U() will find the proof (because finding any other proof first would imply U() is unsound), and then return (A()==1 ? 5 : 0), which requires A() to return 1, otherwise U() is unsound again.
I thought it couldn't find any other proofs of length < N, because it would imply there was no proof of S. But this is not a problem if S is false... Ok, modification:
EDIT: Wait, this is not good, now if(A()==2) is unreachable...
EDIT2: No, not actually unreachable, but any proof for a statement of the form "A()==2 => U()==10..." must be of length > N, which is what was needed, I suppose. Still feels like cheating, but I'm not sure why...
What's the intended consequence of A()==2 in your implementation of U? Is it U()==0 or U()==10? And which of those would actually happen?