# endoself comments on AI reflection problem - Less Wrong Discussion

4 [deleted] 27 November 2011 06:29AM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Sort By: Best

Comment author: 13 December 2011 11:43:15PM *  0 points [-]

PA + ¬Cons(PA) ⊢ ◻¬Cons(PA)

I don't think this is quite what you meant. This means PA + ¬Cons(PA) allows you to prove that you can prove ¬Cons(PA), where the second use of the word 'prove' does not make reference to any specific formal system. The turnstile refers to the specific theory that is being used, here PA + ¬Cons(PA), and the box symbol is used when the meaning is clear from context, but you've used both.

Our difference is between requiring existential proofs (which are not interesting to me, since things can be provable and yet never proved) and constructive proofs (a set whose membership is undecidable, yet which, by showing equivalence between the two systems or between the first and a subsystem of the second, we can accept). Again however, this is not a real limitation, because the constructive proof of a statement lets us infer that the statement is existentially provable. Thus we recover the "sensible" existentially provable statements, and discard the ones which are provable and yet unprovable.

This seems like a description of some constructive logic version of PA. I'm not sure exactly how that would work or if there is a standard constructive axiomatization of the natural numbers, but the set of theorems of constructive logic is a subset of those of classical logic, so this will not be able to prove that some given axiom system is trustable if the original system cannot do so.

You could add constructive theorems if you also add axioms, which I'm not sure whether you're proposing. For example, given a universal Turing machine U and a number x, there is a shortest program p such that U(p) = x and it could be the case, depending on U and x, that PA can prove that such a p exists, but not what it is. Then, by adding axioms that U never halts on certain inputs, it can be possible to deduce the value of p.

I don't see how that could help with the reflection problem though. Do you propose adding soundness of some formal system as an axiom, but preventing Loeb's theorem from applying by switching to constructive logic? I think Loeb's theorem is still valid constructively; it doesn't use negation, but I'm not sure if that is a sufficient criterion to be provable constructively.

Comment author: [deleted] 14 December 2011 01:15:07AM *  0 points [-]

It's called Heyting arithmetic, but no, nowhere above do I mean intuitionistic logic when I say constructive. Or I don't think I do. I recognize Löb's theorem and am not trying restrict any system's expressive power to avoid it (i.e. using Dan Willard's self verifying theories).

Maybe I'll try to prove some sufficient conditions for a Gödel machine to self modify in a toy universe given this perspective. That should make it more convincing / concrete.