gRR comments on A model of UDT with a halting oracle - Less Wrong

41 Post author: cousin_it 18 December 2011 02:18PM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (100)

You are viewing a single comment's thread. Show more comments above.

Comment author: gRR 10 March 2012 06:38:44PM *  0 points [-]

Ok, modification:

def U():
box1 = 1000
box2 = (A_1() == 2) ? 0 : 1000000
return box2 + ((A_2() == 2) ? box1 : 0)

or

def U():
if(A_1() == 'C' and A_2() == 'C') return 2;
if(A_1() == 'D' and A_2() == 'C') return 3;
...

def A():
Forall i try to prove A()==A_i()
Let UX(X) be the function that results from the source of U() if all calls "A_i()", for which the system was able to prove A()==A_i() were changed to "eval(X)".
For some action p, prove that for any possible action q, UX("return q") <= UX("return p").
Return p.

What results should be equivalent to TDT, I think.

Comment author: [deleted] 14 March 2012 06:35:18PM 0 points [-]

If every step in that code is valid, then that would work. In my opinion the shakiest step in your algorithm is the "Forall i" in the first line of A(). I'll probably reason myself into a hole if I try to figure out whether or not it's valid.

Comment author: gRR 14 March 2012 07:07:22PM 0 points [-]

Well, at least in some cases (e.g., if A's code is equal to A_i's code) the proof is immediate. Proof is also possible for minor obvious modifications, like variable renaming, so the agent should still behave better than a quining agent for PD.

My goal for inventing this is to have a consistent agent that can know (can prove) its decision beforehand, at least sometimes, without getting into contradictions...