cousin_it comments on A model of UDT without proof limits - Less Wrong

13 Post author: cousin_it 20 March 2012 07:41PM

Comments (37)

Comment author: cousin_it 22 March 2012 12:28:37PM *  0 points [-]

U does not receive an action as an argument; U is a program with no arguments that includes A as a subprogram.

Comment author: twanvl 22 March 2012 01:06:10PM 0 points [-]

But you can easily rewrite U() as U'(A'). I think that solves all the problems with self-referential A. At least this works for the example in this post. How would U look for, say, Newcomb's problem?

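    def U(A):
        # predict(A) is the predictor's guess at A's choice (1 = one-box);
        # it is left undefined here, as in the original comment.
        if predict(A) == 1:
            box1, box2 = 1000, 1000000
        else:
            box1, box2 = 1000, 0
        if A() == 1:
            return box2  # one-boxing: take only box2
        else:
            return box1 + box2  # two-boxing: take both boxes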

This still works fine if A is a parameter of U.

Comment author: cousin_it 22 March 2012 01:20:05PM *  1 point [-]

One of the motivations for our branch of decision theory research is that the physics of our world looks more like an argumentless function containing many logical correlates of you than like a function receiving you as an argument. Of course having a general way to derive one representation from the other would be great news for us. You can view the post as supplying one possible way to do just that :-)

Comment author: twanvl 22 March 2012 01:42:28PM 0 points [-]

Mathematically, the two forms are equivalent as far as I can tell. For any U that contains A, you can rewrite it to U=U'(A). If A is a part of U, then in order to prove things about U, A would have to simulate or prove things about itself, which leads to all kinds of problems. However, if A knows that U=U'(A), it can consider different choices x in U'(x) without these problems.
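A minimal sketch of "considering different choices x in U'(x)", under loudly stated assumptions: the argument is the action itself rather than a program, and the predictor is perfect, so its prediction simply equals x. The names U_prime and best_action are hypothetical and do not appear in the post.

```python
def U_prime(x):
    """Newcomb's problem as a function of the action x (1 = one-box, 2 = two-box).

    Assumption: the predictor is perfect, so its prediction equals x.
    """
    prediction = x
    box1 = 1000
    box2 = 1000000 if prediction == 1 else 0
    return box2 if x == 1 else box1 + box2


def best_action(utility, actions=(1, 2)):
    # The agent does not reason about its own source code here: it just
    # evaluates the utility at each candidate action and keeps the best one.
    return max(actions, key=utility)
```

With this particular U', best_action(U_prime) picks one-boxing. Whether a given U can be put into this form in a principled way is exactly what the rest of the thread disputes.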

Comment author: cousin_it 22 March 2012 01:52:44PM *  1 point [-]

For any U that contains A, you can rewrite it to U=U'(A).

...In many different ways. For example, you could rewrite it to a U' that ignores its argument and just repeats the source code of U. Or if U contains one original copy of A and one slightly tweaked but logically equivalent copy, your rewrite might pick up on one but not the other, so you end up defecting in the Prisoner's Dilemma against someone equivalent to yourself. To find the "one true rewrite" from U to U', you'll need the sort of math that we're developing :-)
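To make the ambiguity concrete, here is a rough sketch, assuming a toy Prisoner's Dilemma with illustrative payoffs and, for simplicity, rewrites that take an action rather than a program. All names (payoff, A_tweaked, U1, U2, U3) are hypothetical.

```python
C, D = "C", "D"


def payoff(me, other):
    # Prisoner's Dilemma payoffs for the first player (illustrative numbers).
    return {(C, C): 2, (C, D): 0, (D, C): 3, (D, D): 1}[(me, other)]


def A():
    return C


def A_tweaked():
    # Logically equivalent to A, but textually different.
    return C if True else D


def U():
    # Argumentless world program containing two logical copies of the agent.
    return payoff(A(), A_tweaked())


def U1(x):
    # Rewrite 1: ignores its argument and just repeats U.
    return payoff(A(), A_tweaked())


def U2(x):
    # Rewrite 2: abstracts the original copy of A but misses the tweaked one.
    return payoff(x, A_tweaked())


def U3(x):
    # Rewrite 3: abstracts both copies.
    return payoff(x, x)
```

All three rewrites agree with U() when x is A's actual output, yet maximizing over x gives different advice: U1 is indifferent, U2 recommends defecting, and U3 recommends cooperating.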

Also, do you mean U()=U'(A) (dependence on source code) or U()=U'(A()) (dependence on return value)? There's a subtle difference, and the form U()=U'(A) is much harder to reason about, because U'(X) could do all sorts of silly things when X!=A and still stay equivalent to U when X=A.
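A small sketch of that difference, assuming Python function identity (`X is A`) as a crude stand-in for "X has the same source code as A". The names U_by_value, U_by_source and B are hypothetical.

```python
def A():
    return 1


def U():
    # Argumentless world program that contains A.
    return 10 * A()


def U_by_value(a):
    # U() == U_by_value(A()): depends only on the action A returns.
    return 10 * a


def U_by_source(X):
    # U() == U_by_source(A): depends on the program X itself.
    if X is A:
        return 10 * X()
    return -999  # arbitrary "silly" behaviour for any other program


def B():
    # A different program that returns the same value as A.
    return 1
```

Both rewrites reproduce U() at the real agent, but U_by_source(B) is -999 even though B behaves exactly like A, so the values of U_by_source at X != A say nothing useful about what A should do.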