Squark comments on Identity and quining in UDT - Less Wrong Discussion

Post author: Squark, 17 March 2015 08:01PM

Comment author: Squark, 18 March 2015 09:11:38AM

Hi Manfred, thanks for commenting!

> ...what this is is a constructive proof that any decision algorithm has to lose on some problems, because Omega could diagonalize against any algorithm, and then any agent implementing that algorithm is hosed.

See my reply to KnaveOfAllTrades.
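
For concreteness, here is a minimal sketch of the diagonalization Manfred describes (the setup and names are illustrative, not from the original post): Omega simulates the agent's algorithm and makes exactly the predicted option worthless.

```python
def omega_diagonal_problem(algorithm):
    """Omega simulates the algorithm and builds a problem it loses:
    whatever option the algorithm picks is the one that pays nothing."""
    predicted = algorithm()              # Omega's (perfect) simulation
    payoffs = {"A": 1000, "B": 1000}
    payoffs[predicted] = 0               # punish exactly the predicted choice
    return payoffs

def some_fixed_algorithm():
    return "A"

# The diagonalized algorithm gets $0, while any algorithm that chooses
# differently on this same problem instance would still get $1000.
payoffs = omega_diagonal_problem(some_fixed_algorithm)
print(payoffs[some_fixed_algorithm()])   # 0
```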

> As for XDT, I don't see why it shouldn't get the $1M when playing an anti-Newcomb problem. 1,000,000 is bigger than 1,000, after all.

Which anti-Newcomb problem? In the XDT anti-Newcomb problem, $1,000 is the maximal payoff. No decision theory gets more. In the UDT anti-Newcomb problem, XDT gets $1,001,000 while UDT remains with $1,000,000.

> ...we're still waiting on a reasonable system of logical counterfactuals. 3, 4', and if I understand it right, 5, are just not addressing that core problem.

Well, there is more than one remaining problem :) Regarding logical counterfactuals, I think the correct approach is going to be via complexity theory. I hope to write about it later; in the meantime you can check out this. By now I have discovered some problems with the formalism I used there (and a possible path to fixing them), but I think the general direction is right.

Comment author: Manfred, 18 March 2015 07:31:31PM

> > As for XDT, I don't see why it shouldn't get the $1M when playing an anti-Newcomb problem. 1,000,000 is bigger than 1,000, after all.

> Which anti-Newcomb problem? In the XDT anti-Newcomb problem, $1,000 is the maximal payoff. No decision theory gets more.

Right, the XDT ANP. Because this is in fact a decision-controlled problem, only from the perspective of an XDT agent. And so they can simply choose to receive $1M on this problem if they know that that's what they're facing. $1M being bigger than $1000, I think they should do so.

But you do raise a good point, which is that there might be some way to avoid being beaten by other agents on decision-controlled problems, if you give up on maximizing payoff. It might depend on what metric of success you optimize the decision procedure for. If you take the view logically upstream of filling the boxes, the maximum is $1.001M, and success is relative to that. If you take the view downstream, you might be satisfied with $1000 because that's the maximum.
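
The two metrics can be made concrete with the numbers from this comment (a sketch; the variable names are mine):

```python
payoff_received = 1000           # what a two-boxer walks away with here

upstream_max = 1_001_000         # max payoff if the box-filling counts as controllable
downstream_max = 1000            # max payoff once the (empty) box is taken as fixed

print(upstream_max - payoff_received)    # 1000000 of shortfall on the upstream view
print(downstream_max - payoff_received)  # 0: already optimal on the downstream view
```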

Comment author: Squark, 19 March 2015 06:07:32PM

> Right, the XDT ANP. Because this is in fact a decision-controlled problem, only from the perspective of an XDT agent.

It is decision-determined from the perspective of any agent. The payoff only depends on the agent's decision: namely, it's $1,000 for two-boxing and $0 for one-boxing.

> And so they can simply choose to receive $1M on this problem if they know that that's what they're facing. $1M being bigger than $1000, I think they should do so.

Look at the problem from the perspective of the precursor. The precursor knows that XDT two-boxes on the problem. There is no way to change this fact. So one box is going to be empty. Therefore building an XDT agent in this situation is no worse than building any other agent.
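
Squark's argument lends itself to a short sketch (illustrative names; the fill-on-one-box mechanism is an assumption reconstructed from the surrounding discussion):

```python
XDT_DECISION = "two-box"   # a fixed fact the precursor cannot change

def box_contents():
    # Assumed mechanism: the opaque box holds $1M only if XDT is
    # predicted to one-box. Since XDT two-boxes, the box is empty.
    return 1_000_000 if XDT_DECISION == "one-box" else 0

def payoff(agent_decision):
    box = box_contents()
    return box + 1000 if agent_decision == "two-box" else box

# Whatever agent the precursor builds, it faces the same empty box:
print(payoff("two-box"))   # 1000 -- XDT's payoff
print(payoff("one-box"))   # 0 -- a one-boxing agent does strictly worse
```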

Comment author: Manfred, 19 March 2015 08:08:33PM

> It is decision-determined from the perspective of any agent. The payoff only depends on the agent's decision: namely, it's $1,000 for two-boxing and $0 for one-boxing.

Yeah, sorry, I misspoke. The contents of the boxes are controlled by the agent's decision only for an XDT agent.

> Look at the problem from the perspective of the precursor. The precursor knows that XDT two-boxes on the problem. There is no way to change this fact. So one box is going to be empty. Therefore building an XDT agent in this situation is no worse than building any other agent.

I am using XDT here in the sense of "the correct decision algorithm (whatever it is)." An XDT agent, if faced with the XDT anti-Newcomb problem, can, based on its decision, either get $1M or $1k. If it takes the $1M, it loses in the sense that it does worse on this problem than a CDT agent. If it takes the $1k, it loses in the sense that it just took $1k over $1M :P
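
One way to encode Manfred's framing (a sketch; the fill-on-one-box mechanism is my assumption, reconstructed from the payoffs he states): only for an XDT agent do the box contents co-vary with the decision.

```python
def payoff(decision, agent_is_xdt, predicted_xdt_decision="two-box"):
    # Omega fills the box according to its prediction of XDT. For an XDT
    # agent the prediction just is its own decision, so the contents are
    # decision-controlled; for any other agent they are a constant.
    relevant = decision if agent_is_xdt else predicted_xdt_decision
    box = 1_000_000 if relevant == "one-box" else 0   # assumed mechanism
    return box + (1000 if decision == "two-box" else 0)

print(payoff("one-box", agent_is_xdt=True))    # 1000000: the $1M option
print(payoff("two-box", agent_is_xdt=True))    # 1000: the $1k option
# A CDT agent two-boxing while XDT one-boxes beats XDT's $1M:
print(payoff("two-box", agent_is_xdt=False, predicted_xdt_decision="one-box"))  # 1001000
```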

And because XDT's decision controls the contents of the box, when you say "the payoff is $1000 for two-boxing and $0 for one-boxing," you're begging the question about what you think the correct decision algorithm should do.

Comment author: Squark, 22 March 2015 07:47:48PM

> And because XDT's decision controls the contents of the box, when you say "the payoff is $1000 for two-boxing and $0 for one-boxing," you're begging the question about what you think the correct decision algorithm should do.

The problem is in the definition of "correct". From my point of view, the "correct" decision algorithm is the one that a rational precursor should build. That is, it is the algorithm whose instantiation yields the precursor at least as much payoff as instantiating any other algorithm would.
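
As a sketch, that definition is a plain maximization over candidate algorithms (names are illustrative):

```python
def correct_algorithm(candidate_algorithms, precursor_payoff):
    # precursor_payoff(a) = payoff the precursor gets by instantiating a;
    # the "correct" algorithm is simply the argmax (ties ignored here).
    return max(candidate_algorithms, key=precursor_payoff)
```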

Comment author: Manfred, 22 March 2015 11:04:59PM

Well, I agree with you there :P But I think you're cashing this out as the fixed point of a process, rather than as the maximization I am cashing it out as.
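
The contrast Manfred draws might be sketched like this (purely illustrative; neither commenter gives this formalization):

```python
def as_maximization(algorithms, payoff):
    # Manfred's reading: take the argmax over candidate algorithms directly.
    return max(algorithms, key=payoff)

def as_fixed_point(best_response, algorithm, max_iters=100):
    # The fixed-point reading: iterate "which algorithm would this
    # algorithm choose to build?" until the answer reproduces itself.
    for _ in range(max_iters):
        successor = best_response(algorithm)
        if successor == algorithm:
            return algorithm
        algorithm = successor
    return algorithm   # may not have converged within max_iters
```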