Comment author: Wei_Dai 05 February 2010 11:44:38AM *  2 points [-]

I don't see why this is outside of UDT's domain. It seems straightforward to model and solve the decision problem in UDT1. Here's the world program:

def P(color):
    outcome = "die"
    if Omega_Predict(S, "you're wrong") == color:
        if S("") == color:
            outcome = "live"
    else:
        if S("you're wrong") == color:
            outcome = "live"

Assuming a preference to maximize the occurrence of outcome="live" averaged over P("green") and P("red"), UDT1 would conclude that the optimal S returns a constant, either "green" or "red", and do that.
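The claim about the optimal S can be checked by brute force. The following is a minimal sketch, not part of the original world program: it assumes S is a deterministic lookup from message to color, that Omega_Predict simply returns what S would output, and that preference is measured by counting "live" outcomes over both colors. The helper names (world, survival_count, best_strategies) are hypothetical.

```python
from itertools import product

COLORS = ("green", "red")
MESSAGES = ("", "you're wrong")


def omega_predict(strategy, message):
    # Assumption: Omega predicts perfectly, so it just evaluates the strategy.
    return strategy[message]


def world(strategy, color):
    # Same branching as P(color), with S modelled as a dict lookup.
    outcome = "die"
    if omega_predict(strategy, "you're wrong") == color:
        if strategy[""] == color:
            outcome = "live"
    else:
        if strategy["you're wrong"] == color:
            outcome = "live"
    return outcome


def survival_count(strategy):
    # Number of worlds (one per color) in which the agent lives.
    return sum(world(strategy, color) == "live" for color in COLORS)


def all_strategies():
    # Every deterministic map from message to color (four in total).
    for choices in product(COLORS, repeat=len(MESSAGES)):
        yield dict(zip(MESSAGES, choices))


def best_strategies():
    scored = [(survival_count(s), s) for s in all_strategies()]
    top = max(score for score, _ in scored)
    return top, [s for score, s in scored if score == top]
```

Under these assumptions, best_strategies() reports that only the two constant strategies survive in one of the two worlds, while the two message-dependent strategies survive in neither.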

BTW, do you find this "world program" style analysis useful? I don't want to over-do them and get people annoyed. (I refrained from doing this for the problem described in Gary's post, since it doesn't mention UDT at all, and therefore I'm assuming you want to find a TDT-only solution.)

Comment author: Gary_Drescher 05 February 2010 03:07:55PM 2 points [-]

(I refrained from doing this for the problem described in Gary's post, since it doesn't mention UDT at all, and therefore I'm assuming you want to find a TDT-only solution.)

Yes, I was focusing on a specific difficulty in TDT, but I certainly have no objection to bringing UDT into the thread too. (I myself haven't yet gotten around to giving UDT the attention I think it deserves.)

Comment author: Eliezer_Yudkowsky 05 February 2010 02:49:52AM 1 point [-]

By "unsolvable" I mean that you're screwed over in final outcomes, not that TDT fails to have an output.

The interesting part of the problem is that, whatever you decide, you deduce facts about the background such that you know that what you are doing is the wrong thing. However, if you do anything differently, you would have to make a different deduction about the background facts, and again know that what you were doing was the wrong thing. Since we don't believe that our decision is capable of affecting the background facts, the background facts ought to be a fixed constant, and we should be able to alter our decision without affecting the background facts... however, as soon as we do so, our inference about the unalterable background facts changes. It's not 100% clear how to square this with TDT.

Comment author: Gary_Drescher 05 February 2010 02:19:18PM *  0 points [-]

By "unsolvable" I mean that you're screwed over in final outcomes, not that TDT fails to have an output.

Oh ok. So it's unsolvable in the same sense that "Choose red or green. Then I'll shoot you." is unsolvable. Sometimes choice really is futile. :) [EDIT: Oops, I probably misunderstood what you're referring to by "screwed over".]

The interesting part of the problem is that, whatever you decide, you deduce facts about the background such that you know that what you are doing is the wrong thing.

Yes, assuming that you're the sort of algorithm that can (without inconsistency) know its own choice here before the choice is executed.

If you're the sort of algorithm that may revise its intended action in response to the updated deduction, and if you have enough time left to perform the updated deduction, then the (previously) intended action may not be reliable evidence of what you will actually do, so it fails to provide sound reason for the update in the first place.

Comment author: wedrifid 05 February 2010 07:04:26AM *  0 points [-]

Let:

  • M be 'There is $1M in the big box'

When:

  • D(M) = true, D(!M) = true, E = true

Omega fails.

  • D(M) = true, D(!M) = true, E = false

Omega chooses M or !M. I get $1M or $0.

  • D(M) = true, D(!M) = false, E = true

Omega chooses M=false. I get $0.1M.

  • D(M) = true, D(!M) = false, E = false

Omega chooses M=true. I get $1M.

  • D(M) = false, D(!M) = false, E = true

Omega chooses either M or !M. I get either $1.1M or $0.1M, depending on Omega's whims.

  • D(M) = false, D(!M) = false, E = false

Omega has no option. I make Omega look like a fool.

So, depending on how 'Omega is wrong' is resolved I use either D(M) = M or D(M) = false.

  • If Omega is just infallible then when D(M) = false, !E just never happens and I get either $0.1M or $1.1M depending on Omega's whims. Since I'm being a smart ass I probably get $0.1M. So I use D(M) = M and get expected payout of $0.91M.
  • If Omega resolves "I am wrong" to "I give maximum payout" then I choose D(M) = false and get $1.1M (or sometimes either $1.1M or $0.1M).
  • If Omega resolves "I am wrong" to "I give minimum payout" then I once again get $0.1M when D(M) = false and E.

These are the conclusions of Wedrifid-Just-Works-It-Out Decision Theory. It should match TDT when TDT is formulated right (and I don't make a mistake).

Comment author: Gary_Drescher 05 February 2010 01:46:43PM 1 point [-]

When:

D(M) = true, D(!M) = true, E = true

Omega fails.

No, but it seems that way because I neglected in my OP to supply some key details of the transparent-boxes scenario. See my new edit at the end of the OP.

Comment author: rwallace 05 February 2010 04:16:13AM 0 points [-]

In the setup in question, D goes into an infinite loop (since in the general case it must call a copy of C, but because the box is transparent, C takes as input the output of D).

In Eliezer's similar red/green problem, if the simulation is fully deterministic and the initial conditions are the same, then the simulator must be lying, because he must've told the same thing to the first instance, at a time when there had been no previous copy. (If those conditions do not hold, then the solution is to just flip a coin and take your 50-50 chance.)

Are these still problems when you change them to fix the inconsistencies?

Comment author: Gary_Drescher 05 February 2010 01:31:25PM *  0 points [-]

In the setup in question, D goes into an infinite loop (since in the general case it must call a copy of C, but because the box is transparent, C takes as input the output of D).

No, because by stipulation here, D only simulates the hypothetical case in which the box contains $1M, which does not necessarily correspond to the output of D (see my earlier reply to JGWeissman: http://lesswrong.com/lw/1qo/a_problem_with_timeless_decision_theory_tdt/1kpk).

Comment author: JGWeissman 05 February 2010 01:27:39AM *  2 points [-]

I think this problem is based (at least in part) on an incoherence in the basic transparent box variant of Newcomb's problem.

If the subject of the problem will two-box if he sees the big box has the million dollars, but will one-box if he sees the big box is empty, then there is no action Omega could take to satisfy the conditions of the problem.

In this variant that introduces the digit of pi, there is an unknown bit such that whatever strategy the subject takes, there is a value of that bit that allows Omega an action consistent with the conditions. However, that does not mean the bit actually has that value; it may in fact have the other value, in which case the problem is still not coherent.

I suspect that there is still something this says about TDT, but I am not sure how to illustrate it with an example that does not also have the problem I have described.

Edit: As I was typing this, Eliezer posted his reply, including "an unsolvable problem that should stay unsolvable", which is equivalent to the problem I have described.

Comment author: Gary_Drescher 05 February 2010 02:12:29AM *  3 points [-]

I think this problem is based (at least in part) on an incoherence in the basic transparent box variant of Newcomb's problem.

If the subject of the problem will two-box if he sees the big box has the million dollars, but will one-box if he sees the big box is empty, then there is no action Omega could take to satisfy the conditions of the problem.

The rules of the transparent-boxes problem (as specified in Good and Real) are: the predictor conducts a simulation that tentatively presumes there will be $1M in the large box, and then puts $1M in the box (for real) iff the simulation showed one-boxing. So the subject you describe gets an empty box and one-boxes, but that doesn't violate the conditions of the problem, which do not require the empty box to be predictive of the subject's choice.
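The rule as stated here can be traced mechanically. This is a minimal sketch under the assumption that a subject is a function from what it sees (whether the big box holds $1M) to a choice; the names contrarian and run_transparent_boxes are hypothetical and are not taken from Good and Real.

```python
def contrarian(sees_million):
    # The subject described above: two-box on seeing $1M, one-box on seeing an empty box.
    return "two-box" if sees_million else "one-box"


def run_transparent_boxes(subject):
    # Step 1: the predictor simulates the subject, tentatively presuming $1M is in the box.
    simulated_choice = subject(True)
    # Step 2: the real box gets $1M iff the simulation showed one-boxing.
    box_full = simulated_choice == "one-box"
    # Step 3: the real subject sees the real box and chooses.
    actual_choice = subject(box_full)
    return box_full, actual_choice
```

For the contrarian subject this returns an empty box followed by one-boxing. Nothing in the procedure requires the empty box to match the real choice, so no rule is violated.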

Comment author: Eliezer_Yudkowsky 05 February 2010 01:16:44AM 7 points [-]

And this was my reply:

This is an unfinished part of the theory that I've also thought about, though your example puts it very crisply (you might consider posting it to LW?)

My current thoughts on resolution tend to see two main avenues:

1) Construct a full-blown DAG of math and Platonic facts, an account of which mathematical facts make other mathematical facts true, so that we can compute mathematical counterfactuals.

2) Treat mathematical knowledge that we learn by genuinely mathematical reasoning differently from mathematical knowledge that we learn by physical observation. In this case we know (D xor E) not by mathematical reasoning, but by physically observing a box whose state we believe to be correlated with D xor E. This may justify constructing a causal DAG with a node descending from D and E, so a counterfactual setting of D won't affect the setting of E.
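As a rough illustration of avenue (2), the sketch below contrasts inferring E from an observed box with performing surgery on D in a three-node graph where the box descends from both D and E. It assumes the box state is exactly (D xor E) and treats D and E as plain booleans; the function names are hypothetical, and this is only a toy model, not a formulation of TDT.

```python
def box_state(d, e):
    # The box node descends from both D and E: full iff (D xor E).
    return d != e


def infer_e_from_observation(box_full, d):
    # Evidential reading: given the observed box and a value of D, solve for E.
    return box_full != d


def intervene_on_d(observed_e, new_d):
    # Causal reading: E is a root node, so surgery on D leaves it fixed;
    # only the descendant (the box) is recomputed.
    return {"D": new_d, "E": observed_e, "box": box_state(new_d, observed_e)}
```

With a full box, infer_e_from_observation(True, True) gives E = False while infer_e_from_observation(True, False) gives E = True, so the inferred E flips with D. By contrast, intervene_on_d(False, False) keeps E = False and changes only the box.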

Currently I'd say that (2) looks like the better avenue. Can you come up with an improper mathematical dependency where E is inferred from D, and shouldn't be seen as counterfactually affected, based on mathematical reasoning only without postulating the observation of a physical variable that descends from both E and D?

Incidentally, note that an unsolvable problem that should stay unsolvable is as follows: I'm asked to pick red or green, and told "A simulation of you given this information as well picked the wrong color and got shot." Whichever choice I make, I deduce that the other choice was better. The exact details here will depend on how I believe the simulator chose to tell me this, but ceteris paribus it's an unsolvable problem.

Comment author: Gary_Drescher 05 February 2010 01:53:20AM *  3 points [-]

For now, let me just reply to your incidental concluding point, because that's brief.

I disagree that the red/green problem is unsolvable. I'd say the solution is that, with respect to the available information, both choices have equal (low) utility, so it's simply a toss-up. A correct decision algorithm will just flip a coin or whatever.

Having done so, will a correct decision algorithm try to revise its choice in light of its (tentative) new knowledge of what its choice is? Only if it has nothing more productive to do with its remaining time.

Comment author: whpearson 05 February 2010 01:04:38AM 2 points [-]

I'm in the same confused camp as Laura. This paragraph confuses me.

So E does indeed "depend on" C, in the particular sense you've specified. Thus, if I happen to have a strong enough preference that E output True, then TDT (as currently formulated) will tell me to two-box for the sake of that goal. But that's the wrong decision, of course. In reality, I have no choice about the specified digit of pi.

Why is it the wrong decision? If Omega can perfectly predict the TDT agent, and the TDT agent sees 1 million dollars, then the TDT agent must be in a world where the ith digit of pi is 0. It is an unlikely world, to be sure.

Comment author: Gary_Drescher 05 February 2010 01:19:07AM 2 points [-]

Actually, you're in a different camp than Laura: she agrees that it's incorrect to two-box regardless of any preference you have about the specified digit of pi. :)

The easiest way to see why two-boxing is wrong is to imagine a large number of trials, with a different chooser, and a different value of i, for each trial. Suppose each chooser strongly prefers that their trial's particular digit of pi be zero. The proportion of two-boxer simulations that end up with the digit equal to zero is no different than the proportion of one-boxer simulations that end up with the digit equal to zero (both are approximately .1). But the proportion of the one-boxer simulations that end up with an actual $1M is much higher (.9) than the proportion of two-boxer simulations that end up with an actual $1M (.1).
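The proportions can be worked out exactly rather than estimated. This sketch assumes that the relevant digit is equally likely to be any of 0 through 9 across trials, that E means "the digit is zero", that D means "the simulation showed one-boxing", and that the box holds $1M iff (D xor E). The function names are hypothetical.

```python
from fractions import Fraction

DIGITS = range(10)


def box_has_million(one_boxes_in_simulation, digit):
    d = one_boxes_in_simulation
    e = (digit == 0)
    return d != e  # (D xor E)


def proportions(one_boxes_in_simulation):
    # Returns (share of trials with digit zero, share of trials with an actual $1M).
    zero_share = Fraction(sum(digit == 0 for digit in DIGITS), len(DIGITS))
    million_share = Fraction(
        sum(box_has_million(one_boxes_in_simulation, digit) for digit in DIGITS),
        len(DIGITS),
    )
    return zero_share, million_share
```

Under these assumptions, proportions(True) is (1/10, 9/10) and proportions(False) is (1/10, 1/10): the choice leaves the share of zero digits unchanged and only moves the share of trials with $1M.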

Comment author: LauraABJ 05 February 2010 12:37:57AM 0 points [-]

Yes, but your two-boxing didn't cause i=0; rather, the million was there because i=0. I'm saying that if (D or E) = true and you get a million dollars, and you two-box, then you haven't caused E=0. E=0 before you two-boxed, or if it did not, then Omega was wrong and thought D = onebox, when in fact you are a two-boxer.

Comment author: Gary_Drescher 05 February 2010 12:48:07AM *  2 points [-]

Everything you just said is true.*

Everything you just said is also consistent with everything I said in my original post.

*Except for one typo: you wrote (D or E) instead of (D xor E).

Comment author: LauraABJ 05 February 2010 12:16:27AM *  1 point [-]

No, I still don't get why adding in the ith digit of pi clause changes Newcomb's problem at all. If Omega says you'll one-box and you two-box, then Omega was wrong, plain and simple. The ith digit of pi is an independent clause. I don't see how one's desire to make i=0 by two-boxing after already getting the million is any different than one wanting to make Omega wrong by two-boxing after getting the million. If you are the type of person who, after getting the million, thinks, "Gee, I want i=0! I'll two-box!", then Omega wouldn't have given you the million to begin with. After determining that he would not give you the million, he'd look at the ith digit of pi and either put the million in or not. Your two-boxing has nothing to do with i.

Comment author: Gary_Drescher 05 February 2010 12:26:19AM 1 point [-]

If D=false and E=true and there's $1M in the box and I two-box, then (in the particular Newcomb's variant described above) the predictor is not wrong. The predictor correctly computed that (D xor E) is true, and set up the box accordingly, as the rules of this particular variant prescribe.

Comment author: LauraABJ 04 February 2010 11:39:36PM 2 points [-]

I'm not clear at all what the problem is, but it seems to be semantic. It's disturbing that this post can get 17 upvotes with almost no (2?) comments actually referring to what you're saying, indicating that no one else here really gets the point either.

It seems you have an issue with the word 'dependent' and the definition that Eliezer provided. Under that definition, E (the ith digit of pi) would be dependent on C (our decision to one- or two-box) if we two-boxed and got a million dollars, because then we would know that E = 0, and we would not have known this if we had not two-boxed. So we can infer E from C, thus dependency. By Eliezer's definition, which seems to be a special information-theoretical definition, I see no problem with this conclusion. The problem only seems to arise if you then take the intuitive definition of the word 'dependent' as meaning 'contingent upon,' as in 'Breaking the egg is contingent upon my dropping it.' Your semantic complaint goes beyond Newcomb: by Eliezer's definition of 'dependent,' the pH of water (E) is dependent upon our litmus testing it, since the result of the litmus test (C) allows us to infer the water's actual pH. C lets us infer E, thus dependency.

Comment author: Gary_Drescher 04 February 2010 11:51:03PM *  5 points [-]

Sorry, the above post omits some background information. If E "depends on" C in the particular sense defined, then the TDT algorithm mandates that when you "surgically alter" the output of C in the factored causal graph, you must correspondingly surgically alter the output of E in the graph.

So it's not at all a matter of any intuitive connotation of "depends on". Rather, "depends on", in this context, is purely a technical term that designates a particular test that the TDT algorithm performs. And the algorithm's prescribed use of that test culminates in the algorithm making the wrong decision in the case described above (namely, it tells me to two-box when I should one-box).
