All of Gary_Drescher's Comments + Replies

For the simulation-output variant of ASP, let's say the agent's possible actions/outputs consist of all possible simulations Si (up to some specified length), concatenated with "one box" or "two boxes". To prove that any given action has utility greater than zero, the agent must prove that the associated simulation of the predictor is correct. Where does your algorithm have an opportunity to commit to one-boxing before completing the simulation, if it's not yet aware that any of its available actions has nonzero utility? (Or would that commitment require a

... (read more)
0AlexMennen
Simulation-output: It would require a modification to the algorithm. I don't find this particularly alarming, though, since the algorithm was intended as a minimally-complex solution that behaves correctly for good reasons, not as a final, fully-general version. To do this, the agent would have to first (or at least, at some point soon enough for the predictor to simulate) look for ways to partition its output into pieces and consider choosing each piece separately. There would have to be some heuristic for deciding what partitionings of the output to consider and how much computational power to devote to each of them, and then the partitioning that actually gets chosen would be whichever yields the highest expected utility. Come to think of it, this might be trickier than I was thinking because you would run into self-trust issues if you need to prove that you will output the correct simulation of the predictor. This could be fixed by delegating the task of fully simulating the predictor to an easier-to-model subroutine, though that would require further modification to the algorithm. Simulation-as-key: I don't have a good answer to that.

Suppose we amend ASP to require the agent to output a full simulation of the predictor before saying "one box" or "two boxes" (or else the agent gets no payoff at all). Would that defeat UDT variants that depend on stopping the agent before it overthinks the problem?

(Or instead of requiring the agent to output the simulation, we could use the entire simulation, in some canonical form, as a cryptographic key to unlock an encrypted description of the problem itself. Prior to decrypting the description, the agent doesn't even know what the rules are; the agent is told in advance only that decryption will reveal the rules.)
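For what it's worth, here is one toy way the simulation-as-key variant could be mechanized. The key-derivation scheme, the placeholder transcript, and the sample rules text are all illustrative assumptions, not part of the problem as posed:

```python
# Toy sketch of the simulation-as-key idea (illustrative only): the canonical
# simulation transcript is hashed into a key, and the problem's rules can be
# read only by an agent that has actually produced that transcript.
import hashlib

def keystream(key: bytes, n: int) -> bytes:
    """Expand a key into n bytes by iterated hashing (a toy KDF, not real crypto)."""
    out, block = b"", key
    while len(out) < n:
        block = hashlib.sha256(block).digest()
        out += block
    return out[:n]

def xor_bytes(a: bytes, b: bytes) -> bytes:
    return bytes(x ^ y for x, y in zip(a, b))

# The problem poser encrypts the rules under the canonical simulation.
canonical_simulation = b"<canonical transcript of the predictor simulation>"
key = hashlib.sha256(canonical_simulation).digest()
rules = b"Sample rules text: the large box contains $1M iff ..."
ciphertext = xor_bytes(rules, keystream(key, len(rules)))

# Only by reproducing the same canonical simulation can the agent recover the rules.
agents_key = hashlib.sha256(canonical_simulation).digest()
recovered = xor_bytes(ciphertext, keystream(agents_key, len(ciphertext)))
assert recovered == rules
```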

0Gary_Drescher
For the simulation-output variant of ASP, let's say the agent's possible actions/outputs consist of all possible simulations Si (up to some specified length), concatenated with "one box" or "two boxes". To prove that any given action has utility greater than zero, the agent must prove that the associated simulation of the predictor is correct. Where does your algorithm have an opportunity to commit to one-boxing before completing the simulation, if it's not yet aware that any of its available actions has nonzero utility? (Or would that commitment require a further modification to the algorithm?) For the simulation-as-key variant of ASP, what principle would instruct a (modified) UDT algorithm to redact some of the inferences it has already derived?
0AlexMennen
In the first problem, the agent could commit to one-boxing (through the mechanism I described in the link) and then finish simulating the predictor afterwards. Then the predictor would still be able to simulate the agent until it commits to one-boxing, and then prove that the agent will one-box no matter what it computes after that. The second version of the problem seems more likely to cause problems, but it might work for the agent to restrict itself to not using the information it pre-computed for the purposes of modeling the predictor (even though it has to use that information for understanding the problem). If the predictor is capable of verifying or assuming that the agent will correctly simulate it, it could skip the impossible step of fully simulating the agent fully simulating it, and just simulate the agent on the decrypted problem.

According to information his family graciously posted to his blog, the cause of death was occlusive coronary artery disease with cardiomegaly.

http://blog.sethroberts.net/

1A1987dM
Does that make it more likely or less likely that his death was related to his diet?

It occurs to me that my references above to "coherence" should be replaced by "coherence & P(T)=1 & reflective consistency". That is, there exists (if I understand correctly) a P that has all three properties, and that assigns the probabilities listed above. Therefore, those three properties would not suffice to characterize a suitable P for a UDT agent. (Not that anyone has claimed otherwise.)

Wow, this is great work--congratulations! If it pans out, it bridges a really fundamental gap.

I'm still digesting the idea, and perhaps I'm jumping the gun here, but I'm trying to envision a UDT (or TDT) agent using the sense of subjective probability you define. It seems to me that an agent can get into trouble even if its subjective probability meets the coherence criterion. If that's right, some additional criterion would have to be required. (Maybe that's what you already intend? Or maybe the following is just muddled.)

Let's try invoking a coherent P i... (read more)

2Gary_Drescher
It occurs to me that my references above to "coherence" should be replaced by "coherence & P(T)=1 & reflective consistency". That is, there exists (if I understand correctly) a P that has all three properties, and that assigns the probabilities listed above. Therefore, those three properties would not suffice to characterize a suitable P for a UDT agent. (Not that anyone has claimed otherwise.)
9Benya
I've also tried applying this theory to UDT, and have run into similar 5-and-10-ish problems (though I hadn't considered making the reward depend on a statement like G, that's a nice trick!). My tentative conclusion is that the reflection principle is too weak to have much teeth when considering a version of UDT based on conditional expected utility, because for all actions A that the agent doesn't take, we have P(Agent() = A) = 0; we might still have P("Agent() = A") > 0 (but smaller than epsilon), but the reflection axioms do not need to hold conditional on Agent() = A, i.e., for X a reflection axiom we can have P assign positive probability to e.g. P("X & Agent() = A") / P("Agent() = A") < 0.9. But it's difficult to ask for more. In order to evaluate the expected utility conditional on choosing A, we need to coherently imagine a world in which the agent would choose A, and if we also asked the probability distribution conditional on choosing A to satisfy the reflection axioms, then choosing A would not be optimal conditional on choosing A -- contradiction to the agent choosing A... (We could have P("Agent() = A") = 0, but not if you have the agent playing chicken, i.e., play A if P("Agent() = A") = 0; if we have such a chicken-playing agent, we can coherently imagine a world in which it would play A -- namely, a world in which P("Agent() = A") = 0 -- but this is a world that assigns probability zero to itself. To make this formal, replace "world" by "complete theory".)

I think applying this theory to UDT will need more insights. One thing to play with is a formalization of classical game theory:

  • Specify a decision problem by a function from (a finite set of) possible actions to utilities. This function is allowed to be written in the full formal language containing P(".").
  • Specify a universal agent which takes a decision problem D(.), evaluates the expected utility of every action -- not in the UDT way of conditioning on Agent(D) = A, but by simply evaluatin
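The comment above is cut off mid-sentence, but a minimal rendering of the setup it sketches (my own reading, with toy names, and dropping the formal language containing P(".")) might look like this:

```python
# Toy version of the "universal agent" idea: a decision problem is just a map
# from actions to utilities, and the agent returns the action that maximizes it.
# (The real proposal lets the utility function mention P("."), which this omits.)

def universal_agent(D: dict) -> str:
    """D maps each possible action to its utility; pick the best action."""
    return max(D, key=D.get)

toy_problem = {"cooperate": 2.0, "defect": 1.0}
print(universal_agent(toy_problem))  # cooperate
```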

If John's physician prescribed a burdensome treatment because of a test whose false-positive rate is 99.9999%, John needs a lawyer rather than a statistician. :)

0lukeprog
True, that! :)

In April 2010 Gary Drescher proposed the "Agent simulates predictor" problem, or ASP, that shows how agents with lots of computational power sometimes fare worse than agents with limited resources.

Just to give due credit: Wei Dai and others had already discussed Prisoner's Dilemma scenarios that exhibit a similar problem, which I then distilled into the ASP problem.

and for an illuminating reason - the algorithm is only run with one set of information

That's not essential, though (see the dual-simulation variant in Good and Real).

0Manfred
Well, yeah, so long as all the decisions have defined responses.

Just to clarify, I think your analysis here doesn't apply to the transparent-boxes version that I presented in Good and Real. There, the predictor's task is not necessarily to predict what the agent does for real, but rather to predict what the agent would do in the event that the agent sees $1M in the box. (That is, the predictor simulates what--according to physics--the agent's configuration would do, if presented with the $1M environment; or equivalently, what the agent's 'source code' returns if called with the $1M argument.)

If the agent would one-box ... (read more)
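A minimal executable sketch of the predictor just described; the example policy, the $1K side box, and the payoff bookkeeping are my own illustrative assumptions rather than quotations from the comment:

```python
# Sketch of the transparent-boxes predictor described above: it evaluates the
# agent's source code on the hypothetical "$1M is in the box" observation,
# whatever the box will actually contain.

def sample_agent(observation: str) -> str:
    """An example agent policy (what the agent's 'source code' returns)."""
    return "one box" if observation == "box contains $1M" else "two boxes"

def predictor_puts_million(agent) -> bool:
    predicted = agent("box contains $1M")   # simulate only the $1M branch
    return predicted == "one box"

box_full = predictor_puts_million(sample_agent)
observation = "box contains $1M" if box_full else "box is empty"
action = sample_agent(observation)
big_box = 1_000_000 if box_full else 0
payout = big_box if action == "one box" else big_box + 1_000
print(box_full, action, payout)   # True, "one box", 1000000
```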

0Manfred
Interesting. This would seem to return it to the class of decision-determined problems, and for an illuminating reason - the algorithm is only run with one set of information - just like how in Newcomb's problem the algorithm has only one set of information no matter the contents of the boxes. This way of thinking makes Vladimir's position more intuitive. To put words in his mouth, instead of being not decision determined, the "unfixed" version is merely two-decision determined, and then left undefined for half the bloody problem.

2) "Agent simulates predictor"

This basically says that the predictor is a rock, doesn't depend on agent's decision,

True, it doesn't "depend" on the agent's decision in the specific sense of "dependency" defined by currently-formulated UDT. The question (as with any proposed DT) is whether that's in fact the right sense of "dependency" (between action and utility) to use for making decisions. Maybe it is, but the fact that UDT itself says so is insufficient reason to agree.

[EDIT: fixed typo]

0Vladimir_Nesov
The arguments behind UDT's choice of dependence could prove strong enough to resolve this case as well. The fact that we are arguing about UDT's answer in no way disqualifies UDT's arguments. My current position on ASP is that reasoning used in motivating it exhibits "explicit dependence bias". I'll need to (and probably will) write another top-level post on this topic to improve on what I've already written here and on the decision theory list.

I assume (please correct me if I'm mistaken) that you're referring to the payout-value as the output of the world program. In that case, a P-style program and a P1-style program can certainly give different outputs for some hypothetical outputs of S (for the given inputs). However, both programs' payout-outputs will be the same for whatever turns out to be the actual output of S (for the given inputs).

P and P1 have the same causal structure. And they have the same output with regard to (whatever is) the actual output of S (for the given inputs). But P and... (read more)

0jimrandomh
This conversation is a bit confused. Looking back, P and P1 aren't the same at all; P1 corresponds to the case where Omega never asks you for any decision at all! If S must be equal to S1 and S1 is part of the world program, then S must be part of the world program, too, not chosen by the player. If choosing an S such that S!=S1 is allowed, then it corresponds to the case where Omega simulates someone else (not specified). The root of the confusion seems to be that Wei Dai wrote "def P(i): ...", when he should have written "def P(S): ...", since S is what the player gets to control. I'm not sure where making i a parameter to P came from, since the English description of the problem had i as part of the world-program, not a parameter to it.
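A hypothetical reconstruction of the signature point (not Wei Dai's actual code; the payoff rule and all names below are simplified assumptions): S, the player's algorithm, is the natural parameter of the world program, while i stays fixed inside it.

```python
# Illustrative only: "def P(S)" makes the player's algorithm the parameter
# being chosen, while i (whatever index the problem fixes) lives inside the
# world program rather than being passed in from outside.

I = 7   # the problem's fixed index; part of the world, not a knob the player turns

def Omega_Predict(S, i, hypothetical_obs):
    return S(i, hypothetical_obs)            # Omega runs the player's source code

def P(S):                                    # the player controls only S
    box_full = Omega_Predict(S, I, "box contains $1M") == "one box"
    obs = "box contains $1M" if box_full else "box is empty"
    action = S(I, obs)
    big_box = 1_000_000 if box_full else 0
    return big_box if action == "one box" else big_box + 1_000

def one_boxer(i, observation):
    return "one box"

print(P(one_boxer))   # 1000000
```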

My concern is that there may be several world-programs that correspond faithfully to a given problem description, but that correspond to different analyses, yielding different decision prescriptions, as illustrated by the P1 example above. (Upon further consideration, I should probably modify P1 to include "S()=S1()" as an additional input to S and to Omega_Predict, duly reflecting that aspect of the problem description.)

4jimrandomh
If there are multiple translations, then either the translations are all mathematically equivalent, in the sense that they agree on the output for every combination of inputs, or the problem is underspecified. (This seems like it ought to be the definition for the word underspecified. It's also worth noting that all game-theory problems are underspecified in this sense, since they contain an opponent you know little about.) Now, if two world programs were mathematically equivalent but a decision theory gave them different answers, then that would be a serious problem with the decision theory. And this does, in fact, happen with some decision theories; in particular, it happens to theories that work by trying to decompose the world program into parts, when those parts are related in a way that the decision theory doesn't know how to handle. If you treat the world-program as an opaque object, though, then all mathematically equivalent formulations of it should give the same answer.

That's very elegant! But the trick here, it seems to me, lies in the rules for setting up the world program in the first place.

First, the world-program's calling tree should match the structure of TDT's graph, or at least match the graph's (physically-)causal links. The physically-causal part of the structure tends to be uncontroversial, so (for present purposes) I'm ok with just stipulating the physical structure for a given problem.

But then there's the choice to use the same variable S in multiple places in the code. That corresponds to a choice (in TDT... (read more)

0Wei Dai
First, to clear up a possible confusion, the S in my P is not supposed to be a variable. It's a constant, more specifically a piece of code that implements UDT1 itself. (If I sometimes talk about it as if it's a variable, that's because I'm trying to informally describe what is going on inside the computation that UDT1 does.) For the more general question of how do we know the structure of the world program, the idea is that for an actual AI, we would program it to care about all possible world programs (or more generally, mathematical structures, see example 3 in my UDT1 post, but also Nesov's recent post for a critique). The implementation of UDT1 in the AI would then figure out which world programs it's in by looking at its inputs (which would contain all of the AI's memories and sensory data) and checking which world programs call it with those inputs. For these sample problems, the assumption is that somehow Omega has previously provided us with enough evidence for us to trust its word on what the structure of the current problem is. So in the actual P, 'S(i, "box contains $1M")' is really something like 'S(memories, omegas_explanations_about_this_problem, i, "box contains $1M")' and these additional inputs allow S to conclude that it's being invoked inside this P, and not some other world program. (An additional subtlety here is that if we consider all possible world programs, there are bound to be some other world programs where S is being called with these exact same inputs, for example ones where S is being instantiated inside a Boltzmann brain, but presumably those worlds/regions have very low weights, meaning that the AI doesn't care much about them.) Let me know if that answers your questions/concerns. I didn't answer you point by point because I'm not sure which questions/concerns remain after you see my general answers. Feel free to repeat anything you still want me to answer.
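A rough sketch of the picture Wei Dai describes, heavily simplified and with all names invented here; in particular it optimizes over explicit input-to-output policies and uses a single toy world program, rather than inferring which worlds call S with which inputs:

```python
# Simplified UDT1-style choice: score each complete input->output policy by the
# weighted payouts of the world programs the agent cares about, then act on the best.
from itertools import product

INPUTS = ("box contains $1M", "box is empty")
OUTPUTS = ("one box", "two boxes")

def toy_world(policy: dict) -> int:
    """A toy transparent-Newcomb world program that consults the policy."""
    box_full = policy["box contains $1M"] == "one box"
    obs = "box contains $1M" if box_full else "box is empty"
    action = policy[obs]
    big_box = 1_000_000 if box_full else 0
    return big_box if action == "one box" else big_box + 1_000

worlds = [(toy_world, 1.0)]   # (world program, weight) pairs

best_policy = max(
    (dict(zip(INPUTS, outs)) for outs in product(OUTPUTS, repeat=len(INPUTS))),
    key=lambda pol: sum(w * P(pol) for P, w in worlds),
)
print(best_policy)   # one-boxes when it sees $1M
```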

Ok. I think it would be very helpful to sketch, all in one place, what TDT2 (i.e., the envisioned avenue-2 version of TDT) looks like, taking care to pin down any needed sense of "dependency". And similarly for TDT1, the avenue-1 version. (These suggestions may be premature, I realize.)

The link between the Platonic decision C and the physical decision D

No, D was the Platonic simulator. That's why the nature of the C->D dependency is crucial here.

4Eliezer Yudkowsky
Okay, then we have a logical link from C-platonic to D-platonic, and causal links descending from C-platonic to C-physical, E-platonic to E-physical, and D-platonic to D-physical to F-physical = D-physical xor E-physical. The idea being that when we counterfactualize on C-platonic, we update D-platonic and its descendents, but not E-platonic or its descendents. I suppose that as written, this requires a rule, "for purposes of computing counterfactuals, keep in the causal graph rather than the logical knowledge base, any mathematical knowledge gained by observing a fact descended from your decision-output or any logical implications of your decision-output". I could hope that this is a special case of something more elegant, but it would only be hope.

No, but whenever we see a physical fact F that depends on a decision C/D we're still in the process of making plus Something Else (E),

Wait, F depends on decision computation C in what sense of “depends on”? It can't quite be the originally defined sense (quoted from your email near the top of the OP), since that defines dependency between Platonic computations, not between a Platonic computation and a physical fact. Do you mean that D depends on C in the original sense, and F in turn depends on D (and on E) in a different sense?

then we express our un

... (read more)
1Eliezer Yudkowsky
In my view, the chief form of "dependence" that needs to be discriminated is inferential dependence and causal dependence. If earthquakes cause burglar alarms to go off, then we can infer an earthquake from a burglar alarm or infer a burglar alarm from an earthquake. Logical reasoning doesn't have the kind of directionality that causation does - or at least, classical logical reasoning does not - there's no preferred form between ~A->B, ~B->A, and A \/ B. The link between the Platonic decision C and the physical decision D might be different from the link between the physical decision D and the physical observation F, but I don't know of anything in the current theory that calls for treating them differently. They're just directional causal links. On the other hand, if C mathematically implies a decision C-2 somewhere else, that's a logical implication that ought to symmetrically run backward to ~C-2 -> ~C, except of course that we're presumably controlling/evaluating C rather than C-2. Thinking out loud here, the view is that your mathematical uncertainty ought to be in one place, and your physical uncertainty should be built on top of your mathematical uncertainty. The mathematical uncertainty is a logical graph with symmetric inferences, the physical uncertainty is a directed acyclic graph. To form controlling counterfactuals, you update the mathematical uncertainty, including any logical inferences that take place in mathland, and watch it propagate downward into the physical uncertainty. When you've already observed facts that physically depend on mathematical decisions you control but you haven't yet made and hence whose values you don't know, then those observations stay in the causal, directed, acyclic world; when the counterfactual gets evaluated, they get updated in the Pearl, directional way, not the logical, symmetrical inferential way.

If we go down avenue (1), then we give primacy to our intuition that if-counterfactually you make a different decision, this logically controls the mathematical fact (D xor E) with E held constant, but does not logically control E with (D xor E) held constant. While this does sound intuitive in a sense, it isn't quite nailed down - after all, D is ultimately just as constant as E and (D xor E), and to change any of them makes the model equally inconsistent.

I agree this sounds intuitive. As I mentioned earlier, though, nailing this down is tantamount to... (read more)

1Eliezer Yudkowsky
I definitely want one big graph if I can get it. Sorry, yes, C. No, but whenever we see a physical fact F that depends on a decision C/D we're still in the process of making plus Something Else (E), then we express our uncertainty in the form of a causal graph with directed arrows from C to D, D to F, and E to F. Thus when we compute a counterfactual on C, we find that F changes, but E does not.
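A toy rendering of that graph; the boolean values and the trivial C-to-D link are invented for illustration:

```python
# The graph described above: C -> D, D -> F, E -> F, with F = D xor E.
# Counterfactual surgery on C recomputes its descendants D and F, but the
# observed value of E is held fixed.

def propagate(C: bool, E: bool):
    D = C          # simplest possible C -> D link (D just copies the decision)
    F = D != E     # F = D xor E
    return D, F

E_observed = True                      # the "Something Else", kept at its observed value
for C_surgery in (False, True):        # surgically set the decision node
    D, F = propagate(C_surgery, E_observed)
    print(f"C={C_surgery}: D={D}, F={F}, E={E_observed} (unchanged)")
```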

I already saw the $1M, so, by two-boxing, aren't I just choosing to be one of those who see their E module output True?

Not if a counterfactual consequence of two-boxing is that the large box (probably) would be empty (even though in fact it is not empty, as you can already see).

That's the same question that comes up in the original transparent-boxes problem, of course. We probably shouldn't try to recap that whole debate in the middle of this thread. :)

0Tyrrell_McAllister
Don't worry; I don't want to do that :). If I recall the original transparent-boxes problem correctly, I agree with you on what to do in that case. Just to check my memory, in the original problem, there are two transparent boxes, A and B. You see that A contains $1M and B contains $1000. You know that B necessarily contains $1000, but A would have contained $1M iff it were the case that you will decide to take only A. Otherwise, A would have been empty. The conclusion (with which I agree) is that you should take only A. Is that right? (If I'm misremembering something crucial, is there a link to the full description online?) [ETA: I see that you added a description to your post. My recollection above seems to be consistent with your description.] In the original problem, if we use the "many choosers" heuristic, there are no choosers who two-box and yet who get the $1M. Therefore, you cannot "choose to be" one of them. This is why two-boxing should have no appeal to you. In contrast, in your new problem, there are two-boxers who get the $1M and who get their E module to output True. So you can "choose to be" one of them, no? And since they're the biggest winners, that's what you should do, isn't it?

2) Treat differently mathematical knowledge that we learn by genuinely mathematical reasoning and by physical observation. In this case we know (D xor E) not by mathematical reasoning, but by physically observing a box whose state we believe to be correlated with D xor E. This may justify constructing a causal DAG with a node descending from D and E, so a counterfactual setting of D won't affect the setting of E.

Perhaps I'm misunderstanding you here, but D and E are Platonic computations. What does it mean to construct a causal DAG among Platonic comput... (read more)

Logical uncertainty has always been more difficult to deal with than physical uncertainty; the problem with logical uncertainty is that if you analyze it enough, it goes away. I've never seen any really good treatment of logical uncertainty.

But if we depart from TDT for a moment, then it does seem clear that we need to have causelike nodes corresponding to logical uncertainty in a DAG which describes our probability distribution. There is no other way you can completely observe the state of a calculator sent to Mars and a calculator sent to Venus, and ye... (read more)

1) Construct a full-blown DAG of math and Platonic facts, an account of which mathematical facts make other mathematical facts true, so that we can compute mathematical counterfactuals.

“Makes true” means logically implies? Why would that graph be acyclic? [EDIT: Wait, maybe I see what you mean. If you take a pdf of your beliefs about various mathematical facts, and run Pearl's algorithm, you should be able to construct an acyclic graph.]

Although I know of no worked-out theory that I find convincing, I believe that counterfactual inference (of the sort... (read more)

2Wei Dai
I've been reviewing some of this discussion, and noticed that Eliezer hasn't answered the question in your last paragraph. Here is his answer to one of my questions, which is similar to yours. But I'm afraid I still don't have a really good understanding of the answer. In other words, I'm still not really sure why we need all the extra machinery in TDT, when having a general math-counterfactual-solving module (what I called "mathematical intuition module") seems both necessary and sufficient. I wonder if you, or anyone else, understands this well enough to try to explain it. It might help me, and perhaps others, to understand Eliezer's approach to see it explained in a couple of different ways.
1Wei Dai
This is basically the approach I took in (what I now call) UDT1.

Have some Omega thought experiments been one shot, never to be repeated type deals or is my memory incorrect?

Yes, and that's the intent in this example as well. Still, it can be useful to look at the expected distribution of outcomes over a large enough number of trials that have the same structure, in order to infer the (counterfactual) probabilities that apply to a single trial.

The backward link isn't causal. It's a logical/Platonic-dependency link, which is indeed how TDT handles counterfactuals (i.e., how it handles the propagation of "surgical alterations" to the decision node C).

0JGWeissman
My understanding of the link in question, is that the logical value of the digit of pi causes Omega to take the physical action of putting the money in the box. See Eliezer's second approach:

(I refrained from doing this for the problem described in Gary's post, since it doesn't mention UDT at all, and therefore I'm assuming you want to find a TDT-only solution.)

Yes, I was focusing on a specific difficulty in TDT, but I certainly have no objection to bringing UDT into the thread too. (I myself haven't yet gotten around to giving UDT the attention I think it deserves.)

By "unsolvable" I mean that you're screwed over in final outcomes, not that TDT fails to have an output.

Oh ok. So it's unsolvable in the same sense that "Choose red or green. Then I'll shoot you." is unsolvable. Sometimes choice really is futile. :) [EDIT: Oops, I probably misunderstood what you're referring to by "screwed over".]

The interesting part of the problem is that, whatever you decide, you deduce facts about the background such that you know that what you are doing is the wrong thing.

Yes, assuming that you're t... (read more)

When:

D(M) = true, D(!M) = true, E = true

Omega fails.

No, but it seems that way because I neglected in my OP to supply some key details of the transparent-boxes scenario. See my new edit at the end of the OP.

0wedrifid
So, with those details, that resolves to "I get $0". This makes D(M) = !M the unambiguous 'correct' decision function.

In the setup in question, D goes into an infinite loop (since in the general case it must call a copy of C, but because the box is transparent, C takes as input the output of D).

No, because by stipulation here, D only simulates the hypothetical case in which the box contains $1M, which does not necessarily correspond to the output of D (see my earlier reply to JGWeissman:

http://lesswrong.com/lw/1qo/a_problem_with_timeless_decision_theory_tdt/1kpk).

I think this problem is based (at least in part) on an incoherence in the basic transparent box variant of Newcomb's problem.

If the subject of the problem will two-box if he sees the big box has the million dollars, but will one-box if he sees the big box is empty, then there is no action Omega could take to satisfy the conditions of the problem.

The rules of the transparent-boxes problem (as specified in Good and Real) are: the predictor conducts a simulation that tentatively presumes there will be $1M in the large box, and then puts $1M in the box (for... (read more)

0JGWeissman
I drew a causal graph of this scenario (with the clarification you just provided), and in order to see the problem with TDT you describe, I would have to follow a causation arrow backwards, like in Evidential Decision Theory, which I don't think is how TDT handles counterfactuals.
0JGWeissman
Ah, I was working from different assumptions. That at least takes care of the basic clear box variant. I will have to think about the digit of pi variation again with this specification.

For now, let me just reply to your incidental concluding point, because that's brief.

I disagree that the red/green problem is unsolvable. I'd say the solution is that, with respect to the available information, both choices have equal (low) utility, so it's simply a toss-up. A correct decision algorithm will just flip a coin or whatever.

Having done so, will a correct decision algorithm try to revise its choice in light of its (tentative) new knowledge of what its choice is? Only if it has nothing more productive to do with its remaining time.

3Psy-Kosh
Actually, one can do even better than that. As (I think) Eliezer implied, the key is Omega saying those words. (about the simulated you getting it wrong) Did the simulated version receive that message too? (if yes, and if we assume Omega is always truthful, this implies an infinite recursion of simulations... let us not go invoking infinite nested computations willy-nilly.) If there was only a single layer of simulation, then Omega either gave that statement as input to it or did not. If yes, Omega is untruthful, which throws pretty much all of the standard reasoning about Omega out the window and we can simply take into account the possibility that Omega is blatantly lying. If Omega is truthful, even to the simulations, then the simulation would not have received that prefix message. In which case you are in a different state than simulated you was. So all you have to do is make the decision opposite to what you would have done if you hadn't heard that particular extra message. This may be guessed by simply one iteration of "I automatically want to guess color1... but wait, simulated me got it wrong, so I'll guess color2 instead" since "actual" you has the knowledge that the previous version of you got it wrong. If Omega lies to simulations and tells truth to "actuals" (and can somehow simulate without the simulation being conscious, so there's no ambiguity about which you are, yet still be accurate... (am skeptical but confused on that point)), then we have an issue. But then it would require Omega to take a risk: if when telling the lie to the simulation, the simulation then gets it right, then what does Omega tell "actual" you? ("actual" in quotes because I honestly don't know whether or not one could be modeled with sufficient accuracy, however indirectly, without the model being conscious. I'm actually kind of skeptical of the prospect of a perfectly accurate model not being conscious, although a model that can determine some properties/approximations of
2Eliezer Yudkowsky
By "unsolvable" I mean that you're screwed over in final outcomes, not that TDT fails to have an output. The interesting part of the problem is that, whatever you decide, you deduce facts about the background such that you know that what you are doing is the wrong thing. However, if you do anything differently, you would have to make a different deduction about the background facts, and again know that what you were doing was the wrong thing. Since we don't believe that our decision is capable of affecting the background facts, the background facts ought to be a fixed constant, and we should be able to alter our decision without affecting the background facts... however, as soon as we do so, our inference about the unalterable background facts changes. It's not 100% clear how to square this with TDT.

Actually, you're in a different camp than Laura: she agrees that it's incorrect to two-box regardless of any preference you have about the specified digit of pi. :)

The easiest way to see why two-boxing is wrong is to imagine a large number of trials, with a different chooser, and a different value of i, for each trial. Suppose each chooser strongly prefers that their trial's particular digit of pi be zero. The proportion of two-boxer simulations that end up with the digit equal to zero is no different than the proportion of one-boxer simulations that end u... (read more)
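To make the many-trials argument concrete, here is a small Monte Carlo sketch. The 10% frequency for the digit being zero, the payoff amounts, and the assumption that each chooser follows the same policy whatever they observe are simplifications added here; the box rule ($1M present iff D xor E) is taken from the description later in this thread:

```python
# Monte Carlo sketch of the "large number of trials" argument: your policy has
# no effect on how often the digit is zero, but it has a large effect on payoff.
import random

def run_trials(one_boxer: bool, n: int = 100_000, seed: int = 0):
    rng = random.Random(seed)
    zeros = 0
    total = 0
    for _ in range(n):
        E = rng.random() < 0.1          # this trial's digit of pi happens to be 0
        D = one_boxer                   # prediction of the chooser's ($1M-case) behavior
        box_full = D != E               # $1M is present iff D xor E
        two_boxes = not one_boxer       # same policy regardless of what is observed
        payout = (1_000_000 if box_full else 0) + (1_000 if two_boxes else 0)
        zeros += E
        total += payout
    return zeros / n, total / n

print(run_trials(one_boxer=True))   # digit-is-zero frequency ~0.10, average ~ $900K
print(run_trials(one_boxer=False))  # digit-is-zero frequency ~0.10, average ~ $101K
```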

0Tyrrell_McAllister
But the proportion of two-boxers that saw $1M in the box that end up
  • with their digit being 0 and
  • with the $1M
is even higher (1). I already saw the $1M, so, by two-boxing, aren't I just choosing to be one of those who see their E module output True?
0whpearson
Have some Omega thought experiments been one shot, never to be repeated type deals or is my memory incorrect? Yes I wasn't thinking through what would happen when the ith digit wasn't 0. You can't switch to one boxing in that case because you don't know when that would be, or rather when you see an empty box you are forced to do the same as when you see a full box due to the way the game is set up.

Everything you just said is true.*

Everything you just said is also consistent with everything I said in my original post.

*Except for one typo: you wrote (D or E) instead of (D xor E).

2whpearson
I'm in the same confused camp as Laura. This paragraph confuses me. Why is it the wrong decision? If Omega can perfectly predict the TDT and TDT sees 1 million dollars, then the TDT must be in a world where the ith digit of pi is 0. It is an unlikely world, to be sure.

If D=false and E=true and there's $1M in the box and I two-box, then (in the particular Newcomb's variant described above) the predictor is not wrong. The predictor correctly computed that (D xor E) is true, and set up the box accordingly, as the rules of this particular variant prescribe.

0LauraABJ
Yes- but your two-boxing didn't cause i=0, rather the million was there because i=0. I'm saying that if (D or E) = true and you get a million dollars, and you two-box, then you haven't caused E=0. E=0 before you two boxed, or if it did not, then omega was wrong and thought D = onebox, when in fact you are a two-boxer.

Sorry, the above post omits some background information. If E "depends on" C in the particular sense defined, then the TDT algorithm mandates that when you "surgically alter" the output of C in the factored causal graph, you must correspondingly surgically alter the output of E in the graph.

So it's not at all a matter of any intuitive connotation of "depends on". Rather, "depends on", in this context, is purely a technical term that designates a particular test that the TDT algorithm performs. And the algorithm's prescribed use of that test culminates in the algorithm making the wrong decision in the case described above (namely, it tells me to two-box when I should one-box).

2LauraABJ
No, I still don't get why adding in the ith digit of pi clause changes Newcomb's problem at all. If omega says you'll one-box and you two-box then omega was wrong, plain and simple. The ith digit of pi is an independent clause. I don't see how one's desire to make i=0 by two-boxing after already getting the million is any different than one wanting to make omega wrong by two-boxing after getting the million. If you are the type of person who, after getting the million thinks, "Gee, I want i=0! I'll two-box!" Then omega wouldn't have given you the million to begin with. After determining that he would not give you the million, he'd look at the ith digit of pi and either put the million in or not. You two-boxing has nothing to do with i.
0thomblake
That fixed it

Hm, sorry, it's displaying for me in the same size as the rest of the site, so I'm not sure what you're seeing. I'll strip the formatting and see if that helps.

0Cyan
For me, the text within "You treat your choice... probability distributions over D" and "If that's what TDT... the specified digit is zero" shows up in 7.5 point font.
3Cyan
Can you fix the font size issue too?

[In TDT] If you desire to smoke cigarettes, this would be observed and screened off by conditioning on the fixed initial conditions of the computation - the fact that the utility function had a positive term for smoking cigarettes, would already tell you that you had the gene. (Eells's "tickle".) If you can't observe your own utility function then you are actually taking a step outside the timeless decision theory as formulated.

Consider a different scenario where people with and without the gene both desire to smoke, but the gene makes that ... (read more)

Thanks, Eliezer--that's a clear explanation of an elegant theory. So far, TDT (I haven't looked carefully at UDT) strikes me as more promising than any other decision theory I'm aware of (including my own efforts, past and pending). Congratulations are in order!

I agree, of course, that TDT doesn't make the A6/A7 mistake. That was just a simple illustration of the need, in counterfactual reasoning (broadly construed), to specify somehow what to hold fixed and what not to, and that different ways of doing so specify different senses of counterfactual inferen... (read more)

If you could spend a day with any living person

I think you'd find me anticlimactic. :) But I do appreciate the kind words.

I agree that "choose" connotes multiple alternatives, but they're counterfactual antecedents, and when construed as such, are not inconsistent with determinism.

I don't know about being ontologically basic, but (what I think of as) physical/causal laws have the important property that they compactly specify the entirety of space-time (together with a specification of the initial conditions).

Just as a matter of terminology, I prefer to say that we can choose (or that we have a choice about) the output, rather than that we control it. To me, control has too strong a connotation of cause.

It's tricky, of course, because the concepts of choice-about and causal-influence-over are so thoroughly conflated that most people will use the same word to refer to both without distinction. So my terminology suggestion is kind of like most materialists' choice to relinquish the word soul to refer to something extraphysical, retaining consciousness to refer to ... (read more)

1Eliezer Yudkowsky
Hm. To me, "choose" sounds like invoking the idea of multiple possibilities, while "control" sounds more determinism-compatible. Of course that is a mere matter of terminology. Though I'm not sure what you mean by "in the special case where a means-end link is causal" - my thesis was that if you are uncertain about the output of your decision computation, and you factor the universe the Pearlian way, then your logical decision will end up being, in the graph, the logical cause of box B containing a million dollars. You mean the special case where a means-end link is physical? But what is physics except math? Or are we assuming that the local causal relations in physics are more privileged as ontologically basic causes, whereas "logical causality" is just a convenient way of factoring uncertainty and a winning way to construe counterfactuals? (That last one may have some justice to it.)

To clarify: the agent in MCDT is a particular physical instantiation, rather than being timeless/Platonic (well, except insofar as physics itself is Platonic).

This is very cool, and I haven't digested it yet, but I wonder if it might be open to the criticism that you're effectively postulating the favored answer to Newcomb's Problem (and other such scenarios) by postulating that when you surgically alter one of the nodes, you correspondingly alter the nodes for the other instances of the computation. After all, the crux of the counterfactual-reasoning dilemma in Newcomb's Problem (and similarly in the Prisoner's Dilemma) is to justify the inference "If I choose both boxes, then (probably) so does the simul... (read more)

3Eliezer Yudkowsky
Replied at http://lesswrong.com/lw/164/timeless_decision_theory_and_metacircular/
1Gary_Drescher
To clarify: the agent in MCDT is a particular physical instantiation, rather than being timeless/Platonic (well, except insofar as physics itself is Platonic).

I didn't really get the purpose of the paper's analysis of "rationality talk". Ultimately, as I understood the paper, it was making a prescriptive argument about how people (as actually implemented) should behave in the scenarios presented (i.e, the "rational" way for them to behave).

Exactly. Unless "cultivating a disposition" amounts to a (subsequent-choice-circumventing) precommitment, you still need a reason, when you make that subsequent choice, to act in accordance with the cultivated disposition. And there's no good explanation for why that reason should care about whether or not you previously cultivated a disposition.

-6timtyler
0Eliezer Yudkowsky
(Though I think the paper was trying to use dispositions to define "rationality" more than to implement an agent that would consistently carry out those dispositions?)

I don't think DBDT gives the right answer if the predictor's snapshot of the local universe-state was taken before the agent was born (or before humans evolved, or whatever), because the "critical point", as Fisher defines it, occurs too late. But a one-box chooser can still expect a better outcome.

5Eliezer Yudkowsky
It looks to me like DBDT is working in the direction of TDT but isn't quite there yet. It looks similar to the sort of reasoning I was talking about earlier, where you try to define a problem class over payoff-determining properties of algorithms. But this isn't the same as a reflectively consistent decision theory, because you can only maximize on the problem class from outside the system - you presume an existing decision process or ability to maximize, and then maximize the dispositions using that existing decision theory. Why not insert yet another step? What if one were to talk about dispositions to choose particular disposition-choosing algorithms as being rational? In other words, maximizing "dispositions" from outside strikes me as close kin to "precommitment" - it doesn't so much guarantee reflective consistency of viewpoints, as pick one particular viewpoint to have control. As Drescher points out, if the base theory is a CDT, then there's still a possibility that DBDT will end up two-boxing if Omega takes a snapshot of the (classical) universe a billion years ago before DBDT places the "critical point". A base theory of TDT, of course, would one-box, but then you don't need the edifice of DBDT on top because the edifice doesn't add anything. So you could define "reflective consistency" in terms of "fixed point under precommitment or disposition-choosing steps". TDT is validated by the sort of reasoning that goes into DBDT, but the TDT algorithm itself is a plain-vanilla non-meta decision theory which chooses well on-the-fly without needing to step back and consider its dispositions, or precommit, etc. The Buck Stops Immediately. This is what I mean by "reflective consistency". (Though I should emphasize that so far this only works on the simple cases that constitute 95% of all published Newcomblike problems, and in complex cases like Wei Dai and I are talking about, I don't know any good fixed algorithm (let alone a single-step non-meta one).)
-6timtyler

Just to elaborate a bit, Nesov's scenario and mine share the following features:

  • In both cases, we argue that an agent should forfeit a smaller sum for the sake of a larger reward that would have been obtained (counterfactually contingently on that forfeiture) if a random event had turned out differently than in fact it did (and than the agent knows it did).

  • We both argue for using the original coin-flip probability distribution (i.e., not-updating, if I've understood that idea correctly) for purposes of this decision, and indeed in general, even in mund

... (read more)
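A toy expected-value check of the first bullet's policy, computed ex ante (before the random event); the fair coin, the $1K forfeit, and the $1M counterfactual reward are assumed numbers for illustration and may not match the original scenarios exactly:

```python
# Ex-ante expected value of the "forfeit when asked" policy versus refusing.
# Heads branch: the large reward is given only to agents whose policy is to
# forfeit in the tails branch; tails branch: the forfeit is actually paid.

P_HEADS = 0.5

def expected_value(forfeits_when_asked: bool) -> float:
    heads_payoff = 1_000_000 if forfeits_when_asked else 0
    tails_payoff = -1_000 if forfeits_when_asked else 0
    return P_HEADS * heads_payoff + (1 - P_HEADS) * tails_payoff

print(expected_value(True))    # 499500.0: the forfeiting policy wins before the flip
print(expected_value(False))   # 0.0
```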
3SilasBarta
And I think I speak for everyone when I say we're glad you've started posting here! Your book was suggested as required rationalist reading. It certainly opened my eyes, and I was planning to write a review and summary so people could more quickly understand its insights. (And not to be a suck-up, but I was actually at a group meeting the other day where the ice-breaker question was, "If you could spend a day with any living person, who would it be?" I said Gary Drescher. Sadly, no one had heard the name.) I won't be able to contribute much to these discussions for a while, unfortunately. I don't have a firm enough grasp of Pearlean causality and need to read up more on that and Newcomb-like problems (halfway through your book's handling of it).

My book discusses a similar scenario: the dual-simulation version of Newcomb's Problem (section 6.3), in the case where the large box is empty (no $1M) and (I argue) it's still rational to forfeit the $1K. Nesov's version nicely streamlines the scenario.

Just to elaborate a bit, Nesov's scenario and mine share the following features:

  • In both cases, we argue that an agent should forfeit a smaller sum for the sake of a larger reward that would have been obtained (counterfactually contingently on that forfeiture) if a random event had turned out differently than in fact it did (and than the agent knows it did).

  • We both argue for using the original coin-flip probability distribution (i.e., not-updating, if I've understood that idea correctly) for purposes of this decision, and indeed in general, even in mund

... (read more)