Evidential Decision Theory, Selection Bias, and Reference Classes

19 Post author: Qiaochu_Yuan 08 July 2013 05:16AM

See also: Does Evidential Decision Theory really fail Solomon's Problem?, What's Wrong with Evidential Decision Theory?

It seems to me that the examples usually given of decision problems where EDT makes the wrong decisions are really examples of performing Bayesian updates incorrectly. The basic problem seems to be that naive EDT ignores a selection bias when it assumes that an agent that has just performed an action should be treated as a random sample from the population of all agents who have performed that action. Said another way, naive EDT agents make some unjustified assumptions about what reference classes they should put themselves into when considering counterfactuals. A more sophisticated Bayesian agent should make neither of these mistakes, and correcting them should not in principle require moving beyond EDT but just becoming less naive in applying it. 

Elaboration

Recall that an EDT agent attempts to maximize conditional expected utility. The main criticism of EDT is that naively computing conditional probabilities leads to the conclusion that you should perform actions which are good news upon learning that they happened, as opposed to actions which cause good outcomes (what CDT attempts to do instead). For a concrete example of the difference, let's take the smoking lesion problem:

Smoking is strongly correlated with lung cancer, but in the world of the Smoker's Lesion this correlation is understood to be the result of a common cause: a genetic lesion that tends to cause both smoking and cancer. Once we fix the presence or absence of the lesion, there is no additional correlation between smoking and cancer.

Suppose you prefer smoking without cancer to not smoking without cancer, and prefer smoking with cancer to not smoking with cancer. Should you smoke?

In the smoking lesion problem, smoking is bad news, but it doesn't cause a bad outcome: learning that someone smokes, in the absence of further information, increases your posterior probability that they have the lesion and therefore cancer, but choosing to smoke cannot in fact alter whether you have the lesion / cancer or not. Naive EDT recommends not smoking, but naive CDT recommends smoking, and in this case it seems that naive CDT's recommendation is correct and naive EDT's recommendation is not. 
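To make the disagreement concrete, here is a minimal sketch of the two calculations. The base rate, conditional probabilities, and utilities are all made-up assumptions chosen only for illustration; nothing in the problem statement fixes them. The naive EDT value conditions on the action as if it were ordinary evidence about the lesion, while the CDT value leaves the probability of the lesion at its prior.

```python
# Toy smoking-lesion world. Every number here is an illustrative assumption.
P_LESION = 0.2
P_SMOKE_GIVEN_LESION = {True: 0.8, False: 0.2}   # lesion -> smoking
P_CANCER_GIVEN_LESION = {True: 0.9, False: 0.1}  # lesion -> cancer
U_SMOKE, U_CANCER = 1.0, -10.0                   # smoking is mildly good, cancer is very bad


def p_lesion_given_action(smoke):
    """Bayes' rule for P(lesion | action), treating the action as ordinary evidence."""
    def likelihood(lesion):
        p = P_SMOKE_GIVEN_LESION[lesion]
        return p if smoke else 1 - p

    num = P_LESION * likelihood(True)
    den = num + (1 - P_LESION) * likelihood(False)
    return num / den


def p_cancer(p_lesion):
    """P(cancer) given a probability of having the lesion; smoking has no direct effect."""
    return (p_lesion * P_CANCER_GIVEN_LESION[True]
            + (1 - p_lesion) * P_CANCER_GIVEN_LESION[False])


def value(smoke, p_lesion):
    return U_SMOKE * smoke + U_CANCER * p_cancer(p_lesion)


def naive_edt_value(smoke):
    # Conditions on the action: smoking raises the posterior on the lesion.
    return value(smoke, p_lesion_given_action(smoke))


def cdt_value(smoke):
    # Intervenes on the action: the lesion keeps its prior probability.
    return value(smoke, P_LESION)


def choose(value_fn):
    return "smoke" if value_fn(True) > value_fn(False) else "abstain"


edt_choice = choose(naive_edt_value)
cdt_choice = choose(cdt_value)
```

With these assumed numbers the naive EDT calculation abstains and the CDT calculation smokes, matching the recommendations described above.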

The naive EDT agent's reasoning process involves considering the following counterfactual: "if I observe myself smoking, that increases my posterior probability that I have the lesion and therefore cancer, and that would be bad. Therefore I will not smoke." But it seems to me that in this counterfactual, the naive EDT agent -- who smokes and then glumly concludes that there is an increased probability that they have cancer -- is performing a Bayesian update incorrectly, and that the incorrectness of this Bayesian update, rather than any fundamental problem with making decisions based on conditional probabilities, is what causes the naive EDT agent to perform poorly. 

Here are some other examples of this kind of Bayesian update, all of which seem obviously incorrect to me. They lead to silly decisions because they are silly updates. 

  • "If I observe myself throwing away expensive things, that increases my posterior probability that I am rich and can afford to throw away expensive things, and that would be good. Therefore I will throw away expensive things." (This example requires that you have some uncertainty about your finances -- perhaps you never check your bank statement and never ask your boss what your salary is.)
  • "If I observe myself not showering, that increases my posterior probability that I am clean and do not need to shower, and that would be good. Therefore I will not shower." (This example requires that you have some uncertainty about how clean you are -- perhaps you don't have a sense of smell or a mirror.)
  • "If I observe myself playing video games, that increases my posterior probability that I don't have any work to do, and that would be good. Therefore I will play video games." (This example requires that you have some uncertainty about how much work you have to do -- perhaps you write this information down and then forget it.) 

Selection Bias

Earlier I said that in the absence of further information, learning that someone smokes increases your posterior probability that they have the lesion and therefore cancer in the smoking lesion problem. But when a naive EDT agent is deciding what to do, they have further information: in the counterfactual where they're smoking, they know that they're smoking because they're in a counterfactual about what would happen if they smoked (or something like that). This information should screen off inferences about other possible causes of smoking, which is perhaps clearer in the bulleted examples above. If you consider what would happen if you threw away expensive things, you know that you're doing so because you're considering what would happen if you threw away expensive things and not because you're rich. 

Failure to take this information into account is a kind of selection bias: a naive EDT agent considering the counterfactual where they perform some action treats itself as a random sample from the population of similar agents who have performed such actions, but it is not in fact such a random sample! The sampling procedure, which consists of actually performing an action, is undoubtedly biased. 

Reference Classes

Another way to think about the above situation is that a naive EDT agent chooses inappropriate reference classes: when an agent performs an action, the appropriate reference class is not all other agents who have performed that action. It's unclear to me exactly what it is, but at the very least it's something like "other sufficiently similar agents who have performed that action under sufficiently similar circumstances." 

This is actually very easy to see in the smoker's lesion problem because of the following observation (which I think I found in Eliezer's old TDT writeup): suppose the world of the smoker's lesion is populated entirely with naive EDT agents who do not know whether or not they have the lesion. Then the above argument suggests that none of them will choose to smoke. But if that's the case, then where does the correlation between the lesion and smoking come from? Any agents who smoke are either not naive EDT agents or know whether they have the lesion. In either case, that makes them inappropriate members of the reference class any reasonable Bayesian agent should be using.

Furthermore, if the naive EDT agents collectively decide to become slightly less naive and restrict their reference class to each other, they now find that smoking no longer gives any information about whether they have the lesion or not! This is a kind of reflective inconsistency: the naive recommendation not to smoke in the smoker's lesion problem has the property that, if adopted by a population of naive EDT agents, it breaks the correlations upon which the recommendation is based. 
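The effect of restricting the reference class can be sketched numerically. The model below is an assumption made for illustration: a population split into "impulsive" agents whose smoking is driven by the lesion and "deliberators" who all run the same decision procedure and cannot see whether they have the lesion, so their probability of smoking is the same with or without it. The shares and probabilities are hypothetical, and the deliberators' smoking probability must be nonzero for the conditional to be defined.

```python
P_LESION = 0.2  # assumed base rate of the lesion, the same in every group


def p_lesion_given_smoke(groups):
    """P(lesion | smokes) over a mix of groups.

    groups: list of (population share, P(smoke | lesion), P(smoke | no lesion)).
    """
    smoke_and_lesion = sum(share * P_LESION * p_l for share, p_l, _ in groups)
    smoke_no_lesion = sum(share * (1 - P_LESION) * p_n for share, _, p_n in groups)
    return smoke_and_lesion / (smoke_and_lesion + smoke_no_lesion)


impulsive = (0.7, 0.8, 0.2)     # smoking is driven by the lesion
deliberators = (0.3, 0.5, 0.5)  # shared decision procedure, blind to the lesion

# Reference class: everyone who smokes.
whole_population = p_lesion_given_smoke([impulsive, deliberators])

# Reference class: only agents running the same decision procedure.
deliberators_only = p_lesion_given_smoke([(1.0, 0.5, 0.5)])
```

Over the whole population, smoking is evidence for the lesion; within the deliberators alone, the posterior equals the prior, so smoking carries no information about the lesion.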

The Tickle Defense

As it happens, there is a standard counterargument in the decision theory literature to the claim that EDT recommends not smoking in the smoking lesion problem. It is known as the "tickle defense," and runs as follows: in the smoking lesion problem, what an EDT agent should be updating on is not the action of smoking but an internal desire, or "tickle," prompting it to smoke, and once the presence or absence of such a tickle has been updated on, it screens off any information gained by updating on the act of smoking or not smoking. So EDT + Tickles smokes on the smoking lesion problem. (Note that this prescription also has the effect of breaking the correlation claimed in the setup of the smoking lesion problem among a population of EDT + Tickles agents who don't know whether they have the lesion or not. So maybe there's just something wrong with the smoking lesion problem.)
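The screening-off claim can be checked by brute force in a small model. The sketch below assumes a chain lesion -> tickle -> smoke with no direct arrow from the lesion to smoking, which is the structure the tickle defense relies on; the probabilities are invented for illustration.

```python
from itertools import product

# Assumed chain: lesion -> tickle -> smoke. All numbers are illustrative.
P_LESION = 0.2
P_TICKLE = {True: 0.9, False: 0.1}  # P(tickle | lesion)
P_SMOKE = {True: 0.8, False: 0.2}   # P(smoke | tickle)


def bern(p, outcome):
    """Probability of a binary outcome that is True with probability p."""
    return p if outcome else 1 - p


# Full joint distribution over (lesion, tickle, smoke).
joint = {
    (lesion, tickle, smoke):
        bern(P_LESION, lesion) * bern(P_TICKLE[lesion], tickle) * bern(P_SMOKE[tickle], smoke)
    for lesion, tickle, smoke in product((True, False), repeat=3)
}


def p_lesion(tickle=None, smoke=None):
    """P(lesion | whichever of tickle and smoke are specified)."""
    def match(t, s):
        return (tickle is None or t == tickle) and (smoke is None or s == smoke)

    num = sum(w for (l, t, s), w in joint.items() if l and match(t, s))
    den = sum(w for (l, t, s), w in joint.items() if match(t, s))
    return num / den
```

In this model `p_lesion(smoke=True)` is larger than `p_lesion(smoke=False)`, but `p_lesion(tickle=t, smoke=s)` equals `p_lesion(tickle=t)` for every `t` and `s`: once the tickle is known, the act of smoking adds nothing.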

The tickle defense is good in that it encourages ignoring less information than naive EDT, but it strikes me as a patch covering up part of a more general problem, namely the problem of how to choose appropriate reference classes when performing Bayesian updates (or something like that). So I don't find it a satisfactory rescuing of EDT. It doesn't help that there's a more sophisticated version known as the "meta-tickle defense" that recommends two-boxing on Newcomb's problem.

Sophisticated EDT?

What does a more sophisticated version of EDT, taking the above observations into account, look like? I don't know. I suspect that it looks like some version of TDT / UDT, where TDT corresponds to something like trying to update on "being the kind of agent who outputs this action in this situation" and UDT corresponds to something more mysterious that I haven't been able to find a good explanation of yet, but I haven't thought about this much. If someone else has, let me know.

Here are some vague thoughts. First, I think this comment by Stuart_Armstrong is right on the money:

I've found that, in practice, most versions of EDT are underspecified, and people use their intuitions to fill the gaps in one direction or the other.

A "true" EDT agent needs to update on all the evidence they've ever observed, and it's very unclear to me how to do this in practice. So it seems that it's difficult to claim with much certainty that EDT will or will not do a particular thing in a particular situation.

CDT-via-causal-networks and TDT-via-causal-networks seem like reasonable candidates for more sophisticated versions of EDT in that they formalize the intuition above about screening off possible causes of a particular action. TDT seems like it better captures this intuition in that it better attempts to update on the cause of an action in a hypothetical about that action (the cause being that TDT outputs that action). My intuition here is that it should be possible to see causal networks as arising naturally out of Bayesian considerations, although I haven't thought about this much either. 

AIXI might be another candidate. Unfortunately, AIXI can't handle the smoking lesion problem because it models itself as separate from the environment, whereas a key point in the smoking lesion problem is that an agent in the world of the smoking lesion has some uncertainty about its innards, regarded as part of its environment. Fully specifying sophisticated EDT might involve finding a version of AIXI that models itself as part of its environment. 

Comments (127)

Comment author: ciphergoth 08 July 2013 09:29:12AM *  5 points [-]

I keep hoping my "toxoplasmosis problem" alternative to the Smoking Lesion will take off!

The toxoplasmosis problem is a scenario that demonstrates a failing of EDT and a success of CDT. Toxoplasma gondii is a single-celled parasite carried by a significant fraction of humanity. It affects mammals in general and is primarily hosted by cats. Infection can have a wide range of negative effects (though most show no symptoms). It has also been observed that infected rats will be less afraid of cats, and even attracted to cat urine. Correlations have been shown between psychiatric disorders and toxoplasmosis, and it has been speculated (but not tested) that the disease may cause people to be more risk taking, and attracted to cats. Neurological mechanisms have been proposed (Flegr 2007).

http://intelligence.org/2013/04/19/altairs-timeless-decision-theory-paper-published/

Comment author: arundelo 08 July 2013 01:08:24PM 3 points [-]

Other alternatives to the Smoking Lesion Problem:

Eliezer has one with chewing gum and throat abscesses (PDF). "I have avoided [the Smoking Lesion] variant because in real life, smoking does cause lung cancer."

(According to that same document this class of problem is known as Solomon's Problem.)

orthonormal proposes the Aspirin Paradox.

Comment author: RichardKennaway 08 July 2013 10:36:11AM 2 points [-]

The toxoplasmosis version has the drawback that in the real world there is presumably also a causal link from adoring cats to getting infected, which has to be disregarded for The Toxoplasmosis Problem, just as the real causal effect of smoking on cancer must be disregarded in The Smoking Lesion.

Comment author: Qiaochu_Yuan 08 July 2013 10:15:42AM 1 point [-]

I like the toxoplasmosis problem but I wanted to stick to a more established example for the sake of familiarity.

Comment author: Wei_Dai 09 July 2013 01:08:59AM 4 points [-]

I suspect that it looks like some version of TDT / UDT, where TDT corresponds to something like trying to update on "being the kind of agent who outputs this action in this situation" and UDT corresponds to something more mysterious that I haven't been able to find a good explanation of yet, but I haven't thought about this much.

I can try to explain UDT a bit more if you say what you find mysterious about it. Or if you just want to think about it some more, keep in mind that UDT was designed to solve a bunch of problems at the same time, so if you see some feature of it that seems unmotivated, it might be trying to solve a problem that you haven't focused on yet.

Another thing to keep in mind is that UDT is currently formulated mainly for AI rather than human use (whereas you seem to be thinking mostly in human terms). For example it assumes that the agent has full "bit-level" access to its own source code, memories and sensory data, which allows UDT to conceptualize a decision (the thing you're deriving consequences from, or conditioning upon) as a logical fact about the input/output map implemented by a certain piece of code. It avoids human concepts like "being the kind of", "agent", or "situation", which might be hard to fully specify and unambiguously translate to code. The downside is that it's hard for humans (who do not have full introspective access to their own minds and do think in terms of high level concepts) to apply UDT.

Comment author: Qiaochu_Yuan 09 July 2013 05:10:26AM 3 points [-]

Even more than an explanation, I would appreciate an explanation on the LessWrong Wiki because there currently isn't one! I've just reread through the LW posts I could find about UDT and I guess I should let them stew for a while. I might also ask people at the current MIRI workshop for their thoughts in person.

Another thing to keep in mind is that UDT is currently formulated mainly for AI rather than human use (whereas you seem to be thinking mostly in human terms).

Only as an intuition pump; when it's time to get down to brass tacks I'm much happier to talk about a well-specified program than a poorly-specified human.

Comment author: Tyrrell_McAllister 09 July 2013 07:34:58PM *  2 points [-]

I wrote a brief mathematical write-up of "bare bones" UDT1 and UDT1.1. The write-up describes the version that Wei Dai gave in his original posts. The write-up doesn't get into more advanced versions that invoke proof-length limits, try to "play chicken with the universe", or otherwise develop how the "mathematical intuition module" is supposed to work.

Without trying to make too much of the analogy, I think that I would describe TDT as "non-naive" CDT, and UDT as "non-naive" EDT.

Comment author: Qiaochu_Yuan 09 July 2013 09:17:42PM 1 point [-]

In this writeup it really seems like all of the content is in how the mathematical intuition module works.

Comment author: Tyrrell_McAllister 09 July 2013 11:34:25PM 0 points [-]

This is not much of an exaggeration. Still, UDT basically solves many toy problems where we get to declare what the output of the MIM is ("Omega tells you that ...").

Comment author: Wei_Dai 10 July 2013 09:31:13PM 0 points [-]

Even more than an explanation, I would appreciate an explanation on the LessWrong Wiki because there currently isn't one!

What kind of explanation are you looking for, though? The best explanation of UDT I can currently give, without some sort of additional information about where you find it confusing or how it should be improved, is in my first post about it, Towards a New Decision Theory.

Only as an intuition pump; when it's time to get down to brass tacks I'm much happier to talk about a well-specified program than a poorly-specified human.

Ah, ok. Some people (such as Ilya Shpitser) do seem to be thinking mostly in terms of human application, so it seems a good idea to make the distinction explicit.

Comment author: moemurray 21 July 2013 01:55:52AM 0 points [-]

UDT was designed to solve a bunch of problems

Are there any problems that (U|T)DT are designed to solve which are not one-shot problems? I apologize if this sounds like a stupid question, but I'm having some difficulty understanding all of the purported problems. Those I understand are one-shot problems like the Prisoner's Dilemma and the Newcomb Problem. Is there anything like the Iterated Prisoner's Dilemma for which (E|C)DT is inadequate, but (U|T)DT solves?

Comment author: David_Gerard 08 July 2013 02:38:40PM 8 points [-]

Suggestion: credit and link the cartoon.

Comment author: ciphergoth 08 July 2013 06:16:14PM 2 points [-]
Comment author: Qiaochu_Yuan 08 July 2013 08:06:24PM 1 point [-]

I can link to it, but I'm not sure where the credit would go. Alt text?

Comment author: David_Gerard 08 July 2013 09:42:05PM 0 points [-]

Footnote?

Comment author: IlyaShpitser 09 July 2013 04:11:49AM *  5 points [-]

Look, HIV patients who get HAART die more often (because people who get HAART are already very sick). We don't get to see the health status confounder because we don't get to observe everything we want. Given this, is HAART in fact killing people, or not?

EDT does the wrong thing here. Any attempt to not handle the confounder properly does the wrong thing here. If something does handle the confounder properly, it's not EDT anymore (because it's not going to look at E[death|HAART]). If you are willing to call such a thing "EDT", then EDT can mean whatever you want it to mean.


Here's the specific example to work out using whatever version of EDT you want:

People get HAART over time (let's restrict to 2 time slices for simplicity). The first time HAART is given (A0) it is randomized. The second time HAART is given (A1), it is given by a doctor according to some (known) policy based on vitals after A0 was given and some time passed (L0). Then we see if the patient dies or not (Y). The graph is this:

A0 -> L0 -> A1 -> Y, with A0 -> A1 and A0 -> Y. There is also health status confounding between L0 and Y (a common cause we don't get to see). Based on this data, how do we determine whether giving people HAART at A0 and A1 is a good idea?


It's true that you could formalize, say, fluid dynamics in set theory if you wanted. Does this then mean fluid dynamics is set theoretic? One needs to pick the right level of abstraction.


I think discussions of AIXI, source-code aware agents, etc. in the context of decision theories are a bit sterile because they are very far from actual problems people want to solve (e.g. is this actual non-hypothetical drug killing actual non-hypothetical people?)

Comment author: twanvl 09 July 2013 07:31:58PM 2 points [-]

EDT does the wrong thing here. Any attempt to not handle the confounder properly does the wrong thing here. If something does handle the confounder properly, it's not EDT anymore (because it's not going to look at E[death|HAART])

According to the Wikipedia page, EDT uses conditional probabilities. I.e.

V(HAART) = P(death|HAART)U(death) + P(!death|HAART)U(!death).

The problem is not with this EDT formula in general, but with how these probabilities are defined and estimated. In reality, they are based on a sample, and we are making a decision for a particular patient, i.e.

V(HAART-patient1) = P(death-patient1|HAART-patient1)U(death-patient1) + P(!death-patient1|HAART-patient1)U(!death-patient1).

We don't know any of these probabilities exactly, since you will not find out whether the patient dies until after you give or withhold the treatment. So instead, you estimate the probabilities based on other patients. A completely brain-dead model would use the reference class of all people, and conclude that HAART kills. But a more sophisticated model would include something like P(patient1 is similar to patient2) to define a better reference class, and it would also take into account confounders.

Comment author: IlyaShpitser 09 July 2013 07:39:28PM 0 points [-]

But a more sophisticated model would include something like P(patient1 is similar to patient2) to define a better reference class, and it would also take into account confounders.

Ok -- the data is as I describe above. You don't get any more data. What is your EDT solution to this example?

Comment author: twanvl 09 July 2013 10:24:12PM 2 points [-]

You didn't give any data, just a problem description. Am I to assume that a bunch of {A0, L0, A1, Y} records are available? And you say that the policy for giving A1 is known; is the information that this decision is based on (health status) also available?

In any case, you end up with the problem of estimating a causal structure from observational data, which is a challenging problem. But I don't see what this has to do with EDT vs another DT. Wouldn't this other decision theory face exactly the same problem?

Comment author: IlyaShpitser 09 July 2013 11:49:09PM *  1 point [-]

Am I to assume that a bunch of {A0, L0, A1, Y} records are available? And you say that the policy for giving A1 is known; is the information that this decision is based on (health status) also available?

You have (let's say infinitely many to avoid dealing with stats issues) records for { A0, L0, A1, Y }. You know they come from the causal graph I specified (complete with an unobserved confounder for health status on which no records exist). You don't need to learn the graph, you just need to tell me whether HAART is killing people or not and why, using EDT.

Comment author: twanvl 10 July 2013 09:56:05AM 1 point [-]

There is no single 'right answer' in this case. The answer will depend on your prior for the confounder.

As others have noted, the question "is HAART killing people?" has nothing to do with EDT, or any other decision theory for that matter. The question that decision theories answer is "should I give HAART to person X?"

Comment author: IlyaShpitser 10 July 2013 03:40:33PM *  1 point [-]

There is no single 'right answer' in this case. The answer will depend on your prior for the confounder.

As others have noted, the question "is HAART killing people?" has nothing to do with EDT ...

I think I disagree with both of these assertions. First, there is the "right answer," and it has nothing to do with priors or Bayesian reasoning. In fact there is no model uncertainty in the problem -- I gave you "the truth" (the precise structure of the model and enough data to parameterize it precisely so you don't have to pick or average among a set of alternatives). All you have to do is answer a question related to a single parameter of the model I gave you. The only question is which parameter of the model I am asking you about. Second, it's easy enough to rephrase my question to be a decision theory question (I do so here:

http://lesswrong.com/lw/hwq/evidential_decision_theory_selection_bias_and/9cdk).

Comment author: twanvl 10 July 2013 04:41:12PM *  0 points [-]

To quote your other comment:

Ok -- a patient comes in (from the same reference class as the patients in your data). This patient has HIV. Do you put him on HAART or not?

You put the patient on HAART if and only if V(HAART) > V(!HAART), where

V(HAART) = P(death|HAART)U(death) + P(!death|HAART)U(!death).
V(!HAART) = P(death|!HAART)U(death) + P(!death|!HAART)U(!death).

In these formulas HAART means "(decide to) put this patient on HAART" and death means "this patient dies".

For concreteness, we can assume that the utility of death is low, say 0, while the utility of !death is positive. Then the decision reduces to

P(!death|HAART) > P(!death|!HAART)

So if you give me P(!death|HAART) and P(!death|!HAART) then I can give you a decision.

Comment author: IlyaShpitser 10 July 2013 04:49:22PM *  3 points [-]

Ok. This is wrong. The problem is P(death|HAART) isn't telling you whether HAART is bad or not (due to unobserved confounding). I have already specified that there is confounding by health status (that is, HAART helps, but was only given to people who were very sick). What you need to compare is

sum over L0 of E[death | A0, L0, A1] P(L0 | A0)

for various values of A1, and A0.
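As a hedged illustration of the point, the sketch below builds a toy version of the graph described earlier in this thread (A0 randomized, L0 affected by A0 and by hidden health status, A1 chosen by a policy that looks at A0 and L0, death affected by A0, A1 and hidden health status). Every probability is an invented assumption, and `p_l0`, `p_a1`, and `p_death` are hypothetical names. It then compares the naive conditional P(death | A0, A1) with the standard g-formula adjustment, sum over L0 of P(death | A0, L0, A1) P(L0 | A0), using only the observed joint over {A0, L0, A1, Y}. The hidden variable is used solely to generate the data and to compute the ground truth for comparison.

```python
from itertools import product

# Toy structural model; every number is an illustrative assumption.
P_H = {0: 0.5, 1: 0.5}   # hidden health status (1 = frail); never recorded
P_A0 = {0: 0.5, 1: 0.5}  # first treatment is randomized


def bern(p, outcome):
    """Probability of a binary outcome that is 1 with probability p."""
    return p if outcome else 1 - p


def p_l0(a0, h):         # P(poor vitals | A0, H)
    return (0.8 if h else 0.2) - 0.1 * a0


def p_a1(a0, l0):        # doctor's policy: P(treat again | A0, L0)
    return (0.9 if l0 else 0.1) - 0.05 * a0


def p_death(a0, a1, h):  # treatment helps at both times; frailty hurts
    return 0.25 + 0.5 * h - 0.1 * a0 - 0.1 * a1


# Observed joint over (A0, L0, A1, Y), with the hidden H summed out.
observed = {}
for h, a0, l0, a1, y in product((0, 1), repeat=5):
    weight = (P_H[h] * P_A0[a0] * bern(p_l0(a0, h), l0)
              * bern(p_a1(a0, l0), a1) * bern(p_death(a0, a1, h), y))
    observed[(a0, l0, a1, y)] = observed.get((a0, l0, a1, y), 0.0) + weight

FIELDS = ("a0", "l0", "a1", "y")


def prob(**fixed):
    """Probability under the observed joint that the named fields take the given values."""
    return sum(w for key, w in observed.items()
               if all(key[FIELDS.index(name)] == value for name, value in fixed.items()))


def naive(a0, a1):
    """P(death | A0, A1), ignoring how A1 was assigned."""
    return prob(a0=a0, a1=a1, y=1) / prob(a0=a0, a1=a1)


def g_formula(a0, a1):
    """Sum over L0 of P(death | A0, L0, A1) * P(L0 | A0), from observed data only."""
    return sum(
        prob(a0=a0, l0=l0, a1=a1, y=1) / prob(a0=a0, l0=l0, a1=a1)
        * prob(a0=a0, l0=l0) / prob(a0=a0)
        for l0 in (0, 1)
    )


def truth(a0, a1):
    """Death rate if everyone were assigned (a0, a1); uses the hidden variable."""
    return sum(P_H[h] * p_death(a0, a1, h) for h in (0, 1))
```

In this toy model `naive(1, 1)` is higher than `naive(1, 0)`, so the second dose looks harmful, while `g_formula(a0, a1)` matches `truth(a0, a1)` for every combination and shows the second dose lowering the death rate. The adjustment works here because the policy for A1 depends only on recorded quantities; that is a property of the assumed graph, not something the data alone guarantees.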

Comment author: twanvl 10 July 2013 06:22:11PM 0 points [-]

Note that I defined HAART as "put this patient on HAART", not the probability of death when giving HAART in general (maybe I should have used a different notation).

If I understand your model correctly then

A0 = is HAART given at time t=0 (boolean)
L0 = time to wait (seconds, positive)
A1 = is HAART given (again) at time t=L0 (boolean)

with the confounding variable H1, the health at time t=L0, which influences the choice of A1. You didn't specify how L0 was determined, is it fixed or does it also depend on the patient's health? Your formula above suggests that it depends only on the choice A0.

Now a new patient comes in, and you want to know whether you should pick A0=true/false and A1=true/false. Now for the new patient x, you want to estimate P(death[x] | A0[x],A1[x]). If it was just about A0[x], then it would be easy, since the assignment was randomized, so we know that A0 is independent of any confounders. But this is not true for A1, in fact, we have no good data with which to estimate A1[x], since we only have samples where A1 was chosen according to the health-status based policy.

Comment author: Qiaochu_Yuan 09 July 2013 04:40:14AM 1 point [-]

Look, HIV patients who get HAART die more often (because people who get HAART are already very sick). We don't get to see the health status confounder because we don't get to observe everything we want. Given this, is HAART in fact killing people, or not?

Well, of course I can't give the right answer if the right answer depends on information you've just specified I don't have.

If something does handle the confounder properly, it's not EDT anymore (because it's not going to look at E[death|HAART]).

Again, I think there is a nontrivial selection bias / reference class issue here that needs to be addressed. The appropriate reference class for deciding whether to give HAART to an HIV patient is not just the set of all HIV patients who've been given HAART precisely because of the possibility of confounders.

I think discussions of AIXI, source-code aware agents, etc. in the context of decision theories are a bit sterile because they are very far from actual problems people want to solve (e.g. is this actual non-hypothetical drug killing actual non-hypothetical people?)

In actual problems people want to solve, people have the option of acquiring more information and working from there. It's plausible that with enough information even relatively bad decision theories will still output a reasonable answer (my understanding is that this kind of phenomenon is common in machine learning, for example). But the general question of what to do given a fixed amount of information remains open and is still interesting.

Comment author: IlyaShpitser 09 July 2013 06:55:11AM *  7 points [-]

Well, of course I can't give the right answer if the right answer depends on information you've just specified I don't have.

I think there is "the right answer" here, and I think it does not rely on observing the confounder. If your decision theory does then (a) your decision theory isn't as smart as it could be, and (b) you are needlessly restricting yourself to certain types of decision theories.

The appropriate reference class for deciding whether to give HAART to an HIV patient is not just the set of all HIV patients who've been given HAART precisely because of the possibility of confounders.

People have been thinking about confounders for a long time (earliest reference known to me to a "randomized" trial is the book of Daniel, see also this: http://ije.oxfordjournals.org/content/33/2/247.long). There is a lot of nice clever math that gets around unobserved confounders developed in the last 100 years or so. Saying "well we just need to observe confounders" is sort of silly. That's like saying "well, if you want to solve this tricky computational problem forget about developing new algorithms and that whole computational complexity thing, and just buy more hardware."

In actual problems people want to solve, people have the option of acquiring more information and working from there.

I don't know what kind of actual problems you work on, but the reality of life in stats, medicine, etc. is you have your dataset and you've got to draw conclusions from it. The dataset is crappy -- there is probably selection bias, all sorts of missing data, censoring, things we would really have liked to know but which were never collected, etc. This is just a fact of life for folks in the trenches in the empirical sciences/data analysis. The right answer here is not denial, but new methodology.

Comment author: William_Quixote 12 July 2013 09:34:48PM *  2 points [-]

There is a lot of nice clever math that gets around unobserved confounders developed in the last 100 years or so.

For non experts in the thread, what's the name of this area and is there a particular introductory text you would recommend?

Comment author: IlyaShpitser 13 July 2013 06:38:19AM *  3 points [-]

Thanks for your interest! The name of the area is "causal inference." Keywords: "standardization" (in epidemiology), "confounder or covariate adjustment," "propensity score", "instrumental variables", "back-door criterion," "front-door criterion," "g-formula", "potential outcomes", "ignorability," "inverse probability weighting," "mediation analysis," "interference", etc.

Pearl's Causality book (http://www.amazon.com/Causality-Reasoning-Inference-Judea-Pearl/dp/052189560X/ref=pd_sim_sbs_b_1) is a good overview (but doesn't talk a lot about statistics/estimation). Early references are Sewall Wright's path analysis paper from 1921 (http://naldc.nal.usda.gov/download/IND43966364/PDF) and Neyman's paper on potential outcomes from 1923 (http://www.ics.uci.edu/~sternh/courses/265/neyman_statsci1990.pdf). People say either Sewall Wright or his dad invented instrumental variables also.

Comment author: William_Quixote 13 July 2013 04:10:35PM 2 points [-]

Thanks

Comment author: endoself 09 July 2013 12:09:52PM 4 points [-]

Look, HIV patients who get HAART die more often (because people who get HAART are already very sick). We don't get to see the health status confounder because we don't get to observe everything we want. Given this, is HAART in fact killing people, or not?

Well, of course I can't give the right answer if the right answer depends on information you've just specified I don't have.

You're sort of missing what Ilya is trying to say. You might have to look at the actual details of the example he is referring to in order for this to make sense. The general idea is that even though we can't observe certain variables, we still have enough evidence to justify the causal model where HAART leads to fewer people dying, so we can conclude that we should prescribe it.

I would object to Ilya's more general point though. Saying that EDT would use E(death|HAART) to determine whether to prescribe HAART is making the same sort of reference class error you discuss in the post. EDT agents use EDT, not the procedures used to assign A0 and A1 in the example, so we really need to calculate E(death|EDT agent prescribes HAART). I would expect this to produce essentially the same results as a Pearlian E(death | do(HAART)), and would probably regard it as a failure of EDT if it did not add up to the same thing, but I think that there is value in discovering how exactly this works out, if it does.

Comment author: IlyaShpitser 09 July 2013 03:19:18PM *  3 points [-]

A challenge (not in a bad sense, I hope): I would be interested in seeing an EDT derivation of the right answer in this example, if anyone wants to do it.

Comment author: [deleted] 11 July 2013 03:11:01PM 3 points [-]

Yeah, unfortunately everyone who responded to your question went all fuzzy in the brain and started philosophical evasive action.

Comment author: nshepperd 10 July 2013 01:02:05AM 0 points [-]

Um, since when were decision theories for answering epistemic questions? Are you trying to make some kind of point about how evidential decision theorists use incorrect math that ignores confounders?

Comment author: IlyaShpitser 10 July 2013 01:05:36AM *  3 points [-]

Um, since when were decision theories for answering epistemic questions?

???

How are you supposed to make good decisions?

Are you trying to make some kind of point about how evidential decision theorists use incorrect math that ignores confounders?

Well, I am trying to learn why people think EDT isn't terminally busted. I gave a simple example that usually breaks EDT as I understand it, and I hope someone will work out the right answer with EDT to show me where I am going wrong.

Comment author: nshepperd 10 July 2013 02:58:15AM -1 points [-]

How are you supposed to make good decisions?

Use decision theory. The point is that it's not decision theory that tells you your shoelaces are undone when you look at your feet. "Are my shoelaces undone?" is a purely epistemic question, that has nothing to do with making decisions. But upon finding out that your shoelaces are undone, a decision theory might decide to do X or Y, after discovering (by making a few queries to the epistemic-calculations module of your brain) that certain actions will result in the shoelaces being tied again, that that would be safer, etc etc.

You're complaining that EDT is somehow unable to solve the question of "is HAART bad" given some useless data set when that doesn't even sound like a question EDT should be trying to answer in the first place—but rather, a question you would try to answer with standard multivariate statistics.

Comment author: IlyaShpitser 10 July 2013 06:33:11AM *  1 point [-]

Ok -- a patient comes in (from the same reference class as the patients in your data). This patient has HIV. Do you put him on HAART or not? Your utility function is minimizing patient deaths. By the way, if you do the wrong thing, you go to jail for malpractice.

Comment author: nshepperd 10 July 2013 07:30:06AM 0 points [-]

How about we dispense with this and you tell us if you know how to extract information about the usefulness (or not) of HAART from a data set like this?

Comment author: IlyaShpitser 10 July 2013 03:37:31PM *  3 points [-]

Ok, first things first.

when that doesn't even sound like a question EDT should be trying to answer in the first place

Do you agree that "Do you put him on HAART or not? Your utility function is minimizing patient deaths." is in fact a kind of question EDT, or decision theories in general, should be trying to answer?

In fact, I already said elsewhere in this thread that I think there is the right answer to this question, and this right answer is to put the patient on HAART (whereas my understanding of EDT is that it will notice that E[death | HAART] > E[death | no HAART], and conclude that HAART is bad). The way you get the answer is no secret either, it's what is called 'the g-formula' or 'truncated factorization' in the literature. I have been trying to understand how my understanding of EDT is wrong. If people's attempt to fix this is to require that we observe all unobserved confounders for death, then to me this says EDT is not a very good decision theory (because other decision theories can get the right answer here without having to observe anything over what I specified). If people say that the right answer is to not give HAART then that's even worse (e.g. they will kill people and go to jail if they actually practice medicine like that).

Comment author: nshepperd 11 July 2013 12:11:29AM *  -1 points [-]

Do you agree that "Do you put him on HAART or not? Your utility function is minimizing patient deaths." is in fact a kind of question EDT, or decision theories in general, should be trying to answer?

Yes. However a decision theory in general contains no specific prescriptions for obtaining probabilities from data, such as "oh, use the parametric g-formula". In general, they have lists of probabilistic information that they require.

E[death | HAART] > E[death | no HAART]

Setting that aside, I assume you mean the above to mean "count the proportion of samples without HAART with death, and compare to proportion of samples with HAART with death". Ignoring the fact that I thought there were no samples without HAART at t=0, what if half of the samples referred to hamsters, rather than humans?

No-one would ever have proposed EDT as a serious decision theory if they intended one to blindly count records while ignoring all other relevant "confounding" information (such as species, or health status). In reality, the purpose of the program of "count the number of people who smoke who have the lesion" or "count how many people who have HAART die" is to obtain estimates of P(I have the lesion | I smoke) or P(this patient dies | I give this patient HAART). That is why we discard hamster samples, because there are good a priori reasons to think that the survival of hamsters and humans is not highly correlated, and "this patient" is a human.

Comment author: IlyaShpitser 11 July 2013 03:41:28AM *  2 points [-]

Ignoring the fact that I thought there were no samples without HAART at t=0, what if half of the samples referred to hamsters, rather than humans?

Well, there is in reality A0 and A1. I chose this example because in it both E[death | A0, A1] is wrong and \sum_{L0} E[death | A0,A1,L0] p(L0) (the usual covariate adjustment) is wrong, because L0 is a rather unusual type of confounder. This example was something naive causal inference used to get wrong for a long time.

More generally, you seem to be fighting the hypothetical. I gave a specific problem on only four variables, where everything is fully specified, there aren't hamsters, and which (I claim) breaks EDT. You aren't bringing up hamsters with Newcomb's problem, why bring them up here? This is just a standard longitudinal design: there is nothing exotic about it, no omnipotent Omegas or source-code reading AIs.

However a decision theory in general contains no specific prescriptions for obtaining probabilities from data.

I think you misunderstand decision theory. If you were right, there would be no difference between CDT and EDT. In fact, the entire point of decision theories is to give rules you would use to make decisions. EDT has a rule involving conditional probabilities of observed data (because EDT treats all observed data as evidence). CDT has a rule involving a causal connection between your action and the outcome. This rule implies, contrary to what you claimed, that a particular method must be used to get your answer from data (this method being given by the theory of identification of causal effects) on pain of getting garbage answers and going to jail.
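The three estimands named in this comment can be compared on a toy model. The sketch below is an illustration only: the graph (an unobserved health variable U affecting L0 and death, A0 assigned independently of U, A1 chosen from L0 alone) and every number in it are assumptions made for the example, not details taken from the thread. It enumerates the joint distribution exactly, hides U, and then computes the naive conditional, the usual covariate adjustment, and the g-formula \sum_{L0} E[death | A0,A1,L0] p(L0 | A0) from the observed variables only.

```python
from itertools import product

# All numbers are made up for illustration.
P_U1 = 0.5  # P(U = 1): unobserved poor health
P_A0 = 0.5  # P(A0 = 1): first treatment, assigned independently of U

def p_l0(l0, a0, u):
    """P(L0 = l0 | A0, U): sicker patients tend to show L0 = 1."""
    p1 = 0.2 + 0.6 * u - 0.1 * a0
    return p1 if l0 else 1 - p1

def p_a1(a1, l0):
    """P(A1 = a1 | L0): doctors mostly treat patients who look sick."""
    p1 = 0.9 if l0 else 0.1
    return p1 if a1 else 1 - p1

def p_death(a0, a1, u):
    """P(death | A0, A1, U): both treatments lower the risk."""
    return 0.2 + 0.5 * u - 0.05 * a0 - 0.05 * a1

# Observational joint over (a0, l0, a1, y) with U summed out.
NAMES = ("a0", "l0", "a1", "y")
joint = {}
for u, a0, l0, a1, y in product((0, 1), repeat=5):
    p = (P_U1 if u else 1 - P_U1) * (P_A0 if a0 else 1 - P_A0)
    p *= p_l0(l0, a0, u) * p_a1(a1, l0)
    pd = p_death(a0, a1, u)
    p *= pd if y else 1 - pd
    joint[(a0, l0, a1, y)] = joint.get((a0, l0, a1, y), 0.0) + p

def prob(**fixed):
    """Marginal probability of the given observed-variable assignment."""
    return sum(p for key, p in joint.items()
               if all(key[NAMES.index(n)] == v for n, v in fixed.items()))

def cond_death(**given):
    return prob(y=1, **given) / prob(**given)

def naive(a0, a1):
    return cond_death(a0=a0, a1=a1)

def covariate_adjusted(a0, a1):
    return sum(cond_death(a0=a0, a1=a1, l0=l) * prob(l0=l) for l in (0, 1))

def g_formula(a0, a1):
    return sum(cond_death(a0=a0, a1=a1, l0=l) * prob(l0=l, a0=a0) / prob(a0=a0)
               for l in (0, 1))

def truth(a0, a1):
    """Interventional risk, computed with access to the hidden U."""
    return sum((P_U1 if u else 1 - P_U1) * p_death(a0, a1, u) for u in (0, 1))

for a0, a1 in product((0, 1), repeat=2):
    print(a0, a1, naive(a0, a1), covariate_adjusted(a0, a1),
          g_formula(a0, a1), truth(a0, a1))
```

On these made-up numbers the naive comparison makes the second treatment look harmful even though it lowers the interventional risk, and only the g-formula column agrees with the column computed from the hidden variable. Whether an EDT agent is committed to the naive column is exactly what the thread is disputing; the sketch takes no side on that.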

Comment author: nshepperd 11 July 2013 05:47:54AM *  1 point [-]

You aren't bringing up hamsters with Newcomb's problem, why bring them up here?

I said why I was bringing them up. To make the point that blindly counting the number of events in a dataset satisfying (action = X, outcome = Y) is blatantly ridiculous, and this applies whether or not hamsters are involved. If you think EDT does that then either you are mistaken, or everyone studying EDT is a lot less sane than they look.

I think you misunderstand decision theory. If you were right, there would be no difference between CDT and EDT.

The difference is that CDT asks for P(utility | do(action), observations) and EDT asks for P(utility | action, observations). Neither CDT nor EDT specifies detailed rules for how to calculate these probabilities or update on observations, or what priors to use. Indeed, those rules are normally found in statistics textbooks, Pearl's Causality or, in the case of the g-formula, random math papers.
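To make the two quantities concrete, here is a minimal sketch on the Smoker's Lesion from the post. The numbers are hypothetical, and the "conditional" side is computed by the population-frequency reading, which is only one of the readings of P(utility | action, observations) being argued over in this thread.

```python
# Hypothetical numbers for the Smoker's Lesion; none come from the thread.
P_LESION = 0.2
P_SMOKE_GIVEN_LESION = {True: 0.8, False: 0.2}
P_CANCER_GIVEN_LESION = {True: 0.9, False: 0.1}

def p_lesion_given_smoke(smoke):
    """Bayes' rule: P(lesion | smoke) in the observed population."""
    def likelihood(lesion):
        p = P_SMOKE_GIVEN_LESION[lesion]
        return p if smoke else 1 - p
    numerator = P_LESION * likelihood(True)
    return numerator / (numerator + (1 - P_LESION) * likelihood(False))

def p_cancer_given_smoke(smoke):
    """P(cancer | smoke): conditioning shifts belief about the lesion."""
    pl = p_lesion_given_smoke(smoke)
    return (pl * P_CANCER_GIVEN_LESION[True]
            + (1 - pl) * P_CANCER_GIVEN_LESION[False])

def p_cancer_do_smoke(smoke):
    """P(cancer | do(smoke)): intervening leaves the lesion prior untouched,
    so the result does not depend on the argument."""
    return (P_LESION * P_CANCER_GIVEN_LESION[True]
            + (1 - P_LESION) * P_CANCER_GIVEN_LESION[False])

for smoke in (True, False):
    print(smoke, p_cancer_given_smoke(smoke), p_cancer_do_smoke(smoke))
```

The conditional column differs between smokers and non-smokers, while the do() column is the same for both, which is the gap between "bad news" and "bad outcome" described in the post.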

Comment author: endoself 08 July 2013 08:57:20AM 2 points [-]

UDT corresponds to something more mysterious

Don't update at all, but instead optimize yourself, viewed as a function from observations to actions, over all possible worlds.

There are tons of details, but it doesn't seem impossible to summarize in a sentence.

Comment author: Manfred 08 July 2013 01:03:17PM 0 points [-]

Or even simpler: find the optimal strategy, then do that.

Comment author: RichardKennaway 08 July 2013 06:46:50AM 2 points [-]

My intuition here is that it should be possible to see causal networks as arising naturally out of Bayesian considerations

You disagree, then, with Pearl's dictum that causality is a primitive concept, not reducible to any statistical construction?

The Smoker's Lesion problem is completely dissolved by using the causal information about the lesion. Without that information it cannot be. The correlations among Smoking, Lesion, and Cancer, on their own, allow of the alternative causal possibilities that Smoking causes Lesion, which causes Cancer, or that Cancer causes Lesion, which causes Smoking (even in the presence of the usual causal assumptions of DAG, Markov, and Faithfulness). These three causal graphs cannot be distinguished by the observational statistics. The causal information given in the problem is an essential part of its statement, and no decision theory which ignores causation can solve it.

EDT recommends the action "which, conditional on your having chosen it, gives you the best expectations for the outcome." That formulation glosses over whether that conditional expectation is based on the statistical correlations observed in the population (i.e. ignoring causation), or the correlations resulting from considering the actions as interventions in a causal network. It is generally understood as the former; attempts to fix it consist of changing it to use the latter. The only differences among these various attempts is how willing their proposers are to simply say "do causal reasoning".

When you talk about selection bias, you talk about counterfactuals (do-actions, in Pearl's notation, a causal concept). The Tickle defence introduces a causal hypothesis (the tickle prompting, i.e. causing smoking). I don't follow the reference class part, but it doesn't seem to cover the situation of an EDT reasoner advising someone else who professes an inclination to smoke. That is just as much a problem for EDT as the original version. It is also a problem that AIXI can be set to solving. What might its answer be?

Comment author: Qiaochu_Yuan 08 July 2013 10:28:29AM *  1 point [-]

You disagree, then, with Pearl's dictum that causality is a primitive concept, not reducible to any statistical construction?

No. For example, AIXI is what I would regard as essentially a Bayesian agent, but it has a notion of causality because it has a notion of the environment taking its actions as an input. What I mean is more like wondering if AIXI would invent causal networks.

It is generally understood as the former; attempts to fix it consist of changing it to use the latter.

I think this is too narrow a way to describe the mistake that naive EDT is making. First, I hope you agree that even naive EDT wouldn't use statistical correlations in a population of agents completely unrelated to it (for example, agents who make their decisions randomly). But naive EDT may be in the position of existing in a world where it is the only naive EDT agent, although there may be many agents which are similar but not completely identical to it. How should it update in this situation? It might try to pick a population of agents sufficiently similar to itself, but then it's unclear how the fact that they're similar but not identical should be taken into account.

AIXI, by contrast, would do something more sophisticated. Namely, its observations about the environment, including other agents similar to itself, would all update its model of the environment.

I don't follow the reference class part, but it doesn't seem to cover the situation of an EDT reasoner advising someone else who professes an inclination to smoke.

It seems like some variant of the tickle defense covers this. Once the other agent professes their inclination to smoke, that screens off any further information obtained by the other agent smoking or not smoking.

It is also a problem that AIXI can be set to solving. What might its answer be?

I guess AIXI could do something like start with a prior over possible models of how various actions, including smoking, could affect the other agent, update, then use the posterior distribution over models to predict the effect of interventions like smoking. But this requires a lot more data than is usually given in the smoking lesion problem.

Comment author: endoself 08 July 2013 10:54:46AM *  4 points [-]

No. For example, AIXI is what I would regard as essentially a Bayesian agent, but it has a notion of causality because it has a notion of the environment taking its actions as an input.

This looks like a symptom of AIXI's inability to self-model. Of course causality is going to look fundamental when you think you can magically intervene from outside the system.

Do you share the intuition I mention in my other comment? I feel that the way this post reframes CDT and TDT as attempts to clarify bad self-modelling by naive EDT is very similar to the way I would reframe Pearl's positions as an attempt to clarify bad self-modelling by naive probability theory a la AIXI.

Comment author: Qiaochu_Yuan 08 July 2013 08:14:55PM 0 points [-]

So your intuition is that causality isn't fundamental but should fall out of correct self-modeling? I guess that's also my intuition, and I also don't know how to make that precise.

Comment author: endoself 08 July 2013 10:37:09AM *  0 points [-]

These three causal graphs cannot be distinguished by the observational statistics. The causal information given in the problem is an essential part of its statement, and no decision theory which ignores causation can solve it.

I think this isn't actually compatible with the thought experiment. Our hypothetical agent knows that it is an agent. I can't yet formalize what I mean by this, but I think that it requires probability distributions corresponding to a certain causal structure, which would allow us to distinguish it from the other graphs. I don't know how to write down a probability distribution that contains myself as I write it, but it seems that such a thing would encode the interventional information about the system that I am interacting with on a purely probabilistic level. If this is correct, you wouldn't need a separate representation of causality to decide correctly.

Comment author: RichardKennaway 12 July 2013 04:39:30PM 0 points [-]

Our hypothetical agent knows that it is an agent. I can't yet formalize what I mean by this, but I think that it requires probability distributions corresponding to a certain causal structure, which would allow us to distinguish it from the other graphs

How about: an agent, relative to a given situation described by a causal graph G, is an entity that can perform do-actions on G.

Comment author: endoself 14 July 2013 07:51:42AM *  0 points [-]

No, that's not what I meant at all. In what you said, the agent needs to be separate from the system in order to perform do-actions. I want an agent that knows it's an agent, so it has to have a self-model and, in particular, has to be inside the system that is modelled by our causal graph.

One of the guiding heuristics in FAI theory is that an agent should model itself the same way it models other things. Roughly, the agent isn't actually tagged as different from nonagent things in reality, so any desired behaviour that depends on correctly making this distinction cannot be regulated with evidence as to whether it is actually making the distinction the way we want it to. A common example of this is the distinction between self-modification and creating a successor AI; an FAI should not need to distinguish these, since they're functionally the same. These sorts of ideas are why I want the agent to be modelled within its own causal graph.

Comment author: Will_Sawin 08 July 2013 05:34:20AM 2 points [-]

How useful is it to clarify EDT until it becomes some decision theory with a different, previously determined name?

Comment author: Qiaochu_Yuan 08 July 2013 06:24:00AM 3 points [-]

It would be useful for my mental organization of how decision theory works. I don't know if it would be useful to anyone else though.

Comment author: RobbBB 09 July 2013 12:12:02AM *  2 points [-]

I don't much care what we call the thing, but exploring the logical relations between conventional EDT and other anti-CDT options could be extremely useful for persuading EDTists to adopt TDT, UDT, or some other novel theory. Framing matters even for academics.

Comment author: Protagoras 08 July 2013 07:12:09AM 1 point [-]

Lots of interesting points, but on your final paragraph, is a theory that models the agent as part of its environment necessarily possible? Since the model is part of the agent, it would have to include the model as part of the model. I suppose that isn't an outright contradiction, as there are of course mathematical structures with proper parts equivalent to the whole, but does it seem likely that plausible models human agents can construct could be like that?

It seems to me that there are logical constraints on self-knowledge, related to the well-known paradoxes associated with self-reference. I further think, though it would perhaps go too far afield for a mere comment to specify in detail, that quite a number of philosophical discussions of free will have touched on this issue (though usually none too clearly). Perhaps causal decision theory is partly motivated by people thinking that no version of evidential decision theory will be able to escape gaps in the evidence generated by the limitations on self-knowledge (I could believe this was part of the motivation of David Lewis). Note that this problem doesn't require there to be anything spooky about human choices (if the problem is a restriction on self-knowledge, humans could still be perfectly determined, and any particular human could be perfectly predicted by someone other than themselves).

Comment author: nshepperd 08 July 2013 12:10:22PM 2 points [-]

Pretty sure humans normally model themselves as part of the environment. Seems a bit excessive to conjecture the impossibility of something humans do every day (even if "approximately") without particularly strong evidence. (Note that quines exist and people are able to understand that brains are made of neurons.)

Comment author: Protagoras 08 July 2013 01:17:09PM -1 points [-]

"Approximately" would be important. A lot of the discussions of decision theory seem to be trying to come up with something logically perfect, some theory which in principle could always give the best answer (though obviously no human would ever implement any theory perfectly). It thus seems relevant whether in principle perfection is possible. If it isn't, then the evaluation of decision theories must somehow compare severity of flaws, rather than seeking flawlessness, and the discussions around here don't generally seem to go that way.

That being said, I'm not sure I agree here anyway. It seems that people's minds are sufficiently complicated and disunified that it is certainly possible for part of a person to model another part of the same person. I am not certain that self-modeling ever takes any other form; it is not obvious that it is ever possible for part of a person to successfully model that exact part.

Comment author: fractalman 08 July 2013 08:36:02AM 1 point [-]

I'm a bit tired at the moment, but my more or less cached reply is "use a coarse-grained simulation of yourself."

Comment author: Qiaochu_Yuan 08 July 2013 10:17:42AM 0 points [-]

Who knows? I think this is a really interesting question and hopefully some of the work going on in MIRI workshops will be relevant to answering it.

Comment author: lukeprog 22 July 2013 04:01:58AM 1 point [-]
Comment author: Qiaochu_Yuan 18 July 2013 12:46:43AM 1 point [-]

Approximately this point appears to have been made in the decision theory literature already, in Against causal decision theory by Huw Price.

Comment author: AlexMennen 08 July 2013 09:53:14PM *  0 points [-]

What does a more sophisticated version of EDT, taking the above observations into account, look like? I don't know. I suspect that it looks like some version of TDT / UDT

When I suggested this in the post of mine that you referenced, benelliott pointed out that it fails the transparent-box variant of Newcomb's problem, where you can see the contents of the boxes, and Omega makes his decision based on what he predicts you would do if you saw $1 million in box A. I don't see an obvious way to rescue EDT in that scenario.

Comment author: Qiaochu_Yuan 09 July 2013 12:14:41AM *  0 points [-]

Again, I think it's difficult to claim that EDT does a particular thing in a particular scenario. An EDT agent who has a prior over causal networks with logical nodes describing the environment (including itself) and who updates this prior by acquiring information may approximate a TDT agent as it collects more information about the environment and its posterior becomes concentrated at the "true" causal network.

Comment author: AlexMennen 09 July 2013 01:39:02AM 0 points [-]

I'm not sure what you mean. Can you give an example of a probability distribution over causal networks that could be believed by an EDT agent in the transparent Newcomb's problem, such that the agent would one-box? Or at least give a plausibility argument for the existence of such a probability distribution?

Comment author: Qiaochu_Yuan 09 July 2013 02:57:20AM 1 point [-]

Maybe it's better not to talk about causal networks. Let's use an AIXI-like setup instead. The EDT agent starts with a Solomonoff prior over all computable functions that Omega could be. Part of the setup of Newcomb's problem is that Omega convinces you that it's a very good predictor, so some series of trials takes place in which the EDT agent updates its prior over what Omega is. The posterior will be concentrated at computable functions that are very good predictors. The EDT agent then reasons that if it two-boxes then Omega will predict this and it won't get a good payoff, so it one-boxes.
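The last step of this argument can be sketched numerically for the standard (opaque-box) problem. This is a simplification, not the Solomonoff-style setup described above: the posterior over what Omega is gets collapsed into a single assumed predictor accuracy, and the $1,000 in the second box is the conventional figure rather than something stated in this thread.

```python
def edt_newcomb(accuracy, big=1_000_000, small=1_000):
    """Expected payoff of each action, conditioning on the action.

    accuracy: assumed P(Omega predicted correctly | the agent's action).
    big:      contents of box A when Omega predicts one-boxing.
    small:    contents of the always-full second box (assumed value).
    """
    return {
        "one-box": accuracy * big,
        "two-box": (1 - accuracy) * big + small,
    }

def edt_choice(accuracy):
    values = edt_newcomb(accuracy)
    return max(values, key=values.get)

for accuracy in (0.5, 0.9, 0.99):
    print(accuracy, edt_newcomb(accuracy), edt_choice(accuracy))
```

With these payoffs, conditioning on the action favours one-boxing once the assumed accuracy is somewhat above one half. The sketch says nothing about the transparent-box variant raised elsewhere in the thread.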

Comment author: AlexMennen 09 July 2013 06:18:42AM 1 point [-]

But in the transparent-box variant, the EDT agent knows exactly how much money is in box A before making its decision, so its beliefs about the contents of box A do not change when it updates on its counterfactual decision.

Comment author: Qiaochu_Yuan 09 July 2013 09:33:12AM 0 points [-]

Ah. I guess we're not allowing EDT to make precommitments?

Comment author: AlexMennen 09 July 2013 03:19:42PM 2 points [-]

We don't usually let decision theories make precommitments. That's why CDT fails Newcomb's problem. I think CDT and EDT both converge to something like TDT/UDT when allowed to precommit as far in advance as desirable.

Comment author: endoself 09 July 2013 09:39:37AM *  2 points [-]

If you want to change what you want, then you've decided that your first-order preferences were bad. EDT recognizing that it can replace itself with a better decision theory is not the same as it getting the answer right; the thing that makes the decision is not EDT anymore.

Comment author: Jiro 08 July 2013 04:00:15PM 0 points [-]

Having a physical condition affect whether one smokes, while also posing a problem which implies that you can choose whether to smoke, suggests a variation of the problem: there's a brain lesion which increases your lifespan, but makes you incapable of computing conditional probabilities (plus some other effect that is enough for there to be a genuine question of how you should act). How should you behave in this version?

Comment author: Manfred 08 July 2013 01:40:04PM 0 points [-]

A "true" EDT agent needs to update on all the evidence they've ever observed, and it's very unclear to me how to do this in practice.

The only way I know how to explore what this means is to use simple toy problems and be very careful about never ever using the concept "reference class." Oh, and writing down algorithms helps stop you from sneaking in extra information.

Example algorithm (basic EDT):

We want to pick one action out of a list of possible actions (provided to us as a1, a2...), which can lead to various outcomes that we have preferences over, quantified by a utility (provided to us as [o1, u1], [o2, u2]...). To connect actions to outcomes we are provided a matrix of conditional probabilities P(outcome|action) (P(o1|a1), P(o1|a2)... P(o2|a1)... ...). We then assign expected utility to actions by summing over utilities weighted by the conditional probability of their outcomes, and pick the action with the highest expected utility.

Note that this algorithm is identical to the analogous CDT algorithm. The difference is a cosmetic variable name change. Thus if we want to differentiate between them at all, we'll need a second algorithm to feed our first one the matrix P(o|a).

The second algorithm for basic EDT has the form:

Start with some information Z about yourself. For an agent described by Z, use a probability-calculating program to find P(o|aZ) for that agent. (Note that this second algorithm could have contained the concept "reference class," but doesn't, and thus will actually be useful)
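The two algorithms described above might be sketched as follows. The function names, the lookup table standing in for the "probability-calculating program", and all numbers are hypothetical; only the structure (a shared expected-utility maximizer fed by a separate source of P(o|a, Z)) follows the description.

```python
def expected_utilities(actions, utilities, p_outcome_given_action):
    """First algorithm: weight each outcome's utility by P(outcome | action).

    utilities:              {outcome: utility}
    p_outcome_given_action: {action: {outcome: probability}}
    """
    return {
        a: sum(p_outcome_given_action[a][o] * u for o, u in utilities.items())
        for a in actions
    }

def pick_action(actions, utilities, p_outcome_given_action):
    """Return the action with the highest expected utility."""
    eu = expected_utilities(actions, utilities, p_outcome_given_action)
    return max(actions, key=lambda a: eu[a])

# Second algorithm (toy stand-in): map self-information Z to the matrix
# P(o | a, Z). A real version would compute this; here it is a lookup.
TOY_TABLES = {
    "z_example": {
        "a1": {"o1": 0.3, "o2": 0.7},
        "a2": {"o1": 0.6, "o2": 0.4},
    },
}

def probability_program(z):
    return TOY_TABLES[z]

actions = ["a1", "a2"]
utilities = {"o1": 10.0, "o2": 0.0}
matrix = probability_program("z_example")
print(expected_utilities(actions, utilities, matrix))
print(pick_action(actions, utilities, matrix))
```

Swapping `probability_program` for one that returns interventional probabilities would turn the same maximizer into the analogous CDT algorithm, which is the point made above about the difference living entirely in the second algorithm.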

Comment author: Khoth 08 July 2013 07:01:56AM 0 points [-]

I think I prefer the "throwing away expensive things" formulation to the "smoking lesion" formulation.

In the smoking lesion, it's not clear whether the lesion causes smoking by modifying your preferences or modifying your decision algorithm. But if it's the latter, asking "what would decision theory X do?" is pointless since people with the lesion aren't using decision theory X. And if it's due to preferences, you already know you have the lesion when you get to the part of the problem that says you'd prefer to smoke.

So actually it's like your throwing-things-away problem, except that you can look at your bank balance, except obfuscated behind a layer of free-will-like confusion.

Comment author: itaibn0 09 July 2013 11:11:06PM 0 points [-]

I have made similar remarks in a comment here:

I would like to say that I agree with the arguments presented in this post, even though the OP eventually retracted them. I think the arguments for why EDT leads to the wrong decision are themselves wrong.

As mentioned by others, EY referred to this argument as the 'tickle defense' in section 9.1 of his TDT paper. I am not defending the advocates which EY attacked, since (assuming EY hasn't misrepresented them) they have made some mistakes of their own. In particular they argue for two-boxing.

I will start by talking about the ability to introspect. Imagine God promised Solomon that Solomon won't be overthrown. Then the decision of whether or not to sleep with other men's wives is easy, and Solomon can just act on his preferences. Yet if Solomon can't introspect then in the original situation he doesn't know whether he prefers sleeping with others' wives or not. So Solomon not being able to introspect means that there is information that he can rationally react to in some situations and not in others. While problems like that can occur in real people, I don't expect a theory of rational behavior to have to deal with them. So I assume an agent knows what its preferences are, or if not, fails to act on them consistently.

In fact, the meta-tickle defense doesn't really deal with lack of introspection either. It assumes an agent can think about an issue and 'decide' on it, only to not act on that decision but rather to use that 'decision' as information. An agent that really couldn't introspect wouldn't be able to do that.

The tickle defense has been used to defend two-boxing. While this argument isn't mentioned in the paper, it is described in one of the comments here. This argument has been rebutted by the original poster AlexMennen. I would like to add something to that: For an agent to find out for sure whether it is a one-boxer or a two-boxer, the agent must make a complete simulation of itself in Newcomb's problem. If they try to find this out as part of their strategy for Newcomb's problem, they will get into an infinite loop.

benelliott raised a final argument here. He postulated that charisma is not related to preference for screwing wives, but rather to whether a king's reasoning would lead them to actually do it. Here I have to question whether the hypothetical situation makes sense. For real people an intrinsic personality trait might change their bottom line conclusion, but this behavior is irrational. An ideal rational agent cannot have a trait of the form charisma is postulated to have. benelliott also left open the possibility that the populace have Omega-like abilities, but then the situation is really just another form of Newcomb's problem, and the rational choice is to not screw wives.

Overall I think that EDT actually does lead to rational behavior in these sorts of situations. In fact I think it is better than TDT, because TDT relies on computations with one right answer not only having probabilities and correlations, but also on there being causality between them. I am unconvinced of this and unsatisfied with the various attempts to deal with it.

Sadly, this was in a fairly obscure post and the arguments failed to percolate to the lesswrong community.

Comment author: shminux 08 July 2013 08:06:15PM -1 points [-]

Upvoted for the ad absurdum examples. They highlight the essential bit of information (common cause) being thrown out by the naive EDT. Just like the naive CDT throws out the essential bit of information (Omega is always right, therefore two-boxing is guaranteed to result in zero payout) in Newcomb.

As for the reference class, knowing the common cause with certainty means that either you have some metaphysical access to the inside of the smoking lesion problem setup, in which case EDT is the wrong tool to use, or that there have been enough experiments to assign high probability to this common cause, probably through randomized, placebo-controlled, double-blind studies, which would then form your reference class(es).

Comment author: Manfred 08 July 2013 09:57:56PM *  0 points [-]

Just like the naive CDT throws out the essential bit of information (Omega is always right, therefore two-boxing is guaranteed to result in zero payout) in Newcomb.

Hmm, I don't think that's quite the key point. For example, what about the absentminded driver problem?

My attempt would be "the process that decides where the money is is the same as the process that decides the choice you make - you have just one independent decision to choose them with."

(Cool trick - you get the correct answer to the absent-minded driver problem (mostly) - even post-updates - if you make the probability of being at the different intersections depend in the obvious way on your probability of continuing when maximizing expected utility)

Comment author: shminux 08 July 2013 10:40:15PM -1 points [-]

Can't say I follow... As for the absentminded driver, I thought reflective consistency takes care of it (you don't recalculate your probabilities on the fly in absence of any new information).

Comment author: Manfred 08 July 2013 10:49:45PM 0 points [-]

The absentminded driver learns something when they learn they are at an intersection. The bits of information they get from the environment enable them to distinguish between intersection and non-intersection situations, at least :P

Comment author: shminux 08 July 2013 11:21:00PM 0 points [-]

I don't believe the driver learns anything new at an intersection. She knows the map and the payout in advance, there is not a single bit of information at an intersection that requires any decision making not already done before the start. The absentmindedness part means that the calculation is repeated at each intersection, but it's the exact same calculation.
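The "calculation done before the start" can be sketched as below. The payoffs are not given anywhere in this thread; the values used (0 for exiting at the first intersection, 4 for exiting at the second, 1 for continuing past both) are the ones commonly used when this problem is stated, and should be read as an assumption. The driver cannot tell the intersections apart, so a plan is a single probability p of continuing at any intersection.

```python
def planning_value(p, exit_first=0.0, exit_second=4.0, pass_both=1.0):
    """Expected payoff of the plan 'continue with probability p everywhere'.

    The payoff defaults are assumed, not taken from the thread.
    """
    return ((1 - p) * exit_first          # exit at the first intersection
            + p * (1 - p) * exit_second   # continue once, then exit
            + p * p * pass_both)          # continue at both

def best_plan(steps=1000):
    """Grid search over p in [0, 1] for the plan with the highest value."""
    candidates = [i / steps for i in range(steps + 1)]
    return max(candidates, key=planning_value)

p_star = best_plan()
print(p_star, planning_value(p_star))
```

Under these assumed payoffs the value is 4p - 3p^2, so the grid search should land near p = 2/3. This covers only the planning-stage view; it does not implement the self-locating update discussed earlier in the exchange.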