https://www.youtube.com/watch?v=rGNINCggokM
Email me if you want slides. Also email me if you want to know how interventionists think about CDT (or if you want to know how I think we should attack "exotic" scenarios).
TLDR: CDT fails on Newcomb because it's not properly representing the situation. EDT also doesn't properly represent Newcomb type problems, and will fail on similar problems for this reason.
edit: You can play a drinking game where you take a drink whenever I say "the point is." :)
Ok, I still need to actually find a spare hour to sit down and watch that talk of yours, but the more I think about even your words here, the more I agree with you.
I think CDT might well be the correct decision theory. The correlation between Omega's prediction of us (as represented in TDT or CDT+E) and our actual choice is not a matter of decision-making, it's a matter of our beliefs about the world. EDT thus wins at Newcomb's Problem because it uses a full joint probability distribution, handling both correlation and causation, to represent its beliefs, whereas CDT is "losing" because it has no way to represent beliefs about correlation as separate from "pure" causation. Since I'm way behind on learning the math and haven't studied Judea Pearl's textbook yet, is there a form of causal graph that either natively includes or can be augmented with bidirectional correlation edges?
In real life, the correlations wouldn't even have to be "identity functions" (causing two correlated nodes in the graph to take on the exact same value), they could be any form of invertible function learned by any kind of regression analysis.
We could then apply a simple form of causal decision theory in which part of tracing the causal effects of our potential action is to transmit information about our decision across correlation arrows, up and down the causal graph.
Such a theory would then behave like TDT or CDT+E while being much more mathematically powerful in terms of the correlative beliefs it could discover and represent.
Since I'm way behind on learning the math and haven't studied Judea Pearl's textbook yet, is there a form of causal graph that either natively includes or can be augmented with bidirectional correlation edges?
Sure is, but you have to be careful. You can draw whatever type of edge you want, the trick is to carefully define what the particular type of edge means (or to be more precise you have to define what an absence of a particular type of edge means).
Generally Pearl et al. use a bidirected edge A <-> B to mean "there exists some hidden common cause(s) of A and B that I don't want to bother to draw," e.g. the real graph is A <- H -> B, where H is hidden. Or possibly there are multiple H nodes... Or, again more precisely, the absence of such an edge means there are no such hidden common causes. I use these sorts of graphs in my talk, my papers, my thesis, etc. They are called latent projections in Verma and Pearl 1990, and some people call this type of graph an ADMG (an acyclic directed mixed graph).
I am not entirely clear on what edge you want. Maybe you want an edge to denote a deterministic constraint between nodes. That is also possible: I think there is D-separation (capital D) in Dan Geiger's thesis that handles these. Most of this was worked out in the late 80s and early 90s.
Even in a simple 4-node graph you can have different types of correlation structure. For example:
A -> B <-> C <- D
denotes an independence model where
A is independent of D
A is independent of C given D
B is independent of D given A
This generally corresponds to a hidden common cause between B and C. (*)
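For anyone who wants to check these independences mechanically, here is a minimal sketch. It assumes the hidden-common-cause reading, i.e. it replaces B <-> C with an explicit hidden node H (so the DAG is A -> B <- H -> C <- D), and tests d-separation with the standard moralized-ancestral-graph criterion. The function names and the dict-of-parents encoding are my own choices for illustration, not anything from Pearl's or Lauritzen's software.

```python
from itertools import combinations

def ancestors(dag, nodes):
    """Return `nodes` plus all of their ancestors. `dag` maps child -> set of parents."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for parent in dag.get(stack.pop(), ()):
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen

def d_separated(dag, x, y, given=()):
    """True if x and y are d-separated by `given` in the DAG."""
    keep = ancestors(dag, {x, y, *given})
    # Moralize the ancestral subgraph: connect each node to its parents
    # and "marry" every pair of parents that share a child.
    undirected = {n: set() for n in keep}
    for child in keep:
        parents = list(dag.get(child, ()))
        for p in parents:
            undirected[child].add(p)
            undirected[p].add(child)
        for p, q in combinations(parents, 2):
            undirected[p].add(q)
            undirected[q].add(p)
    # Delete the conditioning set, then ask whether y is still reachable from x.
    blocked = set(given)
    seen, stack = {x}, [x]
    while stack:
        for nb in undirected[stack.pop()]:
            if nb not in seen and nb not in blocked:
                seen.add(nb)
                stack.append(nb)
    return y not in seen

# Assumed hidden-variable version of A -> B <-> C <- D.
dag = {"B": {"A", "H"}, "C": {"H", "D"}}

print(d_separated(dag, "A", "D"))          # A independent of D
print(d_separated(dag, "A", "C", ["D"]))   # A independent of C given D
print(d_separated(dag, "B", "D", ["A"]))   # B independent of D given A
print(d_separated(dag, "A", "C", ["B"]))   # conditioning on the collider B opens a path
```

The last query is the interesting one: conditioning on B connects A to C through H, which is why the independence model for the <-> graph is not the same as the one for the -- graph.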
We can also have:
A -> B -- C <- D
This corresponds to an independence model:
A is independent of D
A is independent of C given B and D
B is independent of D given A and C
This does not correspond to a hidden common cause of B and C, but to an equilibrium distribution of a feedback process between B and C under fixed values of A and D. These types of graphs are known as "chain graphs" and were developed by a fellow at Oxford named Steffen Lauritzen.
You may also have something like this:
A -> B -> S <- C <- D
where S is a common effect of B and C that attains some specific value but isn't recorded. This corresponds to an independence model
A is independent of C and D given B
D is independent of A and B given C
This case corresponds to outcome dependent sampling (e.g. when people do case-control studies for rare diseases where they select one arm of a trial among those who are already sick -- the sample isn't random). This independence model actually corresponds to an undirected graphical model (Markov random field), because of the way conditioning on a node affects the node's ancestors in the graph.
(*) But not always. We can set up a quantum mechanical experiment that mirrors the above graph, and then note that in any hidden variable DAG with an H instead of a <-> edge, there is an inequality constraint that must hold on p(A,B,C,D). In fact, this inequality is violated experimentally, which means there is no hidden variable H in quantum mechanics... or some other seemingly innocuous assumption is not right.
So sometimes we can draw <-> simply to denote a conditional independence model that resembles those you get from a DAG with unobserved variables .... except Nature is annoying and doesn't actually have any underlying DAG.
If you are confused by this, you are in good company! I am still thinking very hard about what this means.
edit: Mysterious comment just for fun: it is sufficient to have a graph with -> edges, <-> edges in the Pearl sense, and -- edges in the Lauritzen sense that are "closed" with respect to "interesting" operations. "Closed" means we apply an operation and stay in the graph class: DAGs aren't closed under marginalization, so if we marginalize a DAG we sometimes get something that isn't a DAG. An "interesting" operation would be something like conditioning: we can get independence after conditioning, which reduces the dimension of a model (fewer parameters are needed if there is independence).
So sometimes we can draw <-> simply to denote a conditional independence model that resembles those you get from a DAG with unobserved variables .... except Nature is annoying and doesn't actually have any underlying DAG.
If you are confused by this, you are in good company! I am still thinking very hard about what this means.
Strangely enough, I'm not confused by it, as until someone reduces quantum mechanics to some lower-level non-quantum physics (which, apparently is something a few people are actually working on), I've just gone and accepted that the real causative agent in Nature is a joint probability distribution that is allowed to set a whole tuple of nonlocal outcome variables as it evolves.
But anyway, yes, this means that's roughly the kind of "correlation arrow" I think should be drawn in a CDT causal graph to handle Newcomblike problems, with CDT being just very slightly modified to actually make use of those correlative arrows in setting its decision.
That would get us at least as far as CDT+E does, while also reducing the problem of discovering the "entanglements" to actually just learning correct beliefs about correlative arrows, hidden variables or no hidden variables.
I would again like to hear what's going on in the Counterfactual Mugging, as that looks like the first situation we cannot actually beat by learning correct causative and correlative beliefs, and then applying a proper "Causal and Correlative" Decision Theory.
Anyway, sometime this evening or something I'm going to watch your lecture, and email you for the slides.
Excellent and clear article.
Two comments: Using a Time Lord as Omega seems to introduce possible confusion (did Omega actually go to the future to check?), the classic version I think relies on a perfect prediction algorithm.
Botworld may be a good place to test out a CDT+E decision theory agent. Having the source code of an agent exposed to other agents is a way to entangle decisions, given the right setup.
Using a Time Lord as Omega seems to introduce possible confusion (did Omega actually go to the future to check?),
Bad joke?
Botworld may be a good place to test out a CDT+E decision theory agent
Just about what I'm looking for, aside from philh having said:
Implementing either of these algorithms [TDT or CDT] in general is beyond our current abilities.
Dang. I can take a look at contributing?
CDT, with the right graph, one-boxes. See Spohn 2012 (hosted by lukeprog over here).
I do think this is a step towards an algorithmic way to make the right graph. But I have a problem with this part:
Let us assume that the causal graph given to the agent contains three logical nodes: the actual agent making its choice to pay Omega $100, Omega's prediction of what the agent will do in this case, and Omega's imagination of the agent receiving $1,000 had the coin come up the other way.
Where do those three logical nodes come from? And it looks to me like we're not actually using the last one- am I not also entangled with agents in universes where Omega is lying about whether or not it would have provided me with $1,000, and in those cases, shouldn't I refuse to give it $100?
That is, there seems to me to be a difference between logical uncertainty and indexical uncertainty. It makes sense to entangle across indexical uncertainty, but it doesn't make sense to entangle across logical uncertainty.
And it looks to me like we're not actually using the last one- am I not also entangled with agents in universes where Omega is lying about whether or not it would have provided me with $1,000, and in those cases, shouldn't I refuse to give it $100?
I found that handling the Counterfactual Mugging "correctly" (according to Eliezer's intuitive argument of retroactively acting on rational precommitments) requires different machinery from other problems. You're right that we don't seem to be "using" the last one, if we act under weak entanglement, and won't pay Omega $100.
The problem is that in Eliezer's original specification of the problem, he explicitly noted that, unknown to us as the player, the coin is basically weighted. Omega isn't a liar, but there isn't even any significant quantity of MWI timelines in which the coin comes up heads and Parallel!Us actually receives the money. We're trying to decide the scenario in a way that favors a version of our agent who never exists outside Omega's imagination.
I understand the notion behind this - act now according to precommitments it would have been rational to make in the past - but my own intuitions label giving Omega the money an outright loss of $100 with no real purpose, given the knowledge that the coin cannot come up heads.
This might just mean I have badly-trained intuitions! After all, if I switch mental "scenarios" to Omega being not merely a friendly superintelligence or Time Lord but an actual Trickster Matrix Lord, then all of a sudden it seems plausible that I am the prediction copy, and that "real me" might still have a chance at $1000, and I should thus pay Omega my imaginary and worthless simulated money.
The problem is, that presupposes my being willing to believe in some other universe entirely outside my own (ie: outside the simulation) in which Omega's claim to have already flipped the coin and gotten tails is simply not true. It makes Omega at least a partial liar. It confuses the hell out of me, personally.
Another version of the entanglement proposition might be able to handle this, but it sacrifices the transitivity of entanglement (to what loss, I haven't found out):
Inductive entangled {Beliefs Decision Action} (a1 a2: Agent Beliefs Decision Action) d1 d2 :=
| ent : (forall (b: Beliefs), a1 b d1 = a2 b d1 /\ a1 b d2 = a2 b d2) -> entangled a1 a2 d1 d2.
On the upside, unlike "strong entanglement", it won't trivially lose on the Prisoners' Dilemma.
That is, there seems to me to be a difference between logical uncertainty and indexical uncertainty. It makes sense to entangle across indexical uncertainty, but it doesn't make sense to entangle across logical uncertainty.
Assume that the causal Bayes nets given as input to our decision algorithm contain only indexical uncertainty.
I do think this is a step towards an algorithmic way to make the right graph.
It's an interesting question where the wrong graph would ever come from in the first place, given that we cannot observe causation directly. If we were to run a bunch of copies of AIXI, for example, connected to a bunch of robotic arms, and let them observe the arms moving in unison, each would learn that it controls all the arms. Representing the arms' motions as independent would require extra data.
CDT, with the right graph, one-boxes. See Spohn 2012
I think Spohn also qualifies as an extension of CDT. It's been remarked before that Spohn's "intention nodes" are very similar to EY's "logical nodes" and by transitivity also CDT+E.
I think Spohn also qualifies as an extension of CDT.
Disagreed. By CDT I mean calculating utilities using:
U(A) = \sum_j P(O_j | do(A)) D(O_j)
(The only modification from the wikipedia article is that I'm using Pearl's clearer notation for P(A>Oj).)
The naive CDT setup for Newcomb's problem has a causal graph which looks like B->M<-P, where B is your boxing decision, P is Omega's prediction, and M is the monetary reward you receive. This causal graph disagrees with the problem statement, as it necessarily implies that B and P are unconditionally independent, which we know is not the case from the assumption that Omega is a perfect predictor. The causal graph that agrees with the problem statement is B->P->M and B->M, in which case one-boxing is trivially the right action.
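To make the difference between the two graphs concrete, here is a rough sketch that plugs each one into the utility formula above. The dollar amounts are the usual Newcomb payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one), which I am assuming since they are not stated in this thread; the function names and the `accuracy` parameter are likewise just for illustration.

```python
# (action, prediction) -> dollars received. Standard Newcomb amounts, assumed here.
PAYOFF = {
    ("one", "one"): 1_000_000,
    ("one", "two"): 0,
    ("two", "one"): 1_001_000,
    ("two", "two"): 1_000,
}

def utility_naive(action, q_predict_one):
    """Graph B -> M <- P: do(B) leaves the prediction at its prior q_predict_one."""
    return (q_predict_one * PAYOFF[(action, "one")]
            + (1 - q_predict_one) * PAYOFF[(action, "two")])

def utility_descendant(action, accuracy=1.0):
    """Graph B -> P -> M plus B -> M: the prediction responds to do(B)."""
    other = "two" if action == "one" else "one"
    return (accuracy * PAYOFF[(action, action)]
            + (1 - accuracy) * PAYOFF[(action, other)])

for q in (0.0, 0.5, 1.0):
    # Under the naive graph, two-boxing is ahead by the same margin for every prior.
    print(q, utility_naive("two", q) - utility_naive("one", q))

print(utility_descendant("one"), utility_descendant("two"))
```

The formula is identical in both functions; only P(O_j | do(A)) changes, and that comes from the graph, not from the decision rule.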
The bulk of Spohn's paper is all about how to get over the fear of backwards causation in hypothetical scenarios which explicitly allow backwards causation. You can call that an extension if you want, but it seems to me that's all in the counterfactual reasoning module, not in the decision-making module. (That is, CDT does not describe how you come up with P(Oj|do(A)), only what you do with it once you have it.)
Uh, doesn't the naive CDT setup for Newcomb's problem normally include a "my innards" node that has arrows going to both B and P? It's that that introduces the unconditional dependence between B and P. Obviously "B -> M <- P" by itself can't even express the problem because it can't represent Omega making any prediction at all.
Uh, doesn't the naive CDT setup for Newcomb's problem normally include a "my innards" node that has arrows going to both B and P?
If you decide what your innards are, and not what your action is, then this matches the problem description. If you can somehow have dishonest innards (Omega thinks I'm a one-boxer, then I can two-box), then this again violates the perfect prediction assumption.
I believe, as an empirical question, the first explicitly CDT accounts of Newcomb's problem did not use graphs, but if you convert their argument into a graph, it implicitly assumes "B -> M <- P."
If you can somehow have dishonest innards (Omega thinks I'm a one-boxer, then I can two-box), then this again violates the perfect prediction assumption.
Isn't the whole point of CDT that you cut any arrows from ancestor nodes with do(A) where A is your "intervention"? Obviously you can't have your innards imply your action if you explicitly violate that connection by describing your decision as an intervention.
Here is how I understood typical CDT accounts of Newcomb's problem: You have a graph given by B <- Innards -> P and B -> M <- P. Innards starts with some arbitrary prior probability since you don't know your decision beforehand. You perturb the graph by deleting Innards -> B in order to calculate p(M | do(B)), and in doing so you end up with a graph "looking like" B -> M <- P. Then the usual "dominance" arguments determine the decision regardless of the prior probability on Innards.
Of course, after doing this analysis and coming up with a decision you now know (unconditionally) the value of B and therefore Innards, so arguably the probabilities for those should be set to 1 or 0 as appropriate in the original graph. This is generally interpreted by CDTists as a proof that this agent always two-boxes, and always gets the smaller reward.
Isn't the whole point of CDT that you cut any arrows from ancestor nodes with do(A) where A is your "intervention"?
Yes. My point is that when you have a supernatural Omega, then putting any of Omega's actions in ancestor nodes of your decisions, instead of descendant nodes of your decisions, is a mistake that violates the problem description.
But if you don't delete the incoming arcs on your decision nodes then it isn't CDT anymore, it's just EDT.
Which raises the question of why we should bother with CDT in the first place.
Some people claim that EDT fails at "smoking lesion" type of problems, but I think it is due to incorrect modelling or underspecification of the problem. If you use the correct model EDT produces the "right" answer.
It seems to me that EDT is superior to CDT.
(Ilya Shpitser will disagree, but I never understood his arguments)
People have known how to deal with smoking lesion (under a different name) since the 18th century (hint: the solution is not the EDT solution):
http://www.e-publications.org/ims/submission/STS/user/submissionFile/12809?confirm=bbb928f0
The trick is to construct a system that deals with things 20 times more complicated than smoking lesion. That system is recent, and you will have to read e.g. my thesis, or Jin Tian's thesis, or elsewhere to see what it is.
I have yet to see anyone advocating EDT actually handle a complicated example correctly. Or even a simple tricky example, e.g. the front door case.
But if you don't delete the incoming arcs on your decision nodes then it isn't CDT anymore, it's just EDT.
You still delete incoming arcs when you make a decision. The argument is that if Omega perfectly predicts your decision, then causally his prediction must be a descendant of your decision, rather than an ancestor, because if it were an ancestor you would sever the connection that is still solid (and thus violate the problem description).
(Ilya Shpitser will disagree, but I never understood his arguments)
This is a shame, because he's right. Here's my brief attempt at an explanation of the difference between the two:
EDT uses the joint probability distribution. If you want to express a joint probability distribution as a graphical Bayesian network, then the direction of the arrows doesn't matter (modulo some consistency concerns). If you utilize your human intelligence, you might be able to figure out "okay, for this particular action, we condition on X but not on Y," but you do this for intuitive reasons that may be hard to formalize and which you might get wrong. When you use the joint probability distribution, you inherently assume that all correlation is causation, unless you've specifically added a node or data to block causation for any particular correlation.
CDT uses the causal network, where the direction of the arrows is informative. You can tell the difference between altering and observing something, in that observations condition things both up and down the causal graph, whereas alterations only condition things down the causal graph. You only need to use your human intelligence to build the right graph, and then the math can take over from there. For example, consider price controls: there's a difference between observing that the price of an ounce of gold is $100 and altering the price of an ounce of gold to be $100. And causal networks allow you to answer questions like "given that the price of gold is observed to be $100, what will happen when we force the price of gold to be $120?"
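Here is a small sketch of the observing-versus-altering distinction on a toy network H -> X, H -> Y, X -> Y, where H is a common cause. All the numbers are made up for illustration; the point is only that conditioning on X also updates beliefs about H (information flows up the graph), while do(X) cuts H -> X and leaves H at its prior.

```python
from itertools import product

# Hypothetical parameters for the toy network H -> X, H -> Y, X -> Y.
P_H = {0: 0.5, 1: 0.5}                  # P(H = h)
P_X1_GIVEN_H = {0: 0.1, 1: 0.9}         # P(X = 1 | H = h)
P_Y1_GIVEN_XH = {(0, 0): 0.1, (0, 1): 0.6,
                 (1, 0): 0.3, (1, 1): 0.8}  # P(Y = 1 | X = x, H = h)

def joint(h, x, y):
    """P(H = h, X = x, Y = y) from the causal factorization."""
    px = P_X1_GIVEN_H[h] if x == 1 else 1 - P_X1_GIVEN_H[h]
    py = P_Y1_GIVEN_XH[(x, h)] if y == 1 else 1 - P_Y1_GIVEN_XH[(x, h)]
    return P_H[h] * px * py

def p_y1_given_x(x):
    """Observation: ordinary conditioning, which also shifts the belief about H."""
    num = sum(joint(h, x, 1) for h in (0, 1))
    den = sum(joint(h, x, y) for h, y in product((0, 1), repeat=2))
    return num / den

def p_y1_do_x(x):
    """Intervention: the arc H -> X is deleted, so H keeps its prior."""
    return sum(P_H[h] * P_Y1_GIVEN_XH[(x, h)] for h in (0, 1))

print(p_y1_given_x(1))  # observing X = 1
print(p_y1_do_x(1))     # forcing X = 1
```

With these numbers the observed value is higher than the forced one, because seeing X = 1 is evidence that H = 1, and H independently raises Y.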
Now, if you look at the math, you can see a way to embed a causal network in a network without causation. So we could use more complicated networks and let conditioning on nodes do the graph severing for us. I think this is a terrible idea, both philosophically and computationally, because it entails more work and less clarity, both of which are changes in the wrong direction.
You still delete incoming arcs when you make a decision. The argument is that if Omega perfectly predicts your decision, then causally his prediction must be a descendant of your decision, rather than an ancestor, because if it were an ancestor you would sever the connection that is still solid (and thus violate the problem description).
If I understand correctly, in causal networks the orientation of the arcs must respect "physical causality", which I roughly understand to mean consistency with the thermodynamical arrow of time.
There is no way for your action to cause Omega's prediction in this sense, unless time travel is involved.
EDT uses the joint probability distribution. If you want to express a joint probability distribution as a graphical Bayesian network, then the direction of the arrows doesn't matter (modulo some consistency concerns).
Yes, different Bayesian networks can represent the same probability distribution. And why would that be a problem? The probability distribution and your utility function are all that matters.
When you use the joint probability distribution, you inherently assume that all correlation is causation, unless you've specifically added a node or data to block causation for any particular correlation.
"Correlation vs causation" is an epistemic error. If you are making it then you are using the wrong probability distribution, not a "wrong" factorization of the correct probability distribution.
If I understand correctly, in causal networks the orientation of the arcs must respect "physical causality", which I roughly understand to mean consistency with the thermodynamical arrow of time.
In the real world, this is correct, but it is not mathematically necessary. (To go up a meta level, this is about how you build causal networks in the first place, not about how you reason once you have a causal network; even if philosophers were right about CDT as the method to go from causal networks to decisions, they seem to have been confused about the method by which one goes from English problem statements to causal networks when it comes to Newcomb's problem.)
unless time travel is involved.
It is. How else can Omega be a perfect predictor? (I may be stretching the language, but I count Laplace's Demon as a time traveler, since it can 'see' the world at any time, even though it can only affect the world at the time that it's at.)
Yes, different Bayesian networks can represent the same probability distribution. And why would that be a problem?
The problem is that you can't put any meaning into the direction of the arrows because they're arbitrary.
"Correlation vs causation" is an epistemic error. If you are making it then you are using the wrong probability distribution, not a "wrong" factorization of the correct probability distribution.
If you give me a causal diagram and the embedded probabilities for the environment, and ask me to predict what would happen if you did action A (i.e. counterfactual reasoning), you've already given me all I need to calculate the probabilities of any of the other nodes you might be interested in, for any action included in the environment description.
If you give me a joint probability distribution for the environment, and ask me to predict what would happen if you did action A, I don't have enough information to calculate the probabilities of the other nodes. You need to give me a different joint probability distribution for every possible action you could take. This requires a painful amount of communication, but possibly worse is that there's no obvious type difference between the joint probability distribution for the environment and for the environment given a particular action--and if I calculate the consequences of an action given the whole environment's data, I can get it wrong.
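A tiny sketch of why the joint alone is not enough: the same joint distribution over two binary variables is compatible with both X -> Y and Y -> X, and the two graphs give different answers for what happens when you force X. The table below is made up for illustration, and the function names are hypothetical.

```python
# One joint distribution P(X = x, Y = y); the numbers are hypothetical.
JOINT = {(0, 0): 0.4, (0, 1): 0.1, (1, 0): 0.1, (1, 1): 0.4}

def p_y1_do_x_if_x_causes_y(x):
    """Graph X -> Y: forcing X behaves like observing it, so use P(Y = 1 | X = x)."""
    return JOINT[(x, 1)] / (JOINT[(x, 0)] + JOINT[(x, 1)])

def p_y1_do_x_if_y_causes_x(x):
    """Graph Y -> X: do(X) cuts the arc Y -> X, so Y keeps its marginal P(Y = 1)."""
    return JOINT[(0, 1)] + JOINT[(1, 1)]

print(p_y1_do_x_if_x_causes_y(1))
print(p_y1_do_x_if_y_causes_x(1))
```

Both functions read from the same table; the disagreement comes entirely from the arrow direction, which the table does not contain.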
In the real world, this is correct, but it is not mathematically necessary.
If you take physical causality out of the picture, then the orientation of the arcs is underspecified in the general case. But then, since you are only allowed to cut arcs that are incoming to the decision nodes, your decision model will be underspecified.
It is. How else can Omega be a perfect predictor?
If you are going to allow time travel, defined in a broad sense, then your causal network will have cycles.
The problem is that you can't put any meaning into the direction of the arrows because they're arbitrary.
But the point is that in EDT you don't care about the direction of the arrows.
If you give me a causal diagram and the embedded probabilities for the environment, and ask me to predict what would happen if you did action A (i.e. counterfactual reasoning), you've already given me all I need to calculate the probabilities of any of the other nodes you might be interested in, for any action included in the environment description.
If I give you a causal diagram for Newcomb's problem (or some variation thereof) you will make a wrong prediction, because causal diagrams can't properly represent it.
If you give me a joint probability distribution for the environment, and ask me to predict what would happen if you did action A, I don't have enough information to calculate the probabilities of the other nodes.
If the model includes myself as well as the environment, you will be able to make the correct prediction.
Of course, if you give this prediction back to me, and it influences my decision, then the model has to include you as well. Which may, in principle, cause Godelian self-reference issues. But that's a fundamental limit of the logic capabilities of any computable system, there are no easy ways around it.
But that's not as bad as it sounds: the fact that you can't precisely predict everything about yourself doesn't mean that you can't predict anything or that you can't make approximate predictions.
(for instance, GCC can compile and optimize GCC)
Causal decision models are one way to approximate hard decision problems, and they work well in many practical cases. Newcomb-like scenarios are specifically designed to make them fail.
But the point is that in EDT you don't care about the direction of the arrows.
Yes, and the fact that EDT does not assign meaning to the direction of the arrows is why it's a less powerful language for describing environments.
If I give you a causal diagram for Newcomb's problem (or some variation thereof) you will make a wrong prediction, because causal diagrams can't properly represent it.
If you allow retrocausation, I don't see why you think this is the case.
Yes, and the fact that EDT does not assign meaning to the direction of the arrows is why it's a less powerful language for describing environments.
I'm not convinced that this is the case.
Arrow orientation is an artifact of Bayesian networks, not a fundamental property of the world.
Arrow orientation is an artifact of Bayesian networks, not a fundamental property of the world.
! Causation going in one direction (if the nodes are properly defined) does appear to be a fundamental property of the real world.
I'm not sure what we are disagreeing about.
In CDT you need causal Bayesian networks where the arrow orientation reflects physical causality.
In EDT you just need probability distributions. You can represent them as Bayesian networks, but in this case arrow direction doesn't matter, up to certain consistency constraints.
Why would EDT not having causal arrows be a problem?
Why would EDT not having causal arrows be a problem?
Because the point of making decisions is to cause things to happen, and so encoding information about causality is a good idea.
Disagree. The directionality of causation appears to be a consequence of the Second Law of Thermodynamics, which is not a fundamental law.
All the microscopic laws are completely compatible with there being a region of space-time more or less like ours, but in reverse, with entropy decreasing monotonically. In fact, in a sufficiently large world, such a region is to be expected, since the Second Law is probabilistic. In this region, matches will light before (from our perspective) they are struck, and ripples in a pond will coalesce to a single point and eject a rock from the pond. If we use nodes similar to the ones we do in our environment, then in order to preserve the Causal Markov Condition, we would have to draw arrows in the opposite temporal direction.
Causation is not a useful concept when we're talking about the fundamental level of nature, precisely because all fundamental interactions (with some very obscure exceptions) are completely time-symmetric. Causation (and the whole DAG framework) becomes useful when we move to the macroscopic world of temporally asymmetric phenomena. And the temporal asymmetry is just a manifestation of the Second Law.
Causation is not a useful concept when we're talking about the fundamental level of nature, precisely because all fundamental interactions (with some very obscure exceptions) are completely time-symmetric.
Assuming CPT symmetry, the very reason why there's still matter in the universe (as opposed to it all having annihilated with antimatter) in the first place must be one of those very obscure exceptions.
It's true that CP-violations appear to be a necessary condition for the baryon asymmetry (if you make certain natural-seeming assumptions). It's another question whether the observed CP-violations are sufficient for the asymmetry, if the other Sakharov conditions are met. And one of the open problems in contemporary cosmology is precisely that they don't appear to be sufficient, that the subtle CP-violations we have observed so far (only in four types of mesons) are too subtle to account for the huge asymmetry between matter and anti-matter. They would only account for a tiny amount of that asymmetry. So, yeah, the actual violations of T-symmetry we see are in fact obscure exceptions. They are not sufficient to account for either the pervasive time asymmetry of macroscopic phenomena or the pervasive baryon asymmetry at the microscopic level. There are two ways to go from here: either there must be much more significant CP-violations that we haven't yet been able to observe, or the whole Sakharov approach of accounting for the baryon asymmetry dynamically is wrong, and we have to turn to another kind of explanation (anthropic, maybe?). The latter option is what we have settled on when it comes to time asymmetry -- we have realized that a fundamental single-universe dynamical explanation for the Second Law is not on the cards -- and it may well turn out to be the right option for the baryon asymmetry as well.
It's also worth noting that CP-violations by themselves would be insufficient to account for the asymmetry, even if they were less obscure than they appear to be. You also need the Second Law of Thermodynamics (this is the third Sakharov condition). In thermodynamic equilibrium any imbalance between matter and anti-matter generated by CP-violating interactions would be undone.
In any case, even if it turns out that CP-violating interactions are plentiful enough to account for the baryon asymmetry, they still could not possibly account for macroscopic temporal asymmetry. The particular sort of temporal asymmetry we see in the macroscopic world involves the disappearance of macroscopically available information. Microscopic CP-violations are information-preserving (they are CPT symmetric), so they cannot account for this type of asymmetry. If there is going to be a fundamental explanation for the arrow of time it would have to involve laws that don't preserve information. The only serious candidate for this so far is (real, not instrumental) wavefunction collapse, and we all know how that theory is regarded around these parts.
I should make clear that by 'fundamental' I was not speaking in terms of physics, but in terms of decision theory, where causation does seem to be of central importance.
If we use nodes similar to the ones we do in our environment, then in order to preserve the Causal Markov Condition, we would have to draw arrows in the opposite temporal direction.
This reads to me like "conditioning on us being in a weird part of the universe where less likely events are more likely, then when we apply the assumption that we're in a normal part of the universe where more likely events are more likely we get weird results." And, yes, I agree with that reading, and I'm not sure what you want that to imply.
I wanted to imply that the temporal directionality of causation is a consequence of the Second Law of Thermodynamics. I guess the point would be that the "less likely" and "more likely" in your gloss are only correct if you restrict yourself to a macroscopic level of description. Described microscopically, both regions are equally likely, according to standard statistical mechanics. This is related to the idea that non-fundamental macroscopic factors make a difference when it comes to the direction of causal influence.
But yeah, this was based on misreading your use of "fundamental" as referring to physical fundamentality. If you meant decision-theoretically fundamental, then I agree with you. I thought you were espousing the Yudkowsky-esque line that causal relations are part of the fundamental furniture of the universe and that the Causal Markov Condition is deeper and more fundamental than the Second Law of Thermodynamics.
"Correlation vs causation" is an epistemic error. If you are making it then you are using the wrong probability distribution, not a "wrong" factorization of the correct probability distribution.
The point here is that if you have the correct probability distribution, all its predictions will be correct (i.e. have minimum expected regret). It seems that the difference between epistemology and decision theory can't be emphasized enough. If it's possible for your "mixing up correlation and causation" to result in you making an incorrect prediction and being surprised (when a different prediction would have been systematically more accurate), then there must be an error in your probability distribution.
If you give me a joint probability distribution for the environment, and ask me to predict what would happen if you did action A, I don't have enough information to calculate the probabilities of the other nodes.
But an arbitrary joint probability distribution can assign P(stuff | action=A) to any values whatsoever. What stops you from just setting all conditional probabilities to the correct values (ie. those values such that they "predict what would happen if you did action A" correctly, which would be the output of P(stuff|do(A)) on the "correct" causal graph)?
And furthermore, if that joint distribution does make optimal predictions (assuming that this "counterfactual reasoning" results in optimal predictions, because I can't see any other reason you'd use a set of probabilities), then clearly it must be the probability distribution that is mandated by Cox's theorem, etc etc.
Note, there is a free variable in the above, which is the unconditional probabilities P(A). But as long as the optimal P(A) values are all nonzero (which is the case if you don't know the agent's algorithm, for example), the optimality of the joint distribution requires P(stuff|A) to be correct.
So it would seem like if you have the correct probability distribution, you can predict what would happen if I did action A, by virtue of me giving you the answers. Unless I've made a fatal mistake in the above argument.
If it's possible for your "mixing up correlation and causation" to result in you making an incorrect prediction and being surprised (when a different prediction would have been systematically more accurate), then there must be an error in your probability distribution.
In the smoking lesion variant where smoking is actually protective against cancer, but not enough to overcome the damage done by the lesion (leading to a Simpson's Paradox), standard EDT recommends against smoking (because it increases your chance of having a lesion) and standard CDT recommends for smoking (because you sever the link to having a lesion, and so only the positive direct effect remains). They give different estimates of the difference between the probability of getting cancer given that you chose to start smoking and the probability of getting cancer given that you chose not to smoke, because EDT doesn't natively understand the difference between "are a smoker" and "chose to start smoking." If you understand the difference, you can fudge things so that EDT works while you're actively putting effort into it.
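To make the disagreement concrete, here is a minimal numerical sketch of the protective-smoking variant. Every probability below is a made-up assumption chosen only to produce the Simpson's Paradox pattern; nothing here comes from real data. It compares the observational quantity P(cancer | smoke) with the interventional quantity P(cancer | do(smoke)), where the latter drops the lesion-to-smoking dependence.

```python
# Hypothetical numbers for the "protective smoking" lesion variant (all assumed).
P_LESION = 0.5
P_SMOKE_GIVEN_LESION = {True: 0.8, False: 0.2}  # P(smoke | lesion status)

# P(cancer | smoke, lesion): smoking lowers risk by 0.1 within each stratum.
P_CANCER = {
    (True, True): 0.8,    # smoke, lesion
    (False, True): 0.9,   # no smoke, lesion
    (True, False): 0.1,   # smoke, no lesion
    (False, False): 0.2,  # no smoke, no lesion
}


def p_lesion(lesion):
    return P_LESION if lesion else 1.0 - P_LESION


def p_smoke(smoke, lesion):
    p = P_SMOKE_GIVEN_LESION[lesion]
    return p if smoke else 1.0 - p


def p_cancer_given(smoke):
    """Observational P(cancer | smoke): smoking stays correlated with the lesion."""
    num = sum(p_lesion(l) * p_smoke(smoke, l) * P_CANCER[(smoke, l)]
              for l in (True, False))
    den = sum(p_lesion(l) * p_smoke(smoke, l) for l in (True, False))
    return num / den


def p_cancer_do(smoke):
    """Interventional P(cancer | do(smoke)): the lesion -> smoke arrow is cut."""
    return sum(p_lesion(l) * P_CANCER[(smoke, l)] for l in (True, False))


if __name__ == "__main__":
    print("conditioning:", round(p_cancer_given(True), 2), round(p_cancer_given(False), 2))
    print("intervening: ", round(p_cancer_do(True), 2), round(p_cancer_do(False), 2))
```

Under these assumed numbers, conditioning makes smoking look harmful (about 0.66 versus 0.34), while intervening makes it look protective (about 0.45 versus 0.55), which is the split between "standard EDT" and CDT described above.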
But an arbitrary joint probability distribution can assign P(stuff | action=A) to any values whatsoever.
This is correct. You can remove the causality from a causal network and just use EDT on a joint probability distribution at the cost of increasing the number of nodes and the fan-in for each node. Since the memory requirements are exponential in fan-in and linear in number of nodes, this is a bad idea.
Besides the memory requirements, this adds another problem: in a causal network, we share parameters that are not shared in the 'decaused' network. This is necessary in order to be able to represent all possible mutilated graphs as marginals of the joint probability distribution, but means that if we're trying to learn the parameters from observational data instead of getting from another source, we need much more data to get estimates that are as good. We can apply equality constraints, but then we might as well use CDT because we're either using the equality constraints implied by CDT (and are thus correct) or we screwed something up.
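As a rough illustration of the memory point, the sketch below counts free parameters for binary variables only (an assumption made for simplicity). The helper names and the 20-node chain are hypothetical. A network with small fan-in needs a number of parameters that grows linearly with the node count, while an unrestricted joint table grows exponentially.

```python
def cpt_parameters(parents):
    """Free parameters for binary nodes: each node needs 2**fan_in numbers."""
    return sum(2 ** len(ps) for ps in parents.values())


def joint_parameters(n_nodes):
    """Free parameters in an unrestricted joint table over n binary nodes."""
    return 2 ** n_nodes - 1


# Hypothetical sparse network: a chain of 20 binary nodes, each with one parent
# (node 0 is the root and has none).
chain = {i: ([i - 1] if i > 0 else []) for i in range(20)}

if __name__ == "__main__":
    print("chain network:", cpt_parameters(chain))
    print("full joint:   ", joint_parameters(len(chain)))
```

This only counts storage; it says nothing about the separate point regarding how much data is needed to estimate shared versus unshared parameters.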
There also seem to be numerous philosophical benefits to using the language of counterfactuals and conditionals, over just the language of conditionals. Causal networks really are more powerful, in the sense that Paul Graham describes here.
So it would seem like if you have the correct probability distribution, you can predict what would happen if I did action A, by virtue of me giving you the answers.
If you give me a joint probability distribution which I can marginalize over any possible action, yes, I can do those predictions because you gave me the answers.
But what use is an algorithm that, when you give it the answers, merely doesn't destroy them? We want something that takes environments as inputs and outputs decisions as outputs, because then it will do the work for us.
In the smoking lesion variant where smoking is actually protective against cancer, but not enough to overcome the damage done by the lesion ...
I tend to be sceptical of smoking lesion arguments on account of how the scenario always seems to be either underspecified or contradictory. For example, how can any agents in the smoking lesion problem be EDT agents at all?
If they always take the action recommended by EDT, and there is exactly one such action, then they must all take the same action. But in that case there can't possibly be the postulated connection between the lesion and smoking (conditional on being an EDT agent). So an EDT agent that knows it implements EDT can't believe that its decision to smoke affects the chances of having the lesion, on pain of making incorrect predictions.
On the other hand, if "EDT agents" in this problem only sometimes take the action recommended by EDT, and the rest of the time are somehow influenced by the presence or absence of the lesion, then the description of the problem that says that the node controlled by your decision theory is "decision to smoke" would seem to be wrong to begin with. (These EDT agents will predict that P(I smoke | I smoke) = 1 and be horribly surprised.)
This is correct. You can remove the causality from a causal network and just use EDT on a joint probability distribution at the cost of increasing the number of nodes and the fan-in for each node. Since the memory requirements are exponential in fan-in and linear in number of nodes, this is a bad idea.
This is something I can believe, though it is not a correctness argument. Certainly it's plausible that in many scenarios it is computationally more convenient to apply CDT directly than to use a fully general model that has been taught about the same structure that CDT assumes.
For example, how can any agents in the smoking lesion problem be EDT agents at all?
In the statement of the smoking lesion problem I prefer, you have lots of observational data on people whose decision theory is unknown, but whose bodies are similar enough to yours that you think the things that give or don't give them cancer will have the same effect on you. You also don't know whether or not you have the lesion; a sensible prior is the population prevalence of the lesion.
Now it looks like we have a few options.
Option 1 is unworkable. Option 2 is what I call 'standard EDT,' and it fails on the smoking lesion. Option 3 is generally the one EDTers use to rescue EDT from the smoking lesion. But the issue is that EDT gives you no guidance on which of the correlations to break; you have to figure it out from the problem description. One might expect that sitting down and working out whether or not to smoke using math breaks the correlation between smoking and having the lesion, as most people don't do that. But should we also break the negative correlation between smoking and cancer conditional on lesion status? From the English names, we can probably get those right. If they're unlabeled columns in a matrix or nodes in a graph, we'll have trouble.
That work still has to be done somewhere, obviously; in CDT it's done when one condenses the problem statement down to a causal network. (And CDTers historically being wrong on Newcomb's is an example of what doing this work wrong looks like.) But putting work where it belongs and having good interfaces between your modules is a good idea, and I think this is a place where CDT does solidly better than EDT.
Certainly it's plausible that in many scenarios it is computationally more convenient to apply CDT directly than to use a fully general model that has been taught about the same structure that CDT assumes.
I do think the linked Graham article is well worth reading; that all languages necessarily turn into machine code does not mean all languages are equally good for thinking in. Thinking in a more powerful language lets you have more powerful thoughts.
Smoking lesion is a problem with a logical contradiction in it. The decision is simultaneously a consequence of the lesion, and of the decision theory's output (but not one of its inputs, such as e.g. the desire to smoke, in which case it's this desire that will correlate, and conditional on that desire, the decision itself won't).
edit: smoking lesion problem seems more interesting from psychological perspective. Perhaps it is difficult to detect internal contradictions within a hypothetical that asserts an untruth - any "this smells fishy" feeling is mis-attributed to the tension between the fact of how smoking kills and the hypothetical genetics.
It could, thus, be very useful to come up with a real world example instead of using such hypotheticals.
In traditional decision theory as proposed by bayesians such as Jaynes, you always condition on all observed data. The thing that tells you whether any of this observed data is actually relevant is your model, and it does this by outputting a joint probability distribution for your situation conditional on all that data. (What I mean by "model" here is expressed in the language of probability as a prior joint distribution P(your situation × dataset | model), or equivalently a conditional distribution P(your situation | dataset, model) if you don't care about computing the prior probabilities of your data.)
Option 2 is what I call "blindly importing related historical data as if it was a true description of your situation". Clearly any model that says that the joint probability for your situation is identically equal to the empirical frequencies in any random data set is wrong.
From the English names, we can probably get those right. If they're unlabeled columns in a matrix or nodes in a graph, we'll have trouble.
The point is, it's not about figuring stuff out from English names. It's about having a model that correctly generalises from observed data to predictions. Unlabeled columns in a matrix are no trouble at all if your model relates them to the nodes in your personal situation in the right way.
The CDT solution of turning the problem into a causal graph and calculating probabilities with do(·) is effectively just such a model, that admittedly happens to be an elegant and convenient one. Here the information that allows you to generalise from observed data to make personal predictions is introduced when you use your human intelligence to figure out a causal graph for the situation.
Still, none of this addresses the issue that the problem itself is underspecified.
ETA: Lest you think I've just said that CDT is better than EDT, the point I'm trying to make here is that if you want a decision theory to generalise from data, you need to provide a model. "Your situation has the same probabilities as a causal intervention on this causal graph on that dataset, where nodes {A, B, C, ...} match up to nodes {X, Y, Z, ...}" is as good a model as any, and can certainly be used in EDT. The fact that EDT doesn't come "model included" is a feature, not a bug.
Option 2 is what I call "blindly importing related historical data as if it was a true description of your situation". Clearly any model that says that the joint probability for your situation is identically equal to the empirical frequencies in any random data set is wrong.
Agreed that this is a bad idea. I think where we disagree is that I don't see EDT as discouraging this. It doesn't even throw a type error when you give it blindly imported related historical data! CDT encourages you to actually think about causality before making any decisions.
It's about having a model that correctly generalises from observed data to predictions.
Note that decision theory does actually serve a slightly different role from a general prediction module, because it should be built specifically for counterfactual reasoning. The five-and-ten argument seems to be an example of this: if while observing another agent, you see them choose $5 over $10, it could be reasonable to update towards them preferring $5 to $10. If considering the hypothetical situation where you choose $5 instead of $10, it does not make sense to update towards yourself preferring $5 to $10, or to draw whatever conclusion you like by the principle of explosion.
that admittedly happens to be an elegant and convenient one.
Given that you can emulate one system using the other, I think that elegance and convenience are the criteria we should use to choose between them. Note that emulating a joint probability without causal knowledge using a causal network is trivial- you just use undirected edges for any correlations- but emulating a causal network using a joint probability is difficult.
"Your situation has the same probabilities as a causal intervention on this causal graph on that dataset, where nodes {A, B, C, ...} match up to nodes {X, Y, Z, ...}" is as good a model as any, and can certainly be used in EDT. The fact that EDT doesn't come "model included" is a feature, not a bug.
Precisely.
Imagine, instead of the smoking lesion, a "death paradox lesion". Statistical analysis has shown that this lesion is associated with early death, and also that it is correlated with the ability of the agent to make correct logical decisions.
Assume you don't want an early death. Should you conclude that you have a death paradox lesion?
There's also the scenario involving the EDT paradox lesion. This lesion is 1) correlated with early death, and 2) correlated with people's use of EDT in the same way that the smoking lesion is correlated with smoking. What do you conclude and why?
I don't understand most of your position on EDT/CDT, but I especially don't understand how
But in that case there can't possibly be the postulated connection between the lesion and smoking (conditional on being an EDT agent).
follows from the previous sentence.
I also thought P(A|A)=1 followed from the axioms of probability.
In the smoking lesion variant where smoking is actually protective against cancer, but not enough to overcome the damage done by the lesion (leading to a Simpson's Paradox), standard EDT recommends against smoking (because it increases your chance of having a lesion) and standard CDT recommends for smoking (because you sever the link to having a lesion, and so only the positive direct effect remains).
Smoking lesion problems are generally underspecified. If you can fill in additional detail, the "correct" decision changes. And I argue that a properly applied EDT outputs it.
Consider the scenario where the lesion affects your probability of smoking by affecting your conscious preferences.
The correct decision is smoke, and EDT outputs it if you condition on the preferences.
In another scenario, an evil Omega probes you before you are born. If and only if it predicts that you will be a smoker, it puts a cancer lesion in your DNA (Omega is a good, though not necessarily perfect predictor).
The cancer lesion doesn't directly "cause" smoking, or, in the language of probability theory, it doesn't correlate with smoking conditioned on Omega's prediction.
The correct decision is don't smoke, and EDT outputs it since the problem is exactly isomorphic to Newcomb's standard problem. CDT gets it wrong.
The argument is that if Omega perfectly predicts your decision, then causally his prediction must be a descendant of your decision
The problem is that this can lead to inconsistency when you have two omegas trying to predict each other.
The problem is that this can lead to inconsistency when you have two omegas trying to predict each other.
This is one of the arguments against the possibility of Laplace's Demon, and I agree that a world with two Omegas is probably going to be inconsistent.
It should be noted that this also makes transparent Newcomb ill-posed because the transparent boxes make the box-picker essentially an omega.
You say "disagreed" but then end up saying what I meant in the last paragraph.
Consider that I may have read Spohn before.
You say "disagreed" but then end up saying what I meant in the last paragraph
I think that we're arguing about whether the label CDT refers to just the utility calculation or the combination of the utility calculation and the counterfactual module, not about any of the math. I can go into the reasons why I like to separate those two out, but I think I've already covered the basics.
Consider that I may have read Spohn before.
I generally aim to include the audience when I write comments, which sometimes has the side effect of being insultingly basic to the person I'm responding to. Normally I'm more careful about including disclaimers to that effect, and I apologize for missing that this time.
Where do those three logical nodes come from? And it looks to me like we're not actually using the last one- am I not also entangled with agents in universes where Omega is lying about whether or not it would have provided me with $1,000, and in those cases, shouldn't I refuse to give it $100?
On further thought, I would like to see someone explain exactly why I should give Omega $100. I've heard it phrased as the retroactive following-through of rational precommitments, and I've also heard it phrased as reflective self-consistency, in the sense that even if the coin was guaranteed to come up tails this time, we should pay because we like having Omega offer us bets where the expected value is good in general (with a $1,000 payout against a $100 payment, we break even over time as long as 1 in 11 coins comes up heads; more than that and we profit). The former case, I don't know how to handle. The latter case, I think we could represent using some form of CDT+E, or CDT over a causal-and-correlative model of beliefs.
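For anyone who wants to check the break-even arithmetic, here is a small sketch. It assumes the payoffs used in this thread ($1,000 received on heads, $100 paid on tails) and a policy of always paying; the function names are hypothetical.

```python
from fractions import Fraction


def expected_gain(p_heads, reward=1000, cost=100):
    """Expected value per offer for an agent that always pays on tails."""
    return p_heads * reward - (1 - p_heads) * cost


def break_even(reward=1000, cost=100):
    """Heads probability at which always paying has zero expected value."""
    # Solve p * reward = (1 - p) * cost for p.
    return Fraction(cost, reward + cost)


if __name__ == "__main__":
    p = break_even()
    print("break-even P(heads):", p)
    print("expected gain there:", expected_gain(p))
```

Using exact fractions avoids floating-point noise around the zero point. With a fair coin the same policy is worth $450 per offer in expectation under these assumed payoffs.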
On further thought, I would like to see someone explain exactly why I should give Omega $100.
Personally, I think all of the work is being done by Omega's super-trustworthiness, and so I don't think it's a reasonable scenario to optimize for. In the real world, making a 'rational precommitment' on information you don't possess seems like the reference class of 'scams.'
(Note that I am explicitly avoiding the question of what the right thing to do is; I don't think my decision theory is currently well-equipped to handle this problem, and I'm okay with that.)
Ok, I've been talking it over with Benjamin Fox some more, and I don't think Omega's trustworthiness is the issue here. The issue is basically to come up with some decision-theoretic notion of "virtue": "I should take action X because, timelessly speaking, a history in which I always respond to choice Y with action X nets me more money/utility/happiness than any other." The idea is that taking action X or not doing so in any one particular instance can change which history we're enacting, while normal decision theories reason only over the scope of a single choice-instance, with little regard for potential futures about which we don't have specific information encoded in our causal graph.
The idea is that taking action X or not doing so in any one particular instance can change which history we're enacting, while normal decision theories reason only over the scope of a single choice-instance, with little regard for potential futures about which we don't have specific information encoded in our causal graph.
It seems to me that the impacts of being virtuous on one's potential future is enough to justify being virtuous, and one does not need to take into account the impacts of being virtuous on alternative presents one might have faced instead. (Basically, instead of trusting that Omega would have given you something in an alternate world, you are trusting that human society is perceptive enough to notice and reward enough of your virtues to justify having them.)
Yes, we agree. "I will get rewarded for this behavior in the future at a profitable rate to justify my sacrifice in the present" is a reason to "self-sacrifice" in the present. The question is how to build a decision-theory that can encode this kind of knowledge without requiring actual prescience (that is, without needing to predict the specific place and time in which the agent will be rewarded).
Even using that notion of virtue, whether giving Omega the $100 benefits you only happens if Omega is trustworthy. So Omega's trustworthiness can still be a deciding factor.
Omega's trustworthiness mostly just means we can assign a degenerate probability of 1.0 to all information we receive from Omega.
cousin_it, you might find this paper interesting:
http://www.hsph.harvard.edu/james-robins/files/2013/03/new-approach.pdf
In particular, Figures 3.1 and 3.2 very much resemble von Neumann's graphical representation of extensive-form games. The author of the above told me he was not aware of von Neumann's work when he wrote it. I would like to extend extensive-form games to handle confounding properly (which is what the above reference does, in the context of longitudinal studies in epidemiology, e.g. games vs Nature).
I haven't thought about this carefully, but much of UDT stuff bothers me because it tries to extend EDT, and thus fails whenever confounding shows up.
I haven't seen that UDT paper, and will now consume it to gain its knowledge read it.
Author's Note: Please let me know in the comments exactly what important background material I have missed, and exactly what I have misunderstood, and please try not to mind that everything here is written in the academic voice.
Abstract: Timeless Decision Theory often seems like the correct way to handle many game-theoretical dilemmas, but has not quite been satisfactorily formalized and still handles certain problems the wrong way. We present an intuition that helps us extend Causal Decision Theory towards Timeless Decision Theory while adding rigor, and then formalize this intuition. Along the way, we describe how this intuition can guide both us and programmed agents in various Newcomblike games.
Introduction
One day, a Time Lord called Omega drops out of the sky, walks up to me on the street, and places two boxes in front of me. One of these is opaque, the other is transparent and contains $1000. He tells me I can take either the opaque box alone, or both boxes, but that if and only if he predicted using his Time Lord Science I would take just the opaque box, it contains $1,000,000. He then flies away back to his home-world of Gallifrey. I know that whatever prediction he made was/will be correct, because after all he is a Time Lord.
The established, gold-standard algorithm of Causal Decision Theory fails to win the maximum available sum of money on this problem, just as it fails on a symmetrical one-shot Prisoner's Dilemma. In fact, as human beings, we can say that CDT fails miserably, because while a programmed agent goes "inside the game" and proceeds to earn a good deal less money than it could, we human observers are sitting outside, carefully drawing outcome tables that politely inform us of just how much money our programmed agents are leaving on the table. While purely philosophical controversies abound in the literature about the original Newcomb's Problem, it is generally obvious from our outcome tables in the Prisoners' Dilemma that "purely rational" CDT agents would very definitely benefit by cooperating, and that actual human beings asked to play the game calculate outcomes as if forming coalitions rather than as if maximizing personal utility -- thus cooperating and winning. Even in the philosophical debates, it is generally agreed that one-boxers in Newcomb's Problem are, in fact, obtaining more money.
While some have attempted to define rationality as the outputs of specific decision algorithms, we hold with the school of thought that rationality means minimizing regret: a rational agent should select its decision algorithms in order to win as much as it will know it could have won ex-post-facto. Failing perfection, this optimum should be approximated as closely as possible.
Yudkowsky's Timeless Decision Theory approaches this problem by noting that many so-called decisions are actually outcomes from concurrent or separated instantiations of a single algorithm, that Timeless Decision Theory itself is exactly such an algorithm, and that many decisions (that actually are decisions in the sense that the algorithm deciding them is a utility-maximizing decision-theory) are acausally, timelessly connected. Agents running TDT will decide not as if they are determining one mere assignment to one mere variable in a causal graph but as if they're determining the output of the computation they implement, and thus of every logical node in the entire graph derived from their computation. However, it still has some kinks to work out:
The bolding is added by the present authors, as it highlights the issue we intend to address here. Terms like "timeless" and "acausal" have probably caused more confusion around Timeless Decision Theory than any other aspect of what is actually an understandable and reasonable algorithm. I will begin by presenting a clearer human-level intuition behind the correct behavior in Newcomb's Problem and the Prisoner's Dilemma, and will then proceed to formalize that intuition in Coq and apply it to sketch a more rigorously algorithmic Timeless Decision Theory. The formalization of this new intuition avoids problems of infinite self-reference or infinite recursion in reasoning about the algorithms determining decisions of oneself or others.
Timeless decisions are actually entangled with each other
The kind of apparent retrocausality present in Newcomb's Problem makes no intuitive sense whatsoever. Not only our intuitions but all our knowledge of science tell us that (absent the dubious phenomenon of closed timelike curves) causal influences always and only flow from the past to the future, never the other way around. Nonetheless, in the case of Newcomb-like problems, it has been seriously argued that:
We do not believe in retrocausality, at least not as an objective feature of the world. Any subjectively apparent retrocausality, we believe, must be some sort of illusion that reduces to genuine, right-side-up causality. Timeless or acausal decision-making resolves the apparent retrocausality by noticing that different "agents" in Newcomblike problems are actually reproductions of the same algorithm, and that they can thus be logically correlated without any direct causal link.
We further prime our intuitions about Newcomb-like problems with the observation that CDT-wielding Newcomb players who bind themselves to a precommitment to one-box before Omega predicts their actions will win the $1,000,000:
At t = 0 you can take a pill that turns you into a “one boxer”. The pill will lead the mad scientist to predict (at t = ½) that you will take one box, and so will cause you to receive £1,000,000 but will also cause you to leave a free £1,000 on the table at t = 1. CDT tells you to take the pill at t = 0: it is obviously the act, among those available at t = 0, that has the best overall causal consequences.
The "paradox", then, lies in how the CDT agent comes to believe that their choice is completely detached from which box contains how much money, when in fact Omega's prediction of their choice was accurate, and directly caused Omega to place money in boxes accordingly, all of this despite no retrocausality occurring. Everything makes perfect sense prior to Omega's prediction.
What, then, goes wrong with CDT? CDT agents will attempt to cheat against Omega: to be predicted as a one-boxer and then actually take both boxes. If given a way to obtain more money by precommitting to one-boxing, they will do so, but will subsequently feel regret over having followed their precommitment and "irrationally" taken only one box when both contained money. They may even begin to complain about the presence or absence of free will, as if this could change the game and enable their strategy to actually work.
When we cease such protestations and accept that CDT behaves irrationally, the real question becomes: which outcomes are genuinely possible in Newcomb's Problem, which outcomes are preferable, and why does CDT fail to locate these?
Plainly if we believe that Omega has a negligible or even null error rate, then in fact only two outcomes are possible: either the agent one-boxes and receives $1,000,000, or the agent two-boxes and receives $1,000.
Plainly, $1 million is a greater sum than $1000, and the former outcome state is thus preferable to the latter. We require an algorithm that can search out and select this outcome based on general principles, in any Newcomblike game rather than based on special-case heuristics.
Whence, then, a causal explanation of what to do? The authors' intuition was sparked by a bit of reading about the famously "spooky" phenomenon of quantum entanglement, also sometimes theorized to involve retrocausality. Two particles interact and become entangled; from then on, their quantum states will remain correlated until measurement collapses the wave-function of one particle or the other. Neither party performing a measurement will ever be able to tell which measurement took place first in time, but both measurements will always yield correlated results. This occurs despite the fact that Bell-test experiments rule out local hidden variables, and even when special relativity's light-speed limit on the transmission of information prevents the entangled particles from "communicating" with each other. A paradox is apparent and most people find it scientifically unaesthetic.
In reality, there is no paradox at all. All that has happened is that the pair of particles are in quantum superposition together: their observables are mutually governed by a single joint probability distribution. The measured observable states do not go from "randomized" to "correlated" as the measurement is made. The measurement only "samples" a single classical outcome governing both particles from the joint probability distribution that is actually there. The joint probability distribution was actually caused by the 100% local and slower-than-light interaction that entangled the two particles in the first place.
Likewise for Newcomb's Problem in decision theory. As the theorists of precommitment had intuited, the outcome is not actually caused when the CDT agent believes itself to be making a decision. Instead, the outcome was caused when Omega measured the agent and predicted its choice ahead of time: the state of the agent at this time causes both Omega's prediction and the agent's eventual action.
We thus develop an intuition that, like a pair of entangled particles, the two correlated decision processes behind Omega's prediction and behind the agent's "real" choice are in some sense entangled: correlated due to a causal interaction in their mutual past. All we then require to win at Newcomb's Problem is a rigorous conception of such entanglement and a way of handling it algorithmically to make regret-minimizing decisions when entangled.
Formalized decision entanglement
Let us begin by assuming that an agent can be defined as a function from a set of Beliefs and a Decision to an Action. There will not be very much actual proof-code given here, and what is given was written in the Coq proof assistant. The proofs, short though they be, were thus mechanically checked before being given here; "do try this at home, kids."
We can then broaden and redefine our definition of decision entanglement as saying, essentially, "Two agents are entangled when either one of them would do what the other is doing, were they to trade places and thus beliefs but face equivalent decisions." More simply, if a certain two agents are entangled over a certain two equivalent decisions, any differences in what decisions they actually make arise from differences in beliefs.
This kind of entanglement can then, quite quickly, be shown to be an equivalence relation, thus partitioning the set of all logical nodes in a causal graph into Yudkowsky's "groups where every node in a group was the result of the same calculation", with these groups being equivalence classes.
entangled a a d d.
intros.
constructor.
intros. reflexivity. reflexivity.
entangled a1 a2 d1 d2 ->
entangled a2 a1 d2 d1.
intros.
constructor;
induction H;
apply e. apply e0.
entangled a1 a2 d1 d2 ->
entangled a2 a3 d2 d3 ->
entangled a1 a3 d1 d3.
intros a1 a2 a3 d1 d2 d3 H12 H23.
constructor;
intros b. rewrite e. rewrite e1.
reflexivity. reflexivity.
Actually proving that this relation holds simply consists of proving that two agents given equivalent decisions will always decide upon the same action (similar to proving program equilibrium) no matter what set of arbitrary beliefs is given them -- hence the usage of a second-order forall. Proving this does not require actually running the decision function of either agent. Instead, it requires demonstrating that the abstract-syntax trees of the two decision functions can be made to unify, up to the renaming of universally-quantified variables. This is what allows us to prove the entanglement relation's symmetry and transitivity: our assumptions give us rewritings known to hold over the universally-quantified agent functions and decisions, thus letting us employ unification as a proof tool without knowing what specific functions we might be handling.
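For readers who want something they can paste into Coq directly, here is a self-contained sketch of the three equivalence-relation properties. It is an illustration under simplifying assumptions, not the proof code discussed above: the relation is written as a plain `Prop` definition (two agents are entangled over decisions `d1` and `d2` when they return the same action for every belief state), and the theorem names are hypothetical.

```coq
Section Entanglement.
  (* Abstract types: nothing is assumed about their structure. *)
  Variables Beliefs Decision Action : Type.

  (* An agent maps beliefs and a decision problem to an action. *)
  Definition Agent : Type := Beliefs -> Decision -> Action.

  (* Assumed, simplified definition of entanglement. *)
  Definition entangled (a1 a2 : Agent) (d1 d2 : Decision) : Prop :=
    forall b : Beliefs, a1 b d1 = a2 b d2.

  Theorem entangled_refl :
    forall (a : Agent) (d : Decision), entangled a a d d.
  Proof.
    unfold entangled. intros a d b. reflexivity.
  Qed.

  Theorem entangled_sym :
    forall (a1 a2 : Agent) (d1 d2 : Decision),
      entangled a1 a2 d1 d2 -> entangled a2 a1 d2 d1.
  Proof.
    unfold entangled. intros a1 a2 d1 d2 H b.
    symmetry. apply H.
  Qed.

  Theorem entangled_trans :
    forall (a1 a2 a3 : Agent) (d1 d2 d3 : Decision),
      entangled a1 a2 d1 d2 ->
      entangled a2 a3 d2 d3 ->
      entangled a1 a3 d1 d3.
  Proof.
    unfold entangled. intros a1 a2 a3 d1 d2 d3 H12 H23 b.
    (* Rewrite through the middle agent, then close with the second hypothesis. *)
    rewrite (H12 b). apply (H23 b).
  Qed.
End Entanglement.
```

Note that none of the proofs ever evaluates an agent function: each step is a rewriting or a unification over universally quantified variables, which is the point made in the paragraph above.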
Thanks to employing the unification of syntax trees rather than the actual running of algorithms, we can conservatively extend Causal Decision Theory with logical nodes and entanglement to adequately handle timeless decision-making, without any recourse to retrocausality or to the potentially infinite loops of Sicilian Reasoning. (Potential applications of timeless decision-making to win at Ro Sham Bo remain an open matter for the imagination.)
Decision-theoretically, since our relation doesn't have to know anything about the given functions other than (forall (b: Beliefs), a1 b d = a2 b d), we can test whether our relationship holds over any two logical/algorithm nodes in an arbitrary causal graph, since all such nodes can be written as functions from their causal inputs to their logical output. We thus do not need a particular conception of what constitutes an "agent" in order to make decisions rigorously: we only need to know what decision we are making, and where in a given causal graph we are making it. From there, we can use simple (though inefficient) pairwise testing to find the equivalence class of all logical nodes in the causal graph equivalent to our decision node, and then select a utility-maximizing output for each of those nodes using the logic of ordinary Causal Decision Theory.
The slogan of a Causal Decision Theory with Entanglement (CDT+E) can then be summed up as, "select the decision which maximizes utility for the equivalence class of nodes to which I belong, with all of us acting and exerting our causal effects in concert, across space and time (but subject to our respective belief structures)."
The performance of CDT with entanglement on common problems
While we have not yet actually programmed a software agent with a CDT+E decision algorithm over Bayesian causal graphs (any readers who can point us to a corpus of preexisting source code for building, testing, and reasoning about decision-theory algorithms will be much appreciated, as we can then replace this wordy section with a formal evaluation), we can provide informal but still somewhat rigorous explanations of what it should do on several popular problems and why.
First, the simplest case: when a CDT+E agent is placed into Newcomb's Problem, provided that the causal graph expresses the "agenty-ness" of whatever code Omega runs to predict our agent's actions, both versions of the agent (the "simulated" and the "real") will look at the causal graph they are given, detect their entanglement with each other via pairwise checking and proof-searching (which may take large amounts of computational power), and subsequently restrict their decision-making to choose the best outcome over worlds where they both make the same decision. This will lead the CDT+E agent to take only the opaque box (one-boxing) and win $1,000,000. This is the same behavior for the same reasons as is obtained with Timeless Decision Theory, but with less human intervention in the reasoning process.
Provided that the CDT+E agent maintains some model of past events in its causal network, the Parfit’s Hitchhiker Problem trivially falls to the same reasoning as found in the original Newcomb’s Problem.
Furthermore, two CDT+E agents placed into the one-shot Prisoners' Dilemma and given knowledge of each other's algorithms as embodied logical nodes in the two causal graphs will notice that they are entangled, choose the most preferable action over worlds in which both agents choose identically, and thus choose to cooperate. Should a CDT+E agent playing the one-shot Prisoners' Dilemma against an arbitrary agent with potentially non-identical code fail to prove entanglement with its opponent (fail to prove that its opponent's decisions mirror its own, up to differences in beliefs), it will refuse to trust its opponent and defect. A better agent for the Prisoners' Dilemma would in fact demand from itself a proof that either it is or is not entangled with its opponent, and would be able to reason specifically about worlds in which the decisions made by two nodes cannot be the same. Doing so requires the Principle of the Excluded Middle, an axiom not normally used in the constructive logic of automated theorem-proving systems.
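The contrast between ordinary CDT and the entangled case in the Prisoners' Dilemma can be sketched as follows. The payoff numbers (3, 0, 5, 1) are conventional illustrative values rather than anything fixed by the text, and the names `Move` and `utility` are hypothetical.

```coq
Require Import Lia.

Inductive Move : Type := Cooperate | Defect.

(* Utility to the first player (higher is better); illustrative values. *)
Definition utility (mine theirs : Move) : nat :=
  match mine, theirs with
  | Cooperate, Cooperate => 3
  | Cooperate, Defect    => 0
  | Defect,    Cooperate => 5
  | Defect,    Defect    => 1
  end.

(* Ordinary CDT: with the opponent's move held fixed, defection dominates. *)
Theorem defect_dominates :
  forall theirs : Move, utility Cooperate theirs < utility Defect theirs.
Proof.
  intros theirs. destruct theirs; simpl; lia.
Qed.

(* Entangled case: both nodes output the same move, so only the
   diagonal is considered, and mutual cooperation is the best entry. *)
Theorem cooperate_best_when_entangled :
  forall m : Move, utility m m <= utility Cooperate Cooperate.
Proof.
  intros m. destruct m; simpl; lia.
Qed.
```

Both theorems hold at once; which one governs the agent's choice depends only on whether entanglement with the opponent has been proven.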
Lastly, different versions of CDT+E yield interestingly different results in the Counterfactual Mugging Problem. Let us assume that the causal graph given to the agent contains three logical nodes: the actual agent making its choice to pay Omega $100, Omega's prediction of what the agent will do in this case, and Omega's imagination of the agent receiving $1,000 had the coin come up the other way. The version of the entanglement relation here quantifies over decisions themselves at the first-order level, and thus the two versions of the agent who are dealing with the prospect of giving Omega $100 will become entangled. Despite being entangled, they will see no situation of any benefit to themselves, and will refuse to pay Omega the money. However, consider a stricter definition of entanglement, which we will call strong entanglement.
This definition says that two agents are strongly entangled when they yield the same decisions for every possible pair of beliefs and decision problem that can be given to them. This continues to match our original intuition regarding decision entanglement: that we are dealing with the same algorithm (agent), with the same values, being instantiated at multiple locations in time and space. It is somewhat stronger than the reasoning behind Timeless Decision Theory: it can recognize two instantiations of the same agent that face two different decisions, and enable them to reason that they are entangled with each other.
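One way to write this stricter relation down, offered as a sketch under assumptions rather than as a definitive formulation, is to quantify over decisions as well as beliefs. The name `strongly_entangled` and the theorem names are hypothetical.

```coq
Section StrongEntanglement.
  Variables Beliefs Decision Action : Type.

  Definition Agent : Type := Beliefs -> Decision -> Action.

  (* Assumed definition: the two agents agree on every
     (beliefs, decision problem) pair. *)
  Definition strongly_entangled (a1 a2 : Agent) : Prop :=
    forall (b : Beliefs) (d : Decision), a1 b d = a2 b d.

  Theorem strong_refl :
    forall a : Agent, strongly_entangled a a.
  Proof.
    unfold strongly_entangled. intros a b d. reflexivity.
  Qed.

  Theorem strong_sym :
    forall a1 a2 : Agent,
      strongly_entangled a1 a2 -> strongly_entangled a2 a1.
  Proof.
    unfold strongly_entangled. intros a1 a2 H b d.
    symmetry. apply H.
  Qed.

  Theorem strong_trans :
    forall a1 a2 a3 : Agent,
      strongly_entangled a1 a2 ->
      strongly_entangled a2 a3 ->
      strongly_entangled a1 a3.
  Proof.
    unfold strongly_entangled. intros a1 a2 a3 H12 H23 b d.
    rewrite (H12 b d). apply (H23 b d).
  Qed.
End StrongEntanglement.
```

Because the relation no longer carries decision arguments, two instantiations of one agent remain related even when they face different decision problems.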
Under this stronger version of the entanglement relation (whose proofs for being an equivalence relation are somewhat simpler, by the way), a CDT+E agent given the Counterfactual Mugging will recognize itself as entangled not only with the predicted factual version of itself that might give Omega $100, but also with the predicted counterfactual version of itself that receives $1,000 on the alternate coin flip. Each instance of the agent then independently computes the same appropriate tuple of output actions to maximize profit across the entire equivalence class (namely: predicted-factual gives $100, real-factual gives $100, predicted-counterfactual receives $1,000).
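As a worked illustration of why paying maximizes profit across the equivalence class, assume a fair coin (the text above does not fix the probability) and assume that Omega pays out the $1,000 only if the class's shared policy is to pay. The expected value of each policy, evaluated before the coin is flipped, is then:

```latex
\begin{align*}
\mathbb{E}[U \mid \text{pay}]    &= \tfrac{1}{2}(-100) + \tfrac{1}{2}(1000) = 450, \\
\mathbb{E}[U \mid \text{refuse}] &= \tfrac{1}{2}(0) + \tfrac{1}{2}(0) = 0.
\end{align*}
```

Under these assumptions the paying policy comes out ahead, even though the real-factual instance, considered alone, simply loses $100.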
Switching entirely to the stronger version of entanglement would cause a CDT+E agent to lose certain games requiring cooperation with other agents that are even trivially different (for instance, if one agent likes chocolate and the other hates it, they are not strongly entangled). These games remain winnable with the weaker, original form of entanglement.
Future research
Future research could represent the probabilistic possibility of entanglement within a causal graph by writing down multiple parallel logical/algorithm nodes as children of the same parent, each of which exists and acts with a probability conditional on the outcome of the parent node. A proof engine extended with probabilities over logical sentences (which, to the authors' knowledge, is not yet accomplished for second-order constructive logics of the kind used here) could also begin to assign probabilities to entanglement between logical/algorithm nodes. These probabilistic beliefs can then integrate into the action-selection algorithm of Causal Decision Theory just like any other probabilistic beliefs; the case of pure logic and pure proof from axioms merely constitutes assigning a degenerate probability of 1.0 to some belief.
Previous researchers have noted that decision-making over probabilistic acausal entanglement with other agents can be used to represent the notion of "universalizability" from Kantian deontological ethics. We note that entanglements with decision nodes in the past and future of a single given agent actually lead to behavior not unlike a "virtue ethics" (that is, the agent will start trying to enforce desirable properties up and down its own life history). When we begin to employ probabilities on entanglement, the Kantian and virtue-ethical strategies will become more or less decision-theoretically dominant based on the confidence with which CDT+E agents believe they are entangled with other agents or with their past and future selves.
Acausal trade/cooperation with agents other than the given CDT+E agent itself can also be considered, at least under the weaker definition of entanglement. In such cases, seemingly undesirable behaviors such as subjection to acausal versions of Pascal's Mugging could appear. However, entanglements (whether Boolean, constructive, or probabilistically believed-in) occur between logical/decision nodes in the causal graph, which are linked by edges denoting conditional probabilities. Each CDT+E agent will thus weight the other in accordance with its beliefs about the probability mass of the causal link from one to the other, so that acausal Muggings have the same impact on decision-making as normal ones.
The discovery that games can have different outcomes under different versions of entanglement leads us to believe that our current concept of entanglement between agents and decisions is incomplete. We believe it is possible to build a form of entanglement that will pay Omega in the Counterfactual Mugging without trivially losing at the Prisoners’ Dilemma (as strong entanglement can), but our current attempts to do so sacrifice the transitivity of entanglement. We do not yet know if there are any game-theoretic losses inherent in that sacrifice. Still, we hope that further development of the entanglement concept can lead to a decision theory that will more fully reflect the "timeless" decision-making intuition of retrospectively detecting rational precommitments and acting according to them in the present.
CDT+E opens up room for a fully formal and algorithmic treatment of the "timeless" decision-making processes proposed by Yudkowsky, including acausal "communication" (regarding symmetry or nonsymmetry) and acausal trade in general. However, like the original Timeless Decision Theory, it still does not actually have an algorithmic process for placing the logical/decision nodes into the causal graph -- only for dividing the set of all such nodes into equivalence classes based on decision entanglement. Were such an algorithmic process to be found, it could be used by an agent to locate itself within its model of the world via the stronger definition of entanglement. This could potentially reduce the problem of naturalizing induction to the subproblems of building a causal model that contains logical or algorithmic nodes, locating the node in the present model whose decisions are strongly entangled with those of the agent, and then proceeding to engage in "virtue ethical" planning for near-future probabilistically strongly-entangled versions of the agent's logical node up to the agent's planning horizon.
Acknowledgements
The authors would like to thank Joshua and Benjamin Fox for their enlightening lectures on Updateless Decision Theory, and additionally to thank Benjamin Fox in particular for his abundant knowledge, deep intuition, and clear guidance regarding acausal decision-making methods that actually win. Both Benjamin Fox and David Steinberg have our thanks for their initial review and for help in clarifying the text.