What makes counterfactuals comparable?

Chris_Leong


by Chris_Leong

24th Apr 2020

AI Alignment Forum


I was attempting to write a reference post on the concept of comparability in decision theory problems, but I realised that I don't yet have a strong enough grasp of the various positions one could adopt to write a post worthy of being a reference. I'll quote my draft quite liberally below:

In the context of decision theory, comparability is about whether or not it is fair to compare counterfactuals when evaluating decisions, a decision algorithm or a decision theory. Perhaps the best way to illustrate this is with the example of a medical trial. Let's suppose we're trying to see if aspirin reduces the amount of pain experienced. If we create two groups, give aspirin to one and then observe that group experiencing less pain, that is evidence that aspirin does what we want. However, if the aspirin group was healthy and most of the other group had cancer, then this wouldn't be a fair test. We would be treating two groups as comparable when they differed in an attribute relevant to the outcome we cared about.
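To make the aspirin example concrete, here is a minimal sketch in Python. Every number in it is invented for illustration, and the record layout and the `mean_pain` helper are hypothetical. It contrasts the naive comparison of the two arms with a comparison made within each health stratum.

```python
from statistics import mean

# Hypothetical pain scores on a 0-10 scale; all values are made up.
# Each record is (took_aspirin, has_cancer, pain_score).
patients = [
    (True, False, 1), (True, False, 2), (True, False, 3), (True, True, 8),
    (False, False, 2), (False, True, 7), (False, True, 8), (False, True, 9),
]

def mean_pain(records, aspirin, cancer=None):
    """Average pain in one treatment arm, optionally within one health stratum."""
    scores = [
        pain for took, sick, pain in records
        if took == aspirin and (cancer is None or sick == cancer)
    ]
    return mean(scores)

# Naive comparison: ignores that the aspirin arm is mostly healthy
# and the other arm is mostly cancer patients.
naive_effect = mean_pain(patients, True) - mean_pain(patients, False)

# Stratified comparison: compare like with like, then average over strata.
stratified_effect = mean(
    mean_pain(patients, True, cancer) - mean_pain(patients, False, cancer)
    for cancer in (False, True)
)

print(naive_effect)       # apparent pain reduction from the unfair comparison
print(stratified_effect)  # effect once the groups being compared are alike
```

In this made-up data the naive comparison suggests aspirin cuts pain by three points, while the within-stratum comparison shows no effect at all. The whole apparent benefit comes from treating non-comparable groups as comparable.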

Given a decision problem, we normally apply a decision theory to construct counterfactuals, then calculate the utility of each and finally make a decision. Prima facie, it appears that these counterfactuals must be comparable in order for a decision theory to count as reasonable. Otherwise, we would open ourselves to the same critique as the naive researcher in the aspirin example.

We can clarify this with an example. Causal decision theorists recommend 2-boxing for Newcomb's Problem. They admit that a 1-boxer will receive an extra $1 million, but they would likely argue that this isn't a fair comparison since the opaque box contains $1 million for the 1-boxer, but not for the 2-boxer. This is essentially a dispute over whether these counterfactuals are comparable.
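The dispute can be sketched numerically. The following Python snippet is only an illustration under stated assumptions: the $1,000 in the transparent box is the conventional figure and does not appear in the text above, the predictor accuracy is a free parameter, and the function names are hypothetical. It computes the value of each action two ways: once letting the opaque box's contents vary with the action, and once holding the contents fixed across actions.

```python
# Conventional Newcomb payoffs. The transparent-box amount is an assumption.
OPAQUE_PRIZE = 1_000_000
TRANSPARENT_PRIZE = 1_000

def payoff(action, opaque_full):
    """Money received for an action given what the opaque box contains."""
    total = OPAQUE_PRIZE if opaque_full else 0
    if action == "two-box":
        total += TRANSPARENT_PRIZE
    return total

def value_contents_vary(action, accuracy):
    """Box contents track the action, as in the comparison the 1-boxer makes."""
    p_full = accuracy if action == "one-box" else 1 - accuracy
    return p_full * payoff(action, True) + (1 - p_full) * payoff(action, False)

def value_contents_fixed(action, p_full):
    """Box contents are the same whichever action is taken, as the causal
    decision theorist insists a fair comparison requires."""
    return p_full * payoff(action, True) + (1 - p_full) * payoff(action, False)

# With a perfect predictor, letting the contents vary favours 1-boxing.
print(value_contents_vary("one-box", 1.0), value_contents_vary("two-box", 1.0))

# Holding the contents fixed, 2-boxing is ahead by the transparent prize
# no matter how likely the opaque box is to be full.
for p_full in (0.0, 0.5, 1.0):
    gap = value_contents_fixed("two-box", p_full) - value_contents_fixed("one-box", p_full)
    print(p_full, gap)
```

Both calculations use the same payoff table. They disagree only about which pairs of scenarios count as a fair comparison, which is exactly the question of comparability.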

Why is this important?

Well, as far as I can tell, attempts to understand counterfactuals have led us to logical counterfactuals, at which point we've become stuck. Asking an easier question could help us to become unstuck, and determining whether counterfactuals are comparable seems easier than saying what counterfactuals are. Indeed, I would go so far as to say that if we don't know what we mean by comparability, then we don't even fully know what we are looking for.

Given this, I find it strange that, as far as I can tell, this notion hasn't been discussed to any significant degree on Less Wrong, although I haven't performed an in-depth search.

But before we go any further, it's worth asking what objections could be made to this approach. I'll quote my draft again:

Firstly, comparability only makes sense if the notion of counterfactuals makes sense. If they don't exist, then we would have to abandon the quest.

Secondly, we could admit counterfactuals, but deny comparability. Why might we do this? One reason is that the requirement for the past to be comparable regardless of our action seems to assume that our action shouldn't affect the past. But if there wasn't any fundamental difference between forwards and backwards causation, this assumption would seem unsupported. Another reason is that we might think that only the partial information we have before we are told our action must be the same, and that there is no requirement for the counterfactuals to actually be comparable. Evidential decision theory could be justified on these terms.

Thirdly, we could argue for a notion of comparability that can be trivially satisfied. Causal decision theory leaves the past unchanged and only intervenes at the point of the decision. So these kinds of counterfactuals are always trivially comparable, as it seems reasonable to presume that comparability only depends on the past and that identical pasts are automatically comparable. Note that it might be possible to argue for causal decision theory within the comparability framework. If only exactly identical pasts were comparable, then that'd exclude almost every theory except for CDT.

Fourthly, we could argue that there are many different notions of comparability, so the question, "What does it mean for counterfactuals to be comparable?" is meaningless without further information about the purpose for which we are asking.
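The trivially satisfiable notion from the third objection can be written down as a simple predicate. This is a rough sketch rather than a proposed formalism: the `Counterfactual` record, the encoding of the past as a tuple, and the function name are all hypothetical, and the $1,000 transparent-box amount is the conventional figure rather than something stated in this post.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Counterfactual:
    past: tuple      # hypothetical encoding of everything settled before the decision
    action: str
    utility: int

def comparable_by_identical_past(a, b):
    """Two counterfactuals count as comparable exactly when their pasts match."""
    return a.past == b.past

# CDT-style counterfactuals for Newcomb's Problem: only the action is changed.
cdt_pair = (
    Counterfactual(past=(("opaque_box", "full"),), action="one-box", utility=1_000_000),
    Counterfactual(past=(("opaque_box", "full"),), action="two-box", utility=1_001_000),
)

# 1-boxer-style counterfactuals: the prediction, and so the box, varies with the action.
one_boxer_pair = (
    Counterfactual(past=(("opaque_box", "full"),), action="one-box", utility=1_000_000),
    Counterfactual(past=(("opaque_box", "empty"),), action="two-box", utility=1_000),
)

print(comparable_by_identical_past(*cdt_pair))        # identical pasts
print(comparable_by_identical_past(*one_boxer_pair))  # pasts differ in the box contents
```

Under this predicate the CDT pair passes automatically and the 1-boxer pair fails, which is why adopting it would exclude almost every theory except CDT. The fourth objection amounts to saying that other predicates could be substituted here depending on the purpose.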

Three questions:

I'll finish this post with three questions designed to help clarify the notion of comparability. If you have time, I'd really appreciate it if you thought about the questions before writing your own answers, as that'd likely increase the diversity of responses.

1) Suppose you have the option to choose one of two boxes: the first containing an item worth 5 utility and the second containing an item worth 10 utility. This results in one counterfactual where you take the first box and receive 5 utility and another where you take the second and receive 10 utility. Almost no-one would dispute that these counterfactuals are comparable, but why?

2) As discussed above, a causal decision theorist would likely argue that the counterfactuals constructed by a timeless decision theorist aren't comparable because the 1-boxer has $1 million in the mystery box, while the 2-boxer's box is empty. Most people on Less Wrong think that the causal decision theorist is wrong. How can we respond to this claim? Do the timeless decision theorist's counterfactuals satisfy another notion of comparability, or does this show that the notion of comparability is irrelevant?

3) An evidential decision theorist wouldn't smoke in the Smoking Lesion problem, in the hope of avoiding cancer. Most people argue that they are incorrect because, when evaluating smoking, we can't compare a group of people predisposed to cancer to a normal group of people. Is this correct? And if so, wouldn't this mean that a 2-boxer would be correct in rejecting timeless decision theory counterfactuals as non-comparable (contrary to LW wisdom)? (There have been criticisms of the Smoking Lesion problem, but I think we could just make the same argument with Counterfactual Blackmail instead.)

This post was supported by the AI Safety Research Program and was influenced by discussion with Davide Zagami and Pablo Moreno, although the opinions expressed here are my own. It is an extension of work performed at the EA Hotel.