Stuart_Armstrong comments on Papers framing anthropic questions as decision problems? - Less Wrong

3 Post author: jsalvatier 26 April 2012 12:40AM

Comments (36)

Comment author: Stuart_Armstrong 26 April 2012 10:01:05AM 3 points

Many thanks for your comments; it's nice to have someone engaging with it.

That said, I have to disagree with you! You're right, the whole point was to avoid using "anthropic probabilities" (though not "subjective probabilities" in general; I may have misused the word "objective" in that context). But the terms "altruistic", "selfish" and so on, do correspond to actual utility functions.

"Selfless" means your utility function has no hedonistic content, just an arbitrary utility function over world states that doesn't care about your own identity. "Altruistic" means your utility function is composed of some amalgam of the hedonistic utilities of you and others. And "selfish" means your utility function is equal to your own hedonistic utility.

The full picture is more complex - as always - and in some contexts it would be best to say "non-indexical" for selfless and "indexical" for selfish. Be that as it may, these are honest, actual utility functions that you are trying to maximise, not "A"'s over utility functions. Some might be "A"'s over hedonistic utility functions, but they are still genuine utilities: I am only altruistic if I actually want other people to be happy (or achieve their goals or similar); their happiness is a term in my utility function.

Then ADT can be described (for the non-selfish agents) colloquially as "before the universe was created, if I wanted to maximise U, what decision theory would I want any U-maximiser to follow? (i.e. what decision theory maximises the expected U in this context)". So ADT is doubly a utility-maximising theory: first pick the utility-maximising decision theory, then the agents with it will try to maximise utility in accordance with the theory they have.

(For selfish/indexical agents it's a bit more tricky and I have to use symmetry or "veil of ignorance" arguments; we can get back to that.)

Furthermore, ADT can deal perfectly well with any other type of uncertainty - such as whether you are or aren't in a Sleeping Beauty problem, or when you have partial evidence that it's Monday, or whatever. There's no need for it to restrict to the simple cases. Admittedly, for the presumptuous philosopher I restricted to a simple case, with simple binary altruistic/selfish utilities, but that was for illustrative purposes. Come up with a more complex problem, with more complicated utilities, and ADT will give a correspondingly more complex answer.

Comment author: Manfred 26 April 2012 11:06:35AM * 1 point

Okay, let's look at the "selfish" anthropic preference laid out in your paper, in two different problems.

In both of these problems there are two worlds, "H" and "T," which have equal "no anthropics" probabilities of 0.5. There are two people you could be in T and one person you could be in H. Standard Sleeping Beauty so far.

However, because I like comparing things to utility, I'm going to specify two sets of probabilities. In Problem 1, the probability of being each person is 1/3. In Problem 2, the probability of being the person in H is 1/2, while the probability of being a person in T is 1/4 each.

(These probabilities can be manipulated by giving people evidence of which world they are in - for example, you could spontaneously stop some H or T sessions of the experiment, so the people in the experiment can condition on the experiment not being stopped. The point is that both problems are entirely possible.)

And let's have winning each bet give 1 hedonic utilon (you get to eat a candybar).

ADT makes identical choices in both problems, because it just interacts with the "if there were no anthropics" probabilities. The selfish preference just says to give each world the average utility of all its inhabitants. To calculate the expected-utility-ish thing for betting on Tails, we take the average bet won in the winning side (1 candybar) and multiply it by the probability of the winning side if there were no anthropics (0.5). So our "selfish" ADT agent pays up to 0.5 candybars for a bet on Tails in both problems.

Now, what are the expected hedonic utilities? (in units of candybars, of course)

In problem 1, the probability of being a winner is 2/3, so a utility maximizer pays up to 2/3 of a candybar to bet on Tails.

In problem 2, the probability of being a winner is 1/2, so a utility maximizer pays up to 1/2 of a candybar to bet on Tails.

So in problem 2, the "selfish" ADT agent and the utility maximizer do the same thing. This looks like a good example of selfishness. But what's going on in problem 1? Even though the utilon is entirely candybar-based, the "selfish" anthropic preference seems to undervalue it. What anthropic preference would maximize expected hedonic utility?

Well, if you added up all the utilities in each world, rather than averaging, then an ADT agent would do the same thing as a utility maximizer in problem 1. But now in problem 2, this "total utility" ADT agent would overestimate the value of a candybar, relative to maximum expected utility.

There are in fact no ADT preferences that maximize candybars in both problems. There is no analogue of utility maximization, which makes sense because ADT doesn't deal with the subjective probabilities and expected utility does.
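The arithmetic in this comment can be checked with a short Python sketch. It assumes a bet that costs x candybars and pays 1 candybar on Tails, and it reads "averaging" as counting each world once and "adding up" as counting each world once per inhabitant. The function names and this framing of the bet are illustrative assumptions, not taken from the ADT paper.

```python
from fractions import Fraction as F

def adt_break_even(p_tails, n_heads, n_tails, aggregate):
    """Highest price x (in candybars) worth paying for a bet that pays 1 on Tails,
    using only the "no anthropics" coin probability and a per-world aggregate."""
    p_heads = 1 - p_tails
    if aggregate == "average":      # each world counts once
        w_heads, w_tails = 1, 1
    elif aggregate == "total":      # each world counts once per inhabitant
        w_heads, w_tails = n_heads, n_tails
    else:
        raise ValueError(aggregate)
    # Expected value of buying at price x:
    #   p_tails * w_tails * (1 - x) - p_heads * w_heads * x
    # which is zero at the price returned here.
    return p_tails * w_tails / (p_tails * w_tails + p_heads * w_heads)

def subjective_break_even(p_each_tails_person, n_tails):
    """Break-even price for an expected-utility maximiser: P(I am in the Tails world)."""
    return p_each_tails_person * n_tails

coin = F(1, 2)                                   # "no anthropics" probability of Tails
average = adt_break_even(coin, 1, 2, "average")  # "selfish" preference as described above
total = adt_break_even(coin, 1, 2, "total")      # "total utility" preference
problem_1 = subjective_break_even(F(1, 3), 2)    # each person has probability 1/3
problem_2 = subjective_break_even(F(1, 4), 2)    # Tails people have probability 1/4 each
print(average, total, problem_1, problem_2)
```

Under these assumptions the averaging rule gives 1/2 and the totalling rule gives 2/3 in both problems, while the subjective break-even price is 2/3 in problem 1 and 1/2 in problem 2, which is the mismatch described above.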

Comment author: Stuart_Armstrong 26 April 2012 12:49:49PM * 1 point

However, because I like comparing things to utility, I'm going to specify two sets of probabilities. In Problem 1, the probability of being each person is 1/3. In Problem 2, the probability of being the person in H is 1/2, while the probability of being a person in T is 1/4 each.

To translate this into ADT terms: in problem 2, the coin is fair; in problem 1, the coin is (1/3, 2/3) on (H, T) (or maybe the coin was fair, but we got extra info that pushed the posterior odds to (1/3, 2/3)).

Then ADT (and SSA) says that selfish agents should bet up to 2/3 of a candybar on Tails in problem 1, and 1/2 in problem 2. Exactly the same as what you were saying. I don't understand why you think that ADT would make identical choices in both problems.

Comment author: Manfred 26 April 2012 04:59:03PM 0 points

Exactly the same as what you were saying

The reason that's "exactly as I was saying" is that you adjusted a free parameter to fit the problem, after you learned the subjective probabilities. The free parameter was which world to regard as "normal" and which one to apply a correction to. If you already know that the (1/2, 1/4, 1/4) problem is the "normal" one, then you have already solved the probability problem and should just maximize expected utility.

Comment author: Stuart_Armstrong 26 April 2012 05:31:14PM * 0 points

Er no - you gave me an underspecified problem. You told me the agents were selfish (good), but then just gave me anthropic probabilities, without giving me the non-anthropic probabilities. I assumed you were meaning to use SSA, and worked back from there. This may have been incorrect - were you assuming SIA? In that case the coin odds are (1/2,1/2) and (2/3,1/3), and ADT would reach different conclusions. But only because the problem was underspecified (giving anthropic probabilities without explaining the theory that goes with them is not specifying the problem).
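A small Python sketch of the back-translation described here, from per-person anthropic probabilities to coin odds. It assumes one person in H and two in T, and that everyone within a world has the same probability. The function names are hypothetical, and the two rules are simplified readings of SSA and SIA for this two-world setup only.

```python
from fractions import Fraction as F

def coin_odds_ssa(p_heads_person, p_tails_person, n_heads=1, n_tails=2):
    """SSA reading: a world's probability is split among its inhabitants,
    so the world's probability is the sum over the people in it."""
    h = p_heads_person * n_heads
    t = p_tails_person * n_tails
    return h / (h + t), t / (h + t)

def coin_odds_sia(p_heads_person, p_tails_person):
    """SIA reading: each person's probability is proportional to the probability
    of their world, so the coin odds are proportional to per-person probabilities."""
    s = p_heads_person + p_tails_person
    return p_heads_person / s, p_tails_person / s

# Problem 1: each person has probability 1/3.
ssa_1 = coin_odds_ssa(F(1, 3), F(1, 3))
sia_1 = coin_odds_sia(F(1, 3), F(1, 3))
# Problem 2: the H person has 1/2, each T person has 1/4.
ssa_2 = coin_odds_ssa(F(1, 2), F(1, 4))
sia_2 = coin_odds_sia(F(1, 2), F(1, 4))
print(ssa_1, ssa_2, sia_1, sia_2)
```

On these assumptions SSA gives coin odds of (1/3, 2/3) for problem 1 and (1/2, 1/2) for problem 2, while SIA gives (1/2, 1/2) and (2/3, 1/3), matching the figures quoted in the thread.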

As long as you give a full specification of the problem, ADT doesn't have an issue. You don't need to adjust free parameters or anything.

I feel like I'm missing something here. Can you explain the hole in ADT you seem to find so glaring?

Comment author: Manfred 26 April 2012 06:21:56PM * 0 points

You told me the agents were selfish (good), but then just gave me anthropic probabilities, without giving me the non-anthropic probabilities.

I intended "In both of these problems there are two worlds, "H" and "T," which have equal "no anthropics" probabilities of 0.5."

In retrospect, my example of evidence (stopping some of the experiments) wasn't actually what I wanted, since an outside observer would notice it. In order to mess with anthropic probabilities in isolation you'd need to change the structure of coinflips and people-creation.

Comment author: Stuart_Armstrong 27 April 2012 12:13:55PM 0 points

In order to mess with anthropic probabilities in isolation you'd need to change the structure of coinflips and people-creation

But you can't mess with the probabilities in isolation. Suppose I were an SIA agent, for instance; then you can't change my anthropic probabilities without changing non-anthropic facts about the world.

Comment author: Manfred 27 April 2012 02:19:01PM 0 points

I'm uncertain whether what you're saying is relevant. The question at hand is, is there some change to a problem that changes anthropic probabilities, but is guaranteed not to change ADT decisions? Such a change would have to conserve the number of worlds, the number of people in each world, the possible utilities, and the "no anthropics" probabilities.

For example, if my anthropic knowledge says that I'm an agent at a specific point in time, a change in how long Sleeping Beauty stays awake in different "worlds" will change how likely I am to find myself there overall.

Comment author: Stuart_Armstrong 27 April 2012 04:31:09PM * 0 points

The question at hand is, is there some change to a problem that changes anthropic probabilities, but is guaranteed not to change ADT decisions?

Is there? It would require some sort of evidence that would change your own anthropic probabilities, but that would not change the opinion of any outside observer if they saw it.

For example, if my anthropic knowledge says that I'm an agent at a specific point in time, a change in how long Sleeping Beauty stays awake in different "worlds" will change how likely I am to find myself there overall.

Doesn't feel like that would work... if you remember how long you've been awake, that makes you into slightly different agents, and if the duration of the awakening gives you any extra info, it would show up in ADT too. And if you forget how long you're awake, that's just sleeping beauty with more awakenings...

Define "individual impact" as the belief that your own actions have no correlations with those of your copies (the belief your decisions control all your copies is "total impact"). Then ADT basically has the following equivalences:

  • ADT + selfless or total utilitarian = SIA + individual impact (= SSA + total impact)
  • ADT + average utilitarian = SSA + individual impact
  • ADT + selfish = SSA + individual impact + complications (e.g. with precommitments)

If those equivalences are true, it seems that we cannot vary the anthropic probabilities without varying the ADT decision.

Comment author: Manfred 28 April 2012 09:25:24PM * 0 points

EDIT: Expanded first point a bit.

if you remember how long you've been awake, that makes you into slightly different agents, and if the duration of the awakening gives you any extra info, it would show up in ADT too.

Hm. One could try to fix it by splitting each point in time into different "worlds," like you suggest below. But the updating from time (let's assume there's no clock to look at, so the curves are smooth) would rely on the subjective probabilities, which you are avoiding. The update ratio is P(feels like 4 hours | heads) / P(feels like 4 hours). If P(feels like 4 hours | X) is 0.9 for X = heads and 0.8 for X = tails, then with probabilities of 1/3 for each person the ratio is 1.08, while with probabilities 1/2, 1/4, 1/4 the ratio is 1.059.
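The two ratios can be reproduced with a few lines of Python. This is just Bayes' rule applied to the numbers above; `update_ratio` and its parameter names are made up for the illustration, and `prior_heads` is the subjective probability of being the person in the heads world (1/3 in the first case, 1/2 in the second).

```python
def update_ratio(prior_heads, like_heads=0.9, like_tails=0.8):
    """P(feels like 4 hours | heads) / P(feels like 4 hours)."""
    # Total probability of the evidence, mixing over heads and tails.
    p_evidence = prior_heads * like_heads + (1 - prior_heads) * like_tails
    return like_heads / p_evidence

ratio_thirds = update_ratio(1 / 3)   # probabilities 1/3, 1/3, 1/3
ratio_halves = update_ratio(1 / 2)   # probabilities 1/2, 1/4, 1/4
print(round(ratio_thirds, 3), round(ratio_halves, 3))
```

The first case gives 0.9 / 0.8333... = 1.08 and the second 0.9 / 0.85, which rounds to 1.059, so the size of the update does depend on which set of subjective probabilities is used.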

This does lead to a case a bit more complicated than my original examples, though, because the people in different worlds will make different decisions. I'm not even sure how ADT would handle this situation, since it has to avoid the subjective probabilities - do you respond like an outside observer, and use 0.5, 0.5 for everything?

And if you forget how long you're awake, that's just sleeping beauty with more awakenings...

Yes, that would be reasonable.

Then ADT basically has the following equivalences:

Those only hold if things are simple. To say "these might prevent things from getting any more complicated" is to put the cart before the horse.