
Comment author: Manfred 24 October 2014 07:03:09PM *  2 points [-]

still a big question what they argue

To be blunt, this is a question you can solve. Since it's a non-anthropic problem (though there is some danger in Beluga's analysis), vanilla UDT is all that's needed.

we still don't have evidence that the humans should follow them

The evidence goes as follows: The gnomes are in the same situation as the humans, with the same options and the same payoffs. Although they started with different information than the humans (especially since the humans didn't exist yet), by the time they have to make the decision they have the same probabilities for payoffs given actions (although there's a deeper point here that could bear elaboration). Therefore the right decision for the gnome is also the right decision for the human.

This sounds an awful lot like an isomorphism argument to me... What sort of standard of evidence would you say is appropriate for an isomorphism argument?

Comment author: Stuart_Armstrong 24 October 2014 07:57:29PM 2 points [-]

I'm convinced that this issue goes much deeper than it first seemed... I'm putting stuff together, and I'll publish a post on it soon.

Comment author: lackofcheese 24 October 2014 02:38:41PM 2 points [-]

Yep, I think that's a good summary. UDT-like reasoning depends on the utility values of counterfactual worlds, not just real ones.

Comment author: Stuart_Armstrong 24 October 2014 03:02:47PM *  2 points [-]

I'm starting to think this is another version of the problem of personal identity... But I want to be thorough before posting anything more.

Comment author: lackofcheese 24 October 2014 11:16:01AM *  2 points [-]

I don't think that works, because 1) isn't actually satisfied. The selfish human in cell B is indifferent over worlds where that same human doesn't exist, but the gnome is not indifferent.

Consequently, I think that as one of the humans in your "closest human" case you shouldn't follow the gnome's advice, because the gnome's recommendation is being influenced by a priori possible worlds that you don't care about at all. This is the same reason a human with utility function T shouldn't follow the gnome recommendation of 4/5 from a gnome with utility function IT. Even though these recommendations are correct for the gnomes, they aren't correct for the humans.

As for the "same reasons" comment, I think that doesn't hold up either. The decisions in all of the cases are linked decisions, even in the simple case of U = S above. The difference in the S case is simply that the linked nature of the decision turns out to be irrelevant, because the other gnome's decision has no effect on the first gnome's utility. I would argue that the gnomes in all of the cases we've put forth have always had the "same reasons" in the sense that they've always been using the same decision algorithm, albeit with different utility functions.

Comment author: Stuart_Armstrong 24 October 2014 11:55:28AM 2 points [-]

Let's ditch the gnomes; they are contributing little to this argument.

My average ut=selfish argument was based on the fact that if you changed the utility of everyone who existed from one system to the other, then people's utilities would be the same, given that they existed.

The argument here is that if you changed the utility of everyone from one system to the other, then this would affect their counterfactual utility in the worlds where they don't exist.

That seems... interesting. I'll reflect further.

Comment author: Stuart_Armstrong 24 October 2014 11:41:18AM 1 point [-]

I think I'm starting to see the argument...

Comment author: lackofcheese 24 October 2014 02:40:38AM *  3 points [-]

Having established the nature of the different utility functions, it's pretty simple to show how the gnomes relate to these. The first key point to make, though, is that there are actually two distinct types of submissive gnomes and it's important not to confuse the two. This is part of the reason for the confusion over Beluga's post.
Submissive gnome: I adopt the utility function of any human in my cell, but am completely indifferent otherwise.
Pre-emptively submissive gnome: I adopt the utility function of any human in my cell; if there is no human in my cell I adopt the utility function they would have had if they were here.

The two are different precisely in the key case that Stuart mentioned---the case where there is no human at all in the gnome's cell. Fortunately, the utility function of the human who will be in the gnome's cell (which we'll call "cell B") is entirely well-defined, because any existing human in the same cell will always end up with the same utility function. The "would have had" case for the pre-emptively submissive gnomes is a little stranger, but it still makes sense---the gnome's utility would correspond to the anti-indexical component JU of the human's utility function U (which, for selfish humans, is just zero). Thus we can actually remove all of the dangling references in the gnome's utility function, as per the discussion between Stuart and Beluga. If U is the utility function the human in cell B has (or would have), then the submissive gnome's utility function is IU (note the indexicalisation!) whereas the pre-emptively submissive gnome's utility function is simply U.

Following Beluga's post here, we can use these ideas to translate all of the various utility functions to make them completely objective and observer-independent, although some of them reference cell B specifically. If we refer to the second cell as "cell C", swapping between the two gnomes is equivalent to swapping B and C. For further simplification, we use $B to refer to the number of dollars in cell B, and o(B) as an indicator function for whether cell B has a human in it. The simplified utility functions are thus
T = $B + $C
A = ($B + $C) / (o(B) + o(C))
S = IS = $B
IT = o(B) ($B + $C)
IA = o(B) ($B + $C) / (o(B) + o(C))
Z = - $C
H = $B - $C
IH = o(B) ($B - $C)
Note that T and A are the only functions that are invariant under swapping B and C.
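
For anyone who wants to poke at these, here is a rough Python encoding of the table above (the world encoding and parameter names are mine, not from the original posts); it also checks the invariance claim directly.

    # A world is (dB, dC, oB, oC): dollars in cells B and C, plus 0/1 indicators
    # for whether each cell contains a human.
    def T(dB, dC, oB, oC):  return dB + dC                      # total utilitarian
    def A(dB, dC, oB, oC):  return (dB + dC) / (oB + oC)        # average utilitarian
    def S(dB, dC, oB, oC):  return dB                           # selfish (= IS)
    def IT(dB, dC, oB, oC): return oB * (dB + dC)               # indexical total
    def IA(dB, dC, oB, oC): return oB * (dB + dC) / (oB + oC)   # indexical average
    def Z(dB, dC, oB, oC):  return -dC
    def H(dB, dC, oB, oC):  return dB - dC
    def IH(dB, dC, oB, oC): return oB * (dB - dC)

    def swap(f):
        # the same utility function with cells B and C exchanged
        return lambda dB, dC, oB, oC: f(dC, dB, oC, oB)

    # Test worlds, each with at least one human (so A and IA are defined):
    worlds = [(1.0, 0.0, 1, 0), (0.0, 1.0, 0, 1), (1.0, -0.5, 1, 1)]
    for name, f in [('T', T), ('A', A), ('S', S), ('IT', IT),
                    ('IA', IA), ('Z', Z), ('H', H), ('IH', IH)]:
        print(name, all(f(*w) == swap(f)(*w) for w in worlds))
    # Only T and A print True.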

This invariance means that, for both cases involving utilitarian humans and pre-emptively submissive gnomes, all of the gnomes (including the one in an empty cell) and all of the humans have the same utility function over all possible worlds. Moreover, all of the decisions are obviously linked, and so there is effectively only one decision. Consequently, it's quite trivial to solve with UDT. Total utilitarianism gives
E[T] = 0.5(-x) + 2*0.5(1-x) = 1-1.5x
with breakeven at x = 2/3, and average utilitarianism gives
E[A] = 0.5(-x) + 0.5(1-x) = 0.5-x
with breakeven at x = 1/2.
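
As a quick sanity check on those two breakeven points (a Python sketch; the helper just finds the root of the linear expected utility):

    from fractions import Fraction as F

    def breakeven(eu):
        # root of a linear expected-utility function eu(x)
        intercept, slope = eu(0), eu(1) - eu(0)
        return -intercept / slope

    E_T = lambda x: F(1,2)*(-x) + 2*F(1,2)*(1 - x)   # total utilitarians, one linked decision
    E_A = lambda x: F(1,2)*(-x) + F(1,2)*(1 - x)     # average utilitarians
    print(breakeven(E_T), breakeven(E_A))            # 2/3 1/2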

In the selfish case, the gnome ends up with the same utility function whether it's pre-emptive or not, because IS = S. Also, there is no need to worry about decision linkage, and hence the decision problem is a trivial one. From the gnome's point of view, 1/4 of the time there will be no human in the cell, 1/2 of the time there will be a human in the cell and the coin will have come up tails, and 1/4 of the time there will be a human in the cell and the coin will have come up heads. Thus
E[S] = 0.25(0) + 0.25(-x) + 0.5(1-x) = 0.5-0.75x
and the breakeven point is x = 2/3, as with the total utilitarian case.
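
And plugging the stated breakeven back into that expectation:

    from fractions import Fraction as F
    E_S = lambda x: F(1,4)*0 + F(1,4)*(-x) + F(1,2)*(1 - x)   # = 1/2 - 3x/4
    print(E_S(F(2,3)))   # 0, i.e. breakeven at x = 2/3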

In all of these cases so far, I think the humans quite clearly should follow the advice of the gnomes, because
1) Their utility functions coincide exactly over all a priori possible worlds.
2) The humans do not have any extra information that the gnomes do not.

Now, finally, let's go over the reasoning that leads to the so-called "incorrect" answers of 4/5 and 2/3 for total and average utilitarianism. We assume, as before, that the decisions are linked. As per Beluga's post, the argument goes like this:

With probability 2/3, the coin has shown tails. For an average utilitarian, the expected utility after paying x$ for a ticket is 1/3*(-x)+2/3*(1-x), while for a total utilitarian the expected utility is 1/3*(-x)+2/3*2*(1-x). Average and total utilitarians should thus pay up to 2/3$ and 4/5$, respectively.

So, what's the problem with this argument? In actual fact, for a submissive gnome, that advice is correct, but the human should not follow it. The problem is that a submissive gnome's utility function doesn't coincide with the utility function of the human over all possible worlds, because IT != T and IA != A. The key difference between the two cases is the gnome in the empty cell. If it's a submissive gnome, then it's completely indifferent to the plight of the humans; if it's a pre-emptively submissive gnome then it still cares.

If we do the full calculations for the submissive gnome, its utility function is IT for total utilitarian humans and IA for average utilitarian humans; since IIT = IT and IIA = IA, the calculations are the same if the humans have indexical utility functions. For IT we get
E[IT] = 0.25(0) + 0.25(-x) + 2*0.5(1-x) = 1-1.25x
with breakeven at x = 4/5, and for IA we get
E[IA] = 0.25(0) + 0.25(-x) + 0.5(1-x) = 0.5-0.75x
with breakeven at x = 2/3. Thus the submissive gnome's 2/3 and 4/5 numbers are correct for the gnome, and indeed if the human's total/average utilitarianism is indexical they should just follow the advice, because their utility function would then be identical to the gnome's.
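
The same plug-in check for the submissive gnome's numbers (note the leading 0.25*0 term for the empty-cell outcome, which is exactly where IT/IA and T/A part ways):

    from fractions import Fraction as F
    E_IT = lambda x: F(1,4)*0 + F(1,4)*(-x) + 2*F(1,2)*(1 - x)   # = 1 - 5x/4
    E_IA = lambda x: F(1,4)*0 + F(1,4)*(-x) + F(1,2)*(1 - x)     # = 1/2 - 3x/4
    print(E_IT(F(4,5)), E_IA(F(2,3)))   # 0 0, i.e. breakevens at x = 4/5 and x = 2/3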

So, if this advice is correct for the submissive gnome, why should the pre-emptively submissive gnome's advice be different? After all, after conditioning on the presence of a human in the cell the two utility functions are the same. This particular issue is indeed exactly analogous to the mistaken "yea" answer in Psy-Kosh's non-anthropic problem. Although I side with UDT and/or the precommitment-based reasoning, I think that question warrants further discussion, so I'll leave that for a third comment.

Comment author: Stuart_Armstrong 24 October 2014 10:34:09AM *  1 point [-]

I like your analysis. Interestingly, the gnomes advise in the T and A cases for completely different reasons than in the S case.

But let me modify the case slightly: now the gnomes adopt the utility function of the closest human. This makes no difference to the T and A cases. But now in the S case, the gnomes have a linked decision, and

E[S] = 0.25(-x) + 0.25(-x) + 0.5(1-x) = 0.5-x

This also seems to satisfy "1) Their utility functions coincide exactly over all a priori possible worlds. 2) The humans do not have any extra information that the gnomes do not." Also, the gnomes are now deciding the T, A and S cases for the same reasons (linked decisions).
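
For completeness, a quick check that the linked selfish expectation above has its breakeven at x = 1/2:

    from fractions import Fraction as F
    E_S_linked = lambda x: F(1,4)*(-x) + F(1,4)*(-x) + F(1,2)*(1 - x)   # = 1/2 - x
    print(E_S_linked(F(1,2)))   # 0, i.e. breakeven at x = 1/2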

Comment author: Manfred 23 October 2014 11:07:41PM 1 point [-]

I feel like you are searching for disconfirmation and then stopping.

E.g.

One cannot add agents to an anthropic situation and expect the situation to be necessarily unchanged.

It's true, one can't always add agents. But there are some circumstances under which one can add agents, and it is important to continue on and identify how this can work. It turns out that you have to add gnomes and then add information that lets the humans know that they're humans and the gnomes know that they're gnomes. This works because even in anthropic situations, the only events that matter to your probability assignment are ones that are consistent with your information.

Comment author: Stuart_Armstrong 24 October 2014 09:29:12AM 1 point [-]

One cannot add agents to an anthropic situation and expect the situation to be necessarily unchanged.

The point of that is to allow me to analyse the problem without assuming the gnome example must be true. The real objections are in the subsequent points. Even if the gnomes argue something (still a big question what they argue), we still don't have evidence that the humans should follow them.

Comment author: ChristianKl 23 October 2014 09:38:08PM -2 points [-]

Politics is the mindkiller.

Can you find an example that's less political to make the same point?

Comment author: Stuart_Armstrong 24 October 2014 09:17:19AM *  0 points [-]

Can you find an example that's less political to make the same point?

Why? Are you arguing the example is wrong? Are you saying that you disagree with it personally? Because "don't talk about this general fact because someone else might think it has (weak) political implications" seems like a heuristic to be avoided.

Comment author: Stuart_Armstrong 23 October 2014 12:31:03PM *  2 points [-]

To give a summary of my thoughts:

  • One cannot add agents to an anthropic situation and expect the situation to be necessarily unchanged. This includes agents that are not valued by anyone.
  • The submissive gnome problem initially gives x=$2/3 for selfish/average ut and x=$4/5 for total ut. This is wrong, but still has selfish and average ut at the same value.
  • A patch is added to change average ut and total ut but not selfish. This patch is essentially a pre-commitment. This patch is argued to not be available for selfish agents. This argument may be valid, but the reason the patch is not available needs to be made clear.
  • UDT attempts to reason without needing pre-commitment patches. Therefore even if the argument above is valid and patches can't be applied to gnomes of selfish agents, this does not preclude UDT from reaching a different answer from the above. UDT is compatible with pre-commitments, but that doesn't mean that UDT needs to be different if pre-commitments become impossible.
  • When discussing indexical vs non-indexical total utilitarianism, it seems to be argued that the first cannot be pre-commitment patched (stays at x=$4/5) while the second can (moves to x=$2/3).
  • It is not clear at all why this is the case, since the two utility functions are equal in every possible world, and, unlike the average ut=selfish situation, in impossible worlds as well (in the impossible worlds where identical agents reach different decisions). I see no valid reason that arguments available to one type of utility would not be available to the other.
  • In terms of worlds, these two utilities are just different ways of defining the same thing.
  • Without that difference between the indexical and non-indexical, the selfish=50%total ut+50%hater is valid.

Thus I don't think the argument works in its current form.

Comment author: owencb 23 October 2014 11:56:34AM 1 point [-]

This is nice in principle, although I'm not sure if there's much chance of finding enough events to observe a real effect.

It might be easier to detect a positive correlation on small events that goes missing on larger ones?

Comment author: Stuart_Armstrong 23 October 2014 12:03:46PM 0 points [-]

That could also work.

Comment author: Beluga 22 October 2014 07:53:33PM *  1 point [-]

The broader question is "does bringing in gnomes in this way leave the initial situation invariant"? And I don't think it does. The gnomes follow their own anthropic setup (though not their own preferences), and their advice seems to reflect this fact (consider what happens when the heads world has 1, 2 or 50 gnomes, while the tails world has 2).

As I wrote (after your comment) here, I think it is prima facie very plausible for a selfish agent to follow the gnome's advice if a) conditional on the agent existing, the gnome's utility function agrees with the agent's and b) conditional on the agent not existing, the gnome's utility function is a constant. (I didn't have condition b) explicitly in mind, but your example showed that it's necessary.) Having the number of gnomes depend upon the coin flip invalidates their purpose. The very point of the gnomes is that from their perspective, the problem is not "anthropic", but a decision problem that can be solved using UDT.

I also don't see your indexical objection. The sleeping beauty could perfectly well have an indexical version of total utilitarianism ("I value my personal utility, plus that of the sleeping beauty in the other room, if they exist"). If you want to proceed further, you seem to have to argue that indexical total utilitarianism gives different decisions than standard total utilitarianism.

That's what I tried in the parent comment. To be clear, I did not mean "indexical total utilitarianism" to be a meaningful concept, but rather a wrong way of thinking, a trap one can fall into. Very roughly, it corresponds to thinking of total utilitarianism as "I care for myself plus any other people that might exist" instead of "I care for all people that exist". What's the difference, you ask? A minimal non-anthropic example that illustrates the difference would be very much like the incubator, but without people being created. Imagine 1000 total utilitarians with identical decision algorithms waiting in separate rooms. After the coin flip, either one or two of them are offered to buy a ticket that pays $1 after tails. When asked, the agents can correctly perform a non-anthropic Bayesian update to conclude that the probability of tails is 2/3. An indexical total utilitarian reasons: "If the coin has shown tails, another agent will pay the same amount $x that I pay and win the same $1, while if the coin has shown heads, I'm the only one who pays $x. The expected utility of paying $x is thus 1/3 * (-x) + 2/3 * 2 * (1-x)." This leads to the incorrect conclusion that one should pay up to $4/5. The correct (UDT-) way to think about the problem is that after tails, one's decision algorithm is called twice. There's only one factor of 2, not two of them. This is all very similar to this post.
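
To make the contrast concrete, here are the two calculations side by side (a sketch; the "udt" line is my rendering of the policy-level calculation, evaluated once from the prior):

    from fractions import Fraction as F
    # Indexical reasoning: update to P(tails) = 2/3 and also double-count the
    # linked agent's payoff.
    indexical = lambda x: F(1,3)*(-x) + F(2,3)*2*(1 - x)   # breakeven at x = 4/5
    # UDT reasoning: evaluate the policy "pay x" from the prior; after tails the
    # decision algorithm is called twice, so two agents each pay x and win $1.
    udt       = lambda x: F(1,2)*(-x) + F(1,2)*2*(1 - x)   # breakeven at x = 2/3
    print(indexical(F(4,5)), udt(F(2,3)))                  # 0 0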

To put this again into context: You argued that selfishness is a 50/50 mixture of hating the other person, if another person exists, and total utilitarianism. My reply was that this is only true if one understands total utilitarianism in the incorrect, indexical way. I formalized this as follows: Let the utility function of a hater be vh - h * vo (here, vh is the agent's own utility, vo the other person's utility, and h is 1 if the other person exists and 0 otherwise). Selfishness would be a 50/50 mixture of hating and total utilitarianism if the utility function of a total utilitarian were vh + h * vo. However, this is exactly the wrong way of formalizing total utilitarianism. It leads, again, to the conclusion that a total utilitarian should pay up to $4/5.
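
For what it's worth, the 50/50 identity is easy to verify for that indexical formalisation (which is exactly why the mixture claim only goes through for vh + h * vo):

    # 0.5*(vh - h*vo) + 0.5*(vh + h*vo) == vh for any values of vh, vo, h.
    hater   = lambda vh, vo, h: vh - h*vo
    i_total = lambda vh, vo, h: vh + h*vo   # the "wrong", indexical total utilitarian
    mix     = lambda vh, vo, h: 0.5*hater(vh, vo, h) + 0.5*i_total(vh, vo, h)
    print(all(mix(vh, vo, h) == vh
              for vh in (-1.0, 0.5) for vo in (-2.0, 3.0) for h in (0, 1)))   # True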

Comment author: Stuart_Armstrong 23 October 2014 11:40:41AM *  1 point [-]

A minimal non-anthropic example that illustrates the difference

The decision you describe is not stable under pre-commitments. Ahead of time, all agents would pre-commit to the $2/3. Yet they seem to change their mind when presented with the decision. You seem to be double counting: you use the Bayesian update, and then also count the fact that their own decision is responsible for the other agent's decision.

In the terminology of the paper http://www.fhi.ox.ac.uk/anthropics-why-probability-isnt-enough.pdf , your agents are altruists using linked decisions with total responsibility and no precommitments, which is a foolish thing to do. If they were altruists using linked decisions with divided responsibility (or if they used precommitments), everything would be fine (I don't like or use that old terminology - UDT does it better - but it seems relevant here).
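
Roughly, and only as an illustration rather than the paper's exact definitions: after the update to P(tails) = 2/3, total responsibility counts the full linked payoff, while divided responsibility splits it between the two deciders, which recovers the pre-commitment answer.

    from fractions import Fraction as F
    total_resp   = lambda x: F(1,3)*(-x) + F(2,3)*2*(1 - x)       # breakeven at x = 4/5
    divided_resp = lambda x: F(1,3)*(-x) + F(2,3)*2*(1 - x)/2     # breakeven at x = 2/3
    print(total_resp(F(4,5)), divided_resp(F(2,3)))               # 0 0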

But that's detracting from the main point: I still don't see any difference between indexical and non-indexical total utilitarianism. I don't see why a non-indexical total utilitarian can't follow the wrong reasoning you used in your example just as well as an indexical one, if either of them can - and similarly for the right reasoning.
