For illustrative purposes, imagine simple agents - AIs, or standard utility maximisers - who have to make decisions under anthropic uncertainty.

Specifically, let there be two worlds, W1 and W2, equally likely to exist. W1 contains one copy of the agent, W2 contains two copies. The agent has one single action available: the opportunity to create, once, either a box or a cross. The utility of doing so varies depending on which world the agent is in, as follows:

In W1: Utility(cross) = 2, Utility(box) = 5

In W2: Utility(cross) = 2, Utility(box) = 0

The agent has no extra way of telling which world they are in.

  • First model (aggregationist, non-indexical):

Each box or cross created will generate the utility defined above, and the utility is simply additive. Then if the agent decides to generate crosses, the expected utility is 0.5(2+(2+2))=3, while that of generating boxes is 0.5(5+(0+0))=2.5. Generating crosses is the way to go.
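The arithmetic for the first model can be checked with a minimal sketch. The names `PRIOR`, `COPIES`, `UTILITY` and `aggregationist_eu` are illustrative choices, not part of the original setup; the numbers are the ones given above.

```python
# Setup taken from the post: two equally likely worlds, one copy in W1, two in W2.
PRIOR = {"W1": 0.5, "W2": 0.5}
COPIES = {"W1": 1, "W2": 2}
UTILITY = {"W1": {"cross": 2, "box": 5}, "W2": {"cross": 2, "box": 0}}


def aggregationist_eu(shape):
    """Expected total utility when every copy creates `shape` and utilities add up."""
    return sum(PRIOR[w] * COPIES[w] * UTILITY[w][shape] for w in PRIOR)


for shape in ("cross", "box"):
    print(shape, aggregationist_eu(shape))
```

Crosses come out ahead because the two copies in W2 each contribute their own cross.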

  • Second model (non-aggregationist, non-indexical):

The existence of a single box or cross will generate the utility defined above, but extra copies won't change anything. Then if the agent decides to generate crosses, the expected utility is 0.5(2+2)=2, while that of generating boxes is 0.5(5+0)=2.5. Generating boxes is the way to go.
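A minimal sketch of the second model, under the assumption that "extra copies won't change anything" means a shape counts at most once per world. The `min(COPIES[w], 1)` factor and all names are illustrative.

```python
PRIOR = {"W1": 0.5, "W2": 0.5}
COPIES = {"W1": 1, "W2": 2}
UTILITY = {"W1": {"cross": 2, "box": 5}, "W2": {"cross": 2, "box": 0}}


def non_aggregationist_eu(shape):
    """Expected utility when a shape counts once per world, however many copies make it."""
    # min(copies, 1) caps the count at one: duplicates add nothing.
    return sum(PRIOR[w] * min(COPIES[w], 1) * UTILITY[w][shape] for w in PRIOR)


for shape in ("cross", "box"):
    print(shape, non_aggregationist_eu(shape))
```

With the duplicate cross in W2 no longer counted, boxes overtake crosses.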

  • Third model (unlikely existence, non-aggregationist, non-indexical):

Here a simple change is made: the worlds do not contain agents, but proto-agents, each of which has an (independent) one chance in a million, written 1/M, of becoming an agent. Hence the probability of an agent existing in the first universe is 1/M, while the probability of an agent existing in the second universe is approximately 2/M. The expected utility of crosses is approximately 1/M*0.5(2+2*2)=3/M while that of boxes is approximately 1/M*0.5(5+2*0)=2.5/M. Generating crosses is the way to go.
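The approximation in the third model can be made exact with a short sketch. It assumes that, in line with the non-aggregationist reading, a world's utility is collected once if at least one proto-agent becomes an agent; the names below are illustrative.

```python
from fractions import Fraction

P_AGENT = Fraction(1, 10**6)  # chance that a given proto-agent becomes an agent
PRIOR = {"W1": Fraction(1, 2), "W2": Fraction(1, 2)}
PROTO_AGENTS = {"W1": 1, "W2": 2}
UTILITY = {"W1": {"cross": 2, "box": 5}, "W2": {"cross": 2, "box": 0}}


def p_some_agent(world):
    """Probability that at least one proto-agent in `world` becomes an agent."""
    return 1 - (1 - P_AGENT) ** PROTO_AGENTS[world]


def unlikely_existence_eu(shape):
    """Exact expected utility; the shape counts once if any agent exists to make it."""
    return sum(PRIOR[w] * p_some_agent(w) * UTILITY[w][shape] for w in PRIOR)


for shape in ("cross", "box"):
    # Report the result in units of 1/M to compare with the figures in the text.
    print(shape, float(unlikely_existence_eu(shape) / P_AGENT))
```

The exact value for crosses is 3/M minus a correction of order 1/M², so the conclusion does not depend on the approximation.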

  • Fourth model (indexical):

This is the first "hard" model from the title. Here the agent only derives utility from the box or cross it generated itself. And here, things get interesting.

There is no immediately obvious way of solving this situation, so I tried replacing it with a model that seems equivalent. Instead of having indexical preferences for its own shapes, I'll give the agent non-indexical aggregationist preferences (just as in the first model), and halve the utility of any shape in W2. This should give the same utility to all agents in all possible worlds as the indexical model. Under the new model, the utility of crosses is 0.5(2+(1+1)) = 2, while that of boxes is 0.5(5+(0+0))=2.5. Boxes are the way to go.
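The substitution can be written out as a minimal sketch. It assumes that "halving the utility of any shape in W2" generalises to dividing by the number of copies in the world; the function names are illustrative.

```python
PRIOR = {"W1": 0.5, "W2": 0.5}
COPIES = {"W1": 1, "W2": 2}
UTILITY = {"W1": {"cross": 2, "box": 5}, "W2": {"cross": 2, "box": 0}}


def own_shape_utility(world, shape):
    """Indexical model: an agent values only the shape it made itself."""
    return UTILITY[world][shape]


def substitute_total_utility(world, shape):
    """Substitute model: each shape is worth UTILITY / copies, and shape values add up."""
    per_shape = UTILITY[world][shape] / COPIES[world]
    return COPIES[world] * per_shape


def substitute_eu(shape):
    """Expected utility of the substitute (aggregationist, rescaled) model."""
    return sum(PRIOR[w] * substitute_total_utility(w, shape) for w in PRIOR)


for shape in ("cross", "box"):
    print(shape, substitute_eu(shape))
```

In every world the substitute total equals what a single indexical agent would get from its own shape, which is the sense in which the two models are meant to agree.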

  • Fifth model (indexical, anticipated experience):

The fifth model is one where, after the agents in W2 have made their decision but before they implement it, one of them is randomly deleted, and the survivor creates two shapes. If the agents are non-indexical, then the problem is simply a version of the first model.

But now the agents are indexical. There are two ways of capturing this fact: either the agent can care about the fact that "I, myself, will have created a shape", or that "the thread of my future experience will contain an agent that will have created a shape". In the first case, the agent should consider that in W2, it only has a 50% chance of succeeding in its goal, but the weight of its goal is doubled: this is the fourth model again, hence: boxes.

In the second case, each agent should consider the surviving agent as the thread of its future experience. This is equivalent to the non-indexical first model, where only the number of shapes matters (since all future shapes belong to an agent that is in the current agent(s)' future thread of experience). Hence: crosses.
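The two readings of the fifth model can be compared in a short sketch. It assumes that the first reading weights each world by the prior and by one agent's expected own-shape utility (the fourth-model treatment), and that the second reading counts every shape the survivor makes. `SURVIVAL`, `SHAPES_BY_SURVIVOR` and the function names are illustrative.

```python
PRIOR = {"W1": 0.5, "W2": 0.5}
UTILITY = {"W1": {"cross": 2, "box": 5}, "W2": {"cross": 2, "box": 0}}
SURVIVAL = {"W1": 1.0, "W2": 0.5}        # one of the two W2 agents is deleted at random
SHAPES_BY_SURVIVOR = {"W1": 1, "W2": 2}  # the W2 survivor creates two shapes


def first_reading_eu(shape):
    """'I, myself, will have created a shape': survival chance times shapes I make."""
    return sum(
        PRIOR[w] * SURVIVAL[w] * SHAPES_BY_SURVIVOR[w] * UTILITY[w][shape]
        for w in PRIOR
    )


def second_reading_eu(shape):
    """'My future thread contains the creator': every shape the survivor makes counts."""
    return sum(PRIOR[w] * SHAPES_BY_SURVIVOR[w] * UTILITY[w][shape] for w in PRIOR)


for shape in ("cross", "box"):
    print(shape, first_reading_eu(shape), second_reading_eu(shape))
```

Under the first reading the 50% survival chance cancels the doubled output, reproducing the fourth model's numbers; under the second it does not, reproducing the first model's.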

 

I won't be analysing solutions to these problems yet, but simply say that many solutions will work, such as SIA with a dictator's filter. However, though the calculations are correct, the intuition behind this seems suspect in the fourth model, and one could achieve similar results without SIA at all (giving its decision the power to affect multiple agent outcomes at once, for instance).

It should be noted that the fourth model seems to imply the Presumptuous Philosopher would be wrong to accept his bets. However, the third model seems to imply the truth of FNC (full non-indexical conditioning), which is very close to SIA - but time inconsistent. And there, the Presumptuous Philosopher would be right to accept his bets.

Confusion still persists in my mind, but I think it's moving towards a resolution.

21 comments

I think the standard term for what you call "anthropic uncertainty" is "indexical uncertainty" (which, as far as I can tell, was first coined by Nick Bostrom in his 2000 PhD thesis).

Also, I suggest that you say a bit more about the context and motivation for this post. I interpret what you wrote as an outline of some of the ambiguities/choices a human or Bayesian AI would face if they tried to convert their current preferences into UDT preferences. But I'm not sure if that's what you intended.

Apologies; the post grew out of some of the anthropics discussions at the FHI. The idea was mainly to set down a few worlds where we know the right answer (arguably) and to see what that implies about anthropics. I see now that the heart and soul of the piece are the arguments presented in defence of the models 1 and 4; I'll expand those to a better post, with context, later.

I don't see any of your agents say something like "I have P=1/3 of being in world 1, because there are 2 situations fitting my information in world 2 and 1 situation fitting my information in world 1." How would you classify a model that used that sort of anthropic reasoning?
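For concreteness, the 1/3 figure in this question can be reproduced by weighting each world's prior by the number of situations in it that fit the agent's information. This is only a sketch of that counting rule; `sia_posterior` is an illustrative name.

```python
from fractions import Fraction

PRIOR = {"W1": Fraction(1, 2), "W2": Fraction(1, 2)}
COPIES = {"W1": 1, "W2": 2}  # situations fitting the agent's information


def sia_posterior(prior, copies):
    """Reweight each world's prior by its number of matching situations, then normalise."""
    weights = {w: prior[w] * copies[w] for w in prior}
    total = sum(weights.values())
    return {w: weight / total for w, weight in weights.items()}


print(sia_posterior(PRIOR, COPIES))
```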

I haven't classified that, because I haven't used the anthropic reasoning. I was just trying to figure out what, from a timeless perspective, the "correct" decision must be.

Then afterwards I'll think of methods of reaching that decision. Though it seems that using your probability estimate (known as SIA) and a "division of responsibility" method, we get the answers presented here.

Well, you haven't classified it yet :D But it seems like it would be type 4. Yet it would produce different results than type four (crosses), for the obvious reason that it estimates the probabilities out in front differently. This is notably better by the measure that if all agents built crosses, it would be better overall than if all agents built boxes, so the expected utility per agent is larger. Basically, types 1 and 4 become unified if the agent can't tell between its different possible circumstances, because 1 maximizes utility of all agents and 4 maximizes expected utility of that agent, which is just (#1)/N.

I thought about it a little bit and here's a formula that seems to give the same answers as you on the first four models:

argmax over all D in decisions:
    sum over all W in worlds:
        probability(W)*utility(W,D)

where probability(W) is according to the prior (no updating on your own existence), and utility(W,D) is the utility received by each agent in world W if all agents make decision D. It also seems to give the right answer in Psy-Kosh's non-anthropic problem, so maybe we're on to something.
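Here is a minimal runnable sketch of the formula above, applied to the first, second and fourth models. The per-agent utility functions are my reading of those models (an assumption, not something stated in the formula), and the third model is left out because it needs the proto-agent worlds enumerated. All names are illustrative.

```python
PRIOR = {"W1": 0.5, "W2": 0.5}  # prior only, no updating on one's own existence
COPIES = {"W1": 1, "W2": 2}
BASE = {"W1": {"cross": 2, "box": 5}, "W2": {"cross": 2, "box": 0}}


def best_decision(prior, utility, decisions=("cross", "box")):
    """argmax over D of sum over W of probability(W) * utility(W, D)."""
    def expected(decision):
        return sum(p * utility(w, decision) for w, p in prior.items())
    return max(decisions, key=expected)


# utility(W, D): what each agent in W receives if all agents choose D (assumed readings).
MODELS = {
    "first (aggregationist)": lambda w, d: COPIES[w] * BASE[w][d],  # all shapes count
    "second (non-aggregationist)": lambda w, d: BASE[w][d],         # counted once
    "fourth (indexical)": lambda w, d: BASE[w][d],                  # own shape only
}

for name, utility in MODELS.items():
    print(name, "->", best_decision(PRIOR, utility))
```

On these readings the sketch picks crosses for the first model and boxes for the second and fourth, matching the post.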

Also your post seems related to your earlier problem about agents forgetting their utility functions, and my formula coincides with my proposed solution to that problem. Now that I think about it, it's unsurprising that anthropics/indexicals are related to uncertainty about utility functions.

Yes, this is another way of going about it (I said there was more than one way of doing it :-) ).

The other case that I haven't yet considered is when you start with two identical copies that then get different inputs (so moving more towards a situation like in model 3). And if you don't use SIA, then your probabilities will be time-inconsistent (though you will still get the right answer!). That's why I tend to prefer SIA+division of responsibility, because the probabilities of being in a world don't change according to irrelevant information.

...how is what you wrote different from standard expected utility maximization?

Also, probability might be controlled by the agent as well, so use probability(W,D), and selecting individual worlds might not be a good idea (instead, sum over a partition into sufficiently uniform events).

...how is what you wrote different from standard expected utility maximization?

My formula deals with multiple agents having different utility functions.

Also, probability might be controlled by the agent as well, so use probability(W,D)

That simple? It's not immediately clear to me. Could you give an example? For some reason I thought that the formula should become even more complicated in such cases.

My formula deals with multiple agents having different utility functions.

As I discussed on the decision theory list, this problem is not well-defined. All decisions must be performed in the service of a particular decision problem. You can't be uncertain about which decision problem you're solving, but the decision problem that you're solving can be logically opaque, so that you have logical uncertainty about its elements. In particular, you can have a utility symbol that's defined as [U=if Q U1 else U2] where Q is a complicated statement. This doesn't extend to multiple agents, where you have to analyze the problem from one particular agent's standpoint, although analysis of a game-theoretic situation might yield similar steps.

You can control the probability of the current observational situation, e.g. in transparent Newcomb's. (You can also easily define events whose probability you control, if there's no requirement that such events are of any relevance to the problem, by including possible worlds in the events conditionally on a logical statement that you control.)

As I discussed on the decision theory list, this problem is not well-defined.

Does this mean you disagree with Stuart's solution to the fourth model (which involves multiple agents with different utility functions)? Can you point out the mistake? My formula was just a formalization of Stuart's idea, I may be missing something obvious here...

I don't understand what "itself" means in "Here the agent only derives utility from the box or cross it generated itself.", given that we have a world with two identical agents, which is better described as a world with one agent controlling it through two control sites (through a dependence that acts on both sites). I think it's a bad idea to discuss virtues of methods of solving a problem whose statement isn't clear.

Well, you're wrong. The problem statement is completely clear and can be implemented with something like cellular automata: two hungry agents in two separate rooms within a bigger world, each one choosing to generate a snack that it will then eat. Logical dependence between their decisions must be inferred in the usual way, it's not part of the problem specification. If your formalism says such problems are not well-defined, whoops, too bad for the formalism! (Didn't you also give up your ability to meaningfully talk about the PD, by the way?)

(Didn't you also give up your ability to meaningfully talk about the PD, by the way?)

No, in PD each agent knows where it is and what utility value it gets. In PD with identical players the problem might be similar if it's postulated that the two agents get different utilities.

Identical agents can't get different utilities in the internal sense of what's referred to by their decision problems (and any other sense is decision-theoretically irrelevant, since the agent can't work with what it can't work with), because the definition of utility is part of the decision problem, which in turn is part of the agent (or even the whole of the agent).

When you're playing a lottery, you're deciding based on utility of the lottery, not on utility of inaccessible (and in this sense, meaningless to the agent) "actual outcome". Utility of the unknown outcome is not what plays the role of utility in agent's decision problem, hence we have a case of equivocation.

Well, you're wrong. The problem statement is completely clear and can be implemented with something like cellular automata

I understand the environment specified in the problem statement, but not the decision problem.

If your formalism says such problems are not well-defined, whoops, too bad for the formalism!

Well, maybe, but the statement that everything is totally clear doesn't help me understand better. I can intuitively guess what is intended, but that's different from actually seeing all the pieces of the puzzle.

Edit: Well, I guess I should indicate situations where I "don't understand" in the sense of not understanding to my satisfaction, as opposed to pretending to not understand what doesn't fit my models or because the question is expected to be confusing to most readers. Sometimes I'm confused not because of an apparent property of a question, but because I'm trying to solve some obscure aspect of it that isn't on everyone's mind.

I really like your proposed solution to the fourth model. Have you tried generalizing it to other problems?

Not yet; still busy formalising it.

Or, technically, still busy taking a break from two weeks of intense FHI-based thinking, and returning to it later :-)

Suggestion: Change the names World W1 and World W2 to something else; they are too similar to World Wars 1 and 2.

Unless that's what is intended.