Outlawing Anthropics: An Updateless Dilemma

26 Post author: Eliezer_Yudkowsky 08 September 2009 06:31PM

Let us start with a (non-quantum) logical coinflip - say, look at the heretofore-unknown-to-us-personally 256th binary digit of pi, where the choice of binary digit is itself intended not to be random.

If the result of this logical coinflip is 1 (aka "heads"), we'll create 18 of you in green rooms and 2 of you in red rooms, and if the result is "tails" (0), we'll create 2 of you in green rooms and 18 of you in red rooms.

After going to sleep at the start of the experiment, you wake up in a green room.

With what degree of credence do you believe - what is your posterior probability - that the logical coin came up "heads"?

There are exactly two tenable answers that I can see, "50%" and "90%".

Suppose you reply 90%.

And suppose you also happen to be "altruistic" enough to care about what happens to all the copies of yourself.  (If your current system cares about yourself and your future, but doesn't care about very similar xerox-siblings, then you will tend to self-modify to have future copies of yourself care about each other, as this maximizes your expectation of pleasant experience over future selves.)

Then I attempt to force a reflective inconsistency in your decision system, as follows:

I inform you that, after I look at the unknown binary digit of pi, I will ask all the copies of you in green rooms whether to pay $1 to every version of you in a green room and steal $3 from every version of you in a red room.  If they all reply "Yes", I will do so.

(It will be understood, of course, that $1 represents 1 utilon, with actual monetary amounts rescaled as necessary to make this happen.  Very little rescaling should be necessary.)

(Timeless decision agents reply as if controlling all similar decision processes, including all copies of themselves.  Classical causal decision agents, to reply "Yes" as a group, will need to somehow work out that other copies of themselves reply "Yes", and then reply "Yes" themselves.  We can try to help out the causal decision agents on their coordination problem by supplying rules such as "If conflicting answers are delivered, everyone loses $50".  If causal decision agents can win on the problem "If everyone says 'Yes' you all get $10, if everyone says 'No' you all lose $5, if there are conflicting answers you all lose $50" then they can presumably handle this.  If not, then ultimately, I decline to be responsible for the stupidity of causal decision agents.)

Suppose that you wake up in a green room.  You reason, "With 90% probability, there are 18 of me in green rooms and 2 of me in red rooms; with 10% probability, there are 2 of me in green rooms and 18 of me in red rooms.  Since I'm altruistic enough to at least care about my xerox-siblings, I calculate the expected utility of replying 'Yes' as (90% * ((18 * +$1) + (2 * -$3))) + (10% * ((18 * -$3) + (2 * +$1))) = +$5.60."  You reply yes.

However, before the experiment, you calculate the general utility of the conditional strategy "Reply 'Yes' to the question if you wake up in a green room" as (50% * ((18 * +$1) + (2 * -$3))) + (50% * ((18 * -$3) + (2 * +$1))) = -$20.  You want your future selves to reply 'No' under these conditions.

This is a dynamic inconsistency - different answers at different times - which argues that decision systems which update on anthropic evidence will self-modify not to update probabilities on anthropic evidence.
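The two calculations above can be checked side by side. The following is a minimal sketch, assuming (as in the post) that utility is the simple sum of dollars over all twenty copies; the function and variable names are illustrative and not part of the original post.

```python
from fractions import Fraction

GREEN_PAYOUT = 1   # dollars paid to each copy in a green room
RED_PAYOUT = -3    # dollars taken from each copy in a red room

def total_payoff(n_green, n_red):
    """Summed payoff over all 20 copies if the green-roomers all say 'Yes'."""
    return n_green * GREEN_PAYOUT + n_red * RED_PAYOUT

def expected_value_of_yes(p_heads):
    """Expected summed payoff of 'Yes' for a given credence in heads."""
    heads = total_payoff(18, 2)   # heads: 18 green rooms, 2 red rooms
    tails = total_payoff(2, 18)   # tails: 2 green rooms, 18 red rooms
    return p_heads * heads + (1 - p_heads) * tails

after_waking_green = expected_value_of_yes(Fraction(9, 10))  # updated credence
before_experiment = expected_value_of_yes(Fraction(1, 2))    # prior credence

print(float(after_waking_green))  # 5.6
print(float(before_experiment))   # -20.0
```

The only thing that differs between the two calls is the credence in heads, which is exactly where the inconsistency enters.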

I originally thought, on first formulating this problem, that it had to do with double-counting the utilons gained by your variable numbers of green friends, and the probability of being one of your green friends.

However, the problem also works if we care about paperclips.  No selfishness, no altruism, just paperclips.

Let the dilemma be, "I will ask all people who wake up in green rooms if they are willing to take the bet 'Create 1 paperclip if the logical coinflip came up heads, destroy 3 paperclips if the logical coinflip came up tails'.  (Should they disagree on their answers, I will destroy 5 paperclips.)"  Then a paperclip maximizer, before the experiment, wants the paperclip maximizers who wake up in green rooms to refuse the bet.  But a conscious paperclip maximizer who updates on anthropic evidence, who wakes up in a green room, will want to take the bet, with expected utility ((90% * +1 paperclip) + (10% * -3 paperclips)) = +0.6 paperclips.
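As a quick check on the paperclip version, here is a small sketch (names are illustrative) that evaluates the same bet at the prior credence of 50% and at the updated credence of 90%. It assumes the bet creates or destroys paperclips once in total, not once per copy.

```python
from fractions import Fraction

def bet_value(p_heads, win=1, loss=-3):
    """Expected paperclips from taking the bet, given a credence in heads."""
    return p_heads * win + (1 - p_heads) * loss

before_experiment = bet_value(Fraction(1, 2))    # prior credence
after_waking_green = bet_value(Fraction(9, 10))  # credence after updating

print(before_experiment)          # -1
print(float(after_waking_green))  # 0.6
```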

This argues that, in general, decision systems - whether they start out selfish, or start out caring about paperclips - will not want their future versions to update on anthropic "evidence".

Well, that's not too disturbing, is it?  I mean, the whole anthropic thing seemed very confused to begin with - full of notions about "consciousness" and "reality" and "identity" and "reference classes" and other poorly defined terms.  Just throw out anthropic reasoning, and you won't have to bother.

When I explained this problem to Marcello, he said, "Well, we don't want to build conscious AIs, so of course we don't want them to use anthropic reasoning", which is a fascinating sort of reply.  And I responded, "But when you have a problem this confusing, and you find yourself wanting to build an AI that just doesn't use anthropic reasoning to begin with, maybe that implies that the correct resolution involves us not using anthropic reasoning either."

So we can just throw out anthropic reasoning, and relax, and conclude that we are Boltzmann brains.  QED.

In general, I find the sort of argument given here - that a certain type of decision system is not reflectively consistent - to be pretty damned compelling.  But I also find the Boltzmann conclusion to be, ahem, more than ordinarily unpalatable.

In personal conversation, Nick Bostrom suggested that a division-of-responsibility principle might cancel out the anthropic update - i.e., the paperclip maximizer would have to reason, "If the logical coin came up heads then I am 1/18th responsible for adding +1 paperclip, if the logical coin came up tails then I am 1/2 responsible for destroying 3 paperclips."  I confess that my initial reaction to this suggestion was "Ewwww", but I'm not exactly comfortable concluding I'm a Boltzmann brain, either.
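To see how such a principle could cancel the update, here is a rough sketch of the arithmetic under one reading of the suggestion: each green-room copy keeps its 90% credence but weights each outcome by its share of responsibility (1/18 under heads, 1/2 under tails). The variable names are illustrative only.

```python
from fractions import Fraction

p_heads = Fraction(9, 10)            # credence after waking in a green room

share_if_heads = Fraction(1, 18)     # 18 green-room copies share the +1
share_if_tails = Fraction(1, 2)      # 2 green-room copies share the -3

responsibility_weighted = (
    p_heads * share_if_heads * 1
    + (1 - p_heads) * share_if_tails * (-3)
)

# Value of the bet before the experiment, at 50% credence.
ex_ante = Fraction(1, 2) * 1 + Fraction(1, 2) * (-3)

print(responsibility_weighted)  # -1/10
print(ex_ante)                  # -1
```

Under these assumptions the responsibility-weighted value is the pre-experiment value scaled by 1/10, so it has the same sign and the copy refuses the bet.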

EDIT:  On further reflection, I also wouldn't want to build an AI that concluded it was a Boltzmann brain!  Is there a form of inference which rejects this conclusion without relying on any reasoning about subjectivity?

EDIT2:  Psy-Kosh has converted this into a non-anthropic problem!

Comments (194)

Comment author: Psy-Kosh 09 September 2009 03:37:59PM *  36 points [-]

Actually... how is this an anthropic situation AT ALL?

I mean, wouldn't it be equivalent to, say, gather 20 rational people (That understand PD, etc etc etc, and can certainly manage to agree to coordinate with each other) that are allowed to meet with each other in advance and discuss the situation...

I show up and tell them that I have two buckets of marbles, some of which are green, some of which are red.

One bucket has 18 green and 2 red, and the other bucket has 18 red and 2 green.

I will flip (or already have flipped) a logical coin. Depending on the outcome, I will use either one bucket or the other.

After having an opportunity to discuss strategy, they will be allowed to reach into the bucket without looking, pull out a marble, look at it, then, if it's green, choose whether to pay and steal, etc etc etc. (In case it's not obvious, the payout rules are equivalent to the OP's.)

As near as I can determine, this situation is entirely equivalent to the OP and is in no way an anthropic one. If the OP actually is an argument against anthropic updates in the presence of logical uncertainty... then it's actually an argument against the general case of Bayesian updating in the presence of logical uncertainty, even when there's no anthropic stuff going on at all!

EDIT: oh, in case it's not obvious, marbles are not replaced after being drawn from the bucket.
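The marble game is easy to simulate. The sketch below is only an illustration: it models the logical coin as a fair pseudo-random flip, assumes every green-marble holder says "Yes", and uses made-up names (`play_round`, `simulate`). It tracks two different averages: the fraction of green draws that came from the mostly-green bucket, and the average total payoff per round.

```python
import random

def play_round(rng):
    """Flip the coin, pick a bucket, and deal all 20 marbles without replacement."""
    heads = rng.random() < 0.5
    if heads:
        bucket = ["green"] * 18 + ["red"] * 2
    else:
        bucket = ["green"] * 2 + ["red"] * 18
    rng.shuffle(bucket)  # person i receives bucket[i]
    return heads, bucket

def simulate(rounds=20_000, seed=0):
    rng = random.Random(seed)
    green_draws = 0
    green_draws_from_green_bucket = 0
    total_payoff = 0
    for _ in range(rounds):
        heads, draws = play_round(rng)
        n_green = draws.count("green")
        n_red = len(draws) - n_green
        green_draws += n_green
        if heads:
            green_draws_from_green_bucket += n_green
        # Everyone holding green says "Yes": +1 per green, -3 per red.
        total_payoff += n_green * 1 + n_red * -3
    return green_draws_from_green_bucket / green_draws, total_payoff / rounds

fraction_green_bucket, average_payoff = simulate()
print(fraction_green_bucket)  # close to 0.9
print(average_payoff)         # close to -20
```

Both numbers come out of the same runs: about nine in ten green draws happen in mostly-green rounds, yet the "Yes" policy loses about 20 per round on average.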

Comment author: Christian_Szegedy 09 September 2009 06:13:02PM *  1 point [-]

Very enlightening!

It just shows that the OP was an overcomplicated example generating confusion about the update.

[EDIT] Deleted rest of the comment due to revised opinion here: http://lesswrong.com/lw/17c/outlawing_anthropics_an_updateless_dilemma/13hk

Comment author: SilasBarta 09 September 2009 03:56:04PM 0 points [-]

Good point. After thinking about this for a while, I feel comfortable simultaneously holding these views:

1) You shouldn't do anthropic updates. (i.e. update on the fact that you exist)

2) The example posed in the top-level post is not an example of anthropic reasoning, but reasoning on specific givens and observations, as are most supposed examples of anthropic reasoning.

3) Any evidence arising from the fact that you exist is implicitly contained by your observations by virtue of their existence.

Wikipedia gives one example of a productive use of the anthropic principle, but it appears to be reasoning based on observations of the type of life-form we are, as well as other hard-won biochemical knowledge, well above and beyond the observation that we exist.

Comment author: Psy-Kosh 09 September 2009 04:18:41PM 1 point [-]

Thanks.

I don't THINK I agree with your point 1. ie, I favor saying yes to anthropic updates, but I admit that there's definitely confusing issues here.

Mind expanding on point 3? I think I get what you're saying, but in general we filter out that part of our observations, that is, the fact that observations are occurring at all. Getting that back is the point of anthropic updating. Actually... IIRC, Nick Bostrom's way of talking about anthropic updates more or less is exactly your point 3 in reverse... ie, near as I can determine and recall, his position explicitly advocates talking about the significance that observations are occurring at all as part of the usual update based on observation. Maybe I'm misremembering though.

Also, separating it out into a single anthropic update and then treating all observations as conditional on your existence or such helps avoid double counting that aspect, right?

Also, here's another physics example, a bit more recent that was discussed on OB a while back.

Comment author: SilasBarta 09 September 2009 05:29:02PM 0 points [-]

Reading the link, the second paper's abstract, and most of Scott Aaronson's post, it looks to me like they're not using anthropic reasoning at all. Robin Hanson summarizes their "entropic principle" (and the abstract and all discussion agree with his summary) as

> since observers need entropy gains to function physically, we can estimate the probability that any small spacetime volume contains an observer to be proportional to the entropy gain in that volume.

The problem is that "observer" is not the same as "anthrop-" (human). This principle is just a subtle restatement of either a tautology or known physical law. Because it's not that "observers need entropy gains". Rather, observation is entropy gain. To observe something is to increase one's mutual information with it. But since phase space is conserved, all gains in mutual information must be offset by an increase in entropy.

But since "observers" are simply anything that forms mutual information with something else, it doesn't mean a conscious observer, let alone a human one. For that, you'd need to go beyond P(entropy gain|observer) to P(consciousness|entropy gain).

(I'm a bit distressed no one else made this point.)

Now, this idea could lead to an insight if you endorsed some neo-animistic view that consciousness is proportional to normalized rate of mutual information increase, and so humans are (as) conscious (as we are) because we're above some threshold ... but again, you'd be using nothing from your existence as such.

Comment author: Psy-Kosh 09 September 2009 06:53:57PM 0 points [-]

The argument was "higher rate of entropy production is correlated with more observers, probably. So we should expect to find ourselves in chunks of reality that have high rates of entropy production"

I guess it wasn't just observers, but (non reversible) computations

ie, anthropic reasoning was the justification for using the entropy production criteria in the first place. Yes, there is a question of fractions of observers that are conscious, etc... but a universe that can't support much in the way of observers at all probably can't support much in the way of conscious observers, while a universe that can support lots of observers can probably support more conscious observers than the other, right?

Or did I misunderstand your point?

Comment author: SilasBarta 09 September 2009 07:13:10PM 0 points [-]

Now I'm not understanding how your response applies.

My point was: the entropic principle estimates the probability of observers per unit volume by using the entropy per unit volume. But this follows immediately from the second law and conservation of phase space; it's necessarily true.

To the extent that it assigns a probability to a class that includes us, it does a poor job, because we make up a tiny fraction of the "observers" (appropriately defined) in the universe.

Comment author: Vladimir_Nesov 14 September 2009 07:52:40AM *  1 point [-]

Right, and this is a perspective very close to intuition for UDT: you consider different instances of yourself at different times as separate decision-makers that all share the common agenda ("global strategy"), coordinated "off-stage", and implement it without change depending on circumstances they encounter in each particular situation. The "off-stageness" of coordination is more naturally described by TDT, which allows considering different agents as UDT-instances of the same strategy, but the precise way in which it happens remains magic.

Comment author: Eliezer_Yudkowsky 14 September 2009 06:09:45PM 1 point [-]

Nesov, the reason why I regard Dai's formulation of UDT as such a significant improvement over your own is that it does not require offstage coordination. Offstage coordination requires a base theory and a privileged vantage point and, as you say, magic.

Comment author: Vladimir_Nesov 20 September 2009 02:33:19PM *  1 point [-]

> Nesov, the reason why I regard Dai's formulation of UDT as such a significant improvement over your own is that it does not require offstage coordination. Offstage coordination requires a base theory and a privileged vantage point and, as you say, magic.

I still don't understand this emphasis. Here I sketched in what sense I mean the global solution -- it's more about definition of preference than the actual computations and actions that the agents make (locally). There is an abstract concept of global strategy that can be characterized as being "offstage", but there is no offstage computation or offstage coordination, and in general complete computation of global strategy isn't performed even locally -- only approximations, often approximations that make it impossible to implement the globally best solution.

In the above comment, by "magic" I referred to the exact mechanism that says in what way and to what extent different agents are running the same algorithm, which is more in the domain of TDT, UDT generally not talking about separate agents, only different possible states of the same agent. Which is why neither concept solves the bargaining problem: it's out of UDT's domain, and TDT takes the relevant pieces of the puzzle as given, in its causal graphs.

For further disambiguation, see for example this comment you made:

> We're taking apart your "mathematical intuition" into something that invents a causal graph (this part is still magic) and a part that updates a causal graph "given that your output is Y" (Pearl says how to do this).

Comment author: Eliezer_Yudkowsky 09 September 2009 06:18:33PM 1 point [-]

Again, if we randomly selected someone to ask, rather than having specified in advance that we're going to make the decision depend on the unanimous response of all people in green rooms, then there would be no paradox. What you're talking about here, pulling out a random marble, is the equivalent of asking a random single person from either green or red rooms. But this is not what we're doing!

Comment author: Psy-Kosh 09 September 2009 06:40:49PM 6 points [-]

Either I'm misunderstanding something, or I wasn't clear.

To make it explicit: EVERYONE who gets a green marble gets asked, and the outcome depends on their consent being unanimous, just like everyone who wakes up in a green room gets asked. ie, all twenty rationalists draw a marble from the bucket, so that by the end, the bucket is empty.

Everyone who got a green marble gets asked for their decision, and the final outcome depends on all the answers. The bit about them drawing marbles individually is just to keep them from seeing what marbles the others got or being able to talk to each other once the marble drawing starts.

Unless I completely failed to comprehend some aspect of what's going on here, this is effectively equivalent to the problem you described.

Comment author: Eliezer_Yudkowsky 09 September 2009 07:29:17PM 8 points [-]

Oh, okay, that wasn't clear actually. (Because I'm used to "they" being a genderless singular pronoun.) In that case these problems do indeed look equivalent.

Hm. Hm hm hm. I shall have to think about this. It is an extremely good point. The more so as anyone who draws a green marble should indeed be assigning a 90% probability to there being a mostly-green bucket.

Comment author: Psy-Kosh 09 September 2009 07:44:58PM *  2 points [-]

Sorry about the unclarity then. I probably should have explicitly stated a step by step "marble game procedure".

My personal suggestion if you want an "anthropic reasoning is confooozing" situation would be the whole anthropic updating vs aumann agreement thing, since the disagreement would seem to be predictable in advance, and everyone involved would appear to be able to be expected to agree that the disagreement is right and proper. (ie, mad scientist sets up a quantum suicide experiment. Test subject survives. Test subject seems to have Bayesian evidence in favor of MWI vs single world, external observer mad scientist who sees the test subject/victim survive would seem to not have any particular new evidence favoring MWI over single world)

(Yes, I know I've brought up that subject several times, but it does seem, to me, to be a rather more blatant "something funny is going on here")

(EDIT: okay, I guess this would count as quantum murder rather than quantum suicide, but you know what I mean.)

Comment author: byrnema 10 September 2009 02:48:03AM 0 points [-]

I don't see how being assigned a green or red room is "anthropic" while being assigned a green or red marble is not anthropic.

I thought the anthropic part came from updating on your own individual experience in the absence of observing what observations others are making.

Comment author: Psy-Kosh 10 September 2009 03:16:30AM 3 points [-]

The difference wasn't marble vs room but "copies of one being, so number of beings changed" vs "just gather 20 rationalists..."

But my whole point was "the original wasn't really an anthropic situation, let me construct this alternate yet equivalent version to make that clear"

Comment author: CarlShulman 10 September 2009 06:05:59AM 1 point [-]

Do you think that the Sleeping Beauty problem is an anthropic one?

Comment author: Psy-Kosh 11 September 2009 06:00:15AM 0 points [-]

It probably counts as an instance of the general class of problems one would think of as an "anthropic problem".

Comment author: byrnema 10 September 2009 04:03:25AM 0 points [-]

I see. I had always thought of the problem as involving 20 (or sometimes 40) different people. The reason for this is that I am an intuitive rather than literal reader, and when Eliezer mentioned stuff about copies of me, I just interpreted this as meaning to emphasize that each person has their own independent 'subjective reality'. Really only meaning that each person doesn't share observations with the others.

So all along, I thought this problem was about challenging the soundness of updating on a single independent observation involving yourself as though you are some kind of special reference frame.

... therefore, I don't think you took this element out, but I'm glad you are resolving the meaning of "anthropic" because there are probably quite a few different "subjective realities" circulating about what the essence of this problem is.

Comment author: Psy-Kosh 11 September 2009 06:05:01AM 0 points [-]

Sorry for delay.

Copies as in "upload your mind. then run 20 copies of the uploaded mind".

And yes, I know there's still tricky bits left in the problem, I merely established that those tricky bits didn't derive from effects like mind copying or quantum suicide or anything like that and could instead show up in ordinary simple stuff, with no need to appeal to anthropic principles to produce the confusion. (sorry if that came out babbly, am getting tired)

Comment author: brian_jaress 10 September 2009 06:58:40PM 0 points [-]

> anyone who draws a green marble should indeed be assigning a 90% probability to there being a mostly-green bucket.

I don't think so. I think the answer to both these problems is that if you update correctly, you get 0.5.

Comment author: Psy-Kosh 11 September 2009 05:59:07AM *  3 points [-]

*blinks* mind expanding on that?

P(green|mostly green bucket) = 18/20

P(green|mostly red bucket) = 2/20

likelihood ratio = 9

if one started with no particular expectation of it being one bucket vs the other, ie, assigned 1:1 odds, then after updating upon seeing a green marble, one ought assign 9:1 odds, ie, probability 9/10, right?
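The odds-form update described here can be written out directly. This is a small sketch with illustrative names, using exact fractions so that the 9:1 odds and the 9/10 probability come out without rounding.

```python
from fractions import Fraction

def posterior_odds(prior_odds, likelihood_ratio):
    """Bayes' rule in odds form: posterior odds = prior odds * likelihood ratio."""
    return prior_odds * likelihood_ratio

def odds_to_probability(odds):
    return odds / (1 + odds)

p_green_given_green_bucket = Fraction(18, 20)
p_green_given_red_bucket = Fraction(2, 20)

likelihood_ratio = p_green_given_green_bucket / p_green_given_red_bucket
odds = posterior_odds(Fraction(1, 1), likelihood_ratio)  # start from 1:1 odds
probability = odds_to_probability(odds)

print(likelihood_ratio)  # 9
print(probability)       # 9/10
```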

Comment author: brian_jaress 11 September 2009 07:59:31AM 3 points [-]

I guess that does need a lot of explaining.

I would say:

P(green|mostly green bucket) = 1

P(green|mostly red bucket) = 1

P(green) = 1

because P(green) is not the probability that you will get a green marble, it's the probability that someone will get a green marble. From the perspective of the priors, all the marbles are drawn, and no one draw is different from any other. If you don't draw a green marble, you're discarded and the people who did get a green vote. For the purposes of figuring out the priors for a group strategy, your draw being green is not an event.

Of course, you know that you've drawn green. But the only thing you can translate it into that has a prior is "someone got green."

That probably sounds contrived. Maybe it is. But consider a slightly different example:

  • Two marbles and two people instead of twenty.
  • One marble is green, the other will be red or green based on a coin flip (green on heads, red on tails).

I like this example because it combines the two conflicting intuitions in the same problem. Only a fool would draw a red marble and remain uncertain about the coin flip. But someone who draws a green marble is in a situation similar to the twenty marble scenario.

If you were to plan ahead of time how the greens should vote, you would tell them to assume 50%. But a person holding a green marble might think it's 2/3 in favor of double green.

To avoid embarrassing paradoxes, you can base everything on the four events "heads," "tails," "someone gets green," and "someone gets red." Update as normal.
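The two-marble example is small enough to enumerate. The sketch below is an illustration with made-up names, assuming each of the two people is equally likely to receive either marble. It conditions on two different events: "I hold a green marble" and "someone holds a green marble".

```python
from fractions import Fraction
from itertools import product

# Each (coin, which marble I drew) pair is equally likely: weight 1/4.
outcomes = []
for coin, my_index in product(["heads", "tails"], [0, 1]):
    marbles = ["green", "green"] if coin == "heads" else ["green", "red"]
    outcomes.append((Fraction(1, 4), coin, marbles, marbles[my_index]))

def p_heads_given(event):
    """P(heads | event), where event(marbles, mine) returns True or False."""
    matching = [(w, coin) for w, coin, marbles, mine in outcomes
                if event(marbles, mine)]
    total = sum(w for w, _ in matching)
    heads = sum(w for w, coin in matching if coin == "heads")
    return heads / total

i_hold_green = p_heads_given(lambda marbles, mine: mine == "green")
someone_holds_green = p_heads_given(lambda marbles, mine: "green" in marbles)

print(i_hold_green)         # 2/3
print(someone_holds_green)  # 1/2
```

The 2/3 and the 50% in the comment above correspond to these two choices of conditioning event.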

Comment author: Psy-Kosh 11 September 2009 08:10:31AM 2 points [-]

yes, the probability that someone will get a green marble is rather different than the probability that I, personally, will get a green marble. But if I do personally get a green marble, that's evidence in favor of green bucket.

The decision algorithm for how to respond to that though in this case is skewed due to the rules for the payout.

And in your example, if I drew green, I'd consider the 2/3 probability the correct one for whoever drew green.

Now, if there's a payout scheme involved with funny business, that may alter some decisions, but not magically change my epistemology.

Comment author: brian_jaress 11 September 2009 08:30:55AM 0 points [-]

What kind of funny business?

Comment author: wedrifid 11 September 2009 09:26:52AM 0 points [-]

Let's just say that you don't draw blue.

Comment author: Vladimir_Nesov 09 September 2009 05:04:42PM 0 points [-]

That uncertainty is logical seems to be irrelevant here.

Comment author: Psy-Kosh 09 September 2009 05:11:13PM 1 point [-]

Agreed. But I seem to recall seeing some comments about distinguishing between quantum and logical uncertainty, etc etc, so figured may as well say that it at least is equivalent given that it's the same type of uncertainty as in the original problem and so on...

Comment author: pengvado 08 September 2009 09:09:06PM 13 points [-]

> Well, we don't want to build conscious AIs, so of course we don't want them to use anthropic reasoning.

Why is anthropic reasoning related to consciousness at all? Couldn't any kind of Bayesian reasoning system update on the observation of its own existence (assuming such updates are a good idea in the first place)?

Comment author: Marcello 09 September 2009 01:32:42PM 4 points [-]

Why do I think anthropic reasoning and consciousness are related?

In a nutshell, I think subjective anticipation requires subjectivity. We humans feel dissatisfied with a description like "well, one system running a continuation of the computation in your brain ends up in a red room and two such systems end up in green rooms" because we feel that there's this extra "me" thing, whose future we need to account for. We bother to ask how the "me" gets split up, what "I" should anticipate, because we feel that there's "something it's like to be me", and that (unless we die) there will be in future "something it will be like to be me". I suspect that the things I said in the previous sentence are at best confused and at worst nonsense. But the question of why people intuit crazy things like that is the philosophical question we label "consciousness".

However, the feeling that there will be in future "something it will be like to be me", and in particular that there will be one "something it will be like to be me"<1>, if taken seriously, forces us to have subjective anticipation, that is, to write a probability distribution summing to one for which copy we end up as. Once you do that, if you wake up in a green room in Eliezer's example, you are forced to update to 90% probability that the coin came up heads (provided you distributed your subjective anticipation evenly between all twenty copies in both the head and tail scenarios, which really seems like the only sane thing to do.)

<1> Or, at least, the same amount of "something it is like to be me"-ness as we started with, in some ill-defined sense.

On the other hand, if you do not feel that there is any fact of the matter as to which copy you become, then you just want all your copies to execute whatever strategy is most likely to get all of them the most money from your initial perspective of ignorance of the coinflip.

Incidentally, the optimal strategy looks like a policy selected by updateless decision theory and not like any probability of the coin having been heads or tails. PlaidX (http://lesswrong.com/lw/17c/outlawing_anthropics_an_updateless_dilemma/13d7) beat me to the counter-example for p=50%. Counter-examples like PlaidX's will work for any p<90%, and counter-examples like Eliezer's will work for any p>50%, so that pretty much covers it. So, unless we want to include ugly hacks like responsibility, or unless we let the copies reason Goldenly (using Eliezer's original TDT) about each other's actions as transposed versions of their own actions (which does correctly handle PlaidX's counter-example, but might break in more complicated cases where no isomorphism is apparent), there simply isn't a probability-of-heads that represents the right thing for the copies to do no matter the deal offered to them.
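The "any p>50%" half of this claim can be illustrated with a sketch. It assumes a deal of the same shape as Eliezer's (a gain per green room, a loss per red room, utilities summed over copies) and a hypothetical construction, `counterexample_loss`, that picks a per-red-room loss slightly above the gain. PlaidX's counter-example is not reproduced here.

```python
from fractions import Fraction

def value_with_credence(p_heads, gain, loss):
    """Expected summed payoff of 'Yes' for an agent with credence p_heads."""
    heads_total = 18 * gain - 2 * loss   # heads: 18 green rooms, 2 red rooms
    tails_total = 2 * gain - 18 * loss   # tails: 2 green rooms, 18 red rooms
    return p_heads * heads_total + (1 - p_heads) * tails_total

def counterexample_loss(p_heads, gain=1):
    """Hypothetical construction: a loss just above `gain` that an agent
    with credence p_heads > 1/2 would still accept."""
    epsilon = 8 * (2 * p_heads - 1) / (18 - 16 * p_heads)
    return gain * (1 + epsilon)

results = []
for p in [Fraction(51, 100), Fraction(3, 5), Fraction(3, 4), Fraction(9, 10)]:
    loss = counterexample_loss(p)
    updated = value_with_credence(p, 1, loss)               # looks good at credence p
    ex_ante = value_with_credence(Fraction(1, 2), 1, loss)  # looks bad beforehand
    results.append((p, loss, updated, ex_ante))
    print(float(p), float(loss), float(updated), float(ex_ante))
```

For each credence tried, the deal has positive expected value at that credence and negative expected value at 50%, so an agent acting on any single credence above 50% can be offered a deal it would have wanted to refuse in advance.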

Comment author: timtyler 09 September 2009 09:14:42AM 0 points [-]

Consciousness is really just a name for having a model of yourself which you can reflect on and act on - plus a whole bunch of other confused interpretations which don't really add much.

To do anthropic reasoning you have to have a simple model of yourself which you can reason about.

Machines can do this too, of course, without too much difficulty. That typically makes them conscious, though. Perhaps we can imagine a machine performing anthropic reasoning while dreaming - i.e. when most of its actuators are disabled, and it would not normally be regarded as being conscious. However, then, how would we know about its conclusions?

Comment author: Wei_Dai 09 September 2009 09:39:56AM 11 points [-]

An AI that runs UDT wouldn't conclude that it was a Boltzmann or non-Boltzmann brain. For such an AI, the statement has no meaning, since it's always both. The closest equivalent would be "Most of the value I can create by making the right decision is concentrated in the vicinity of non-Boltzmann brains."

BTW, does my indexical uncertainty and the Axiom of Independence post make any more sense now?

Comment author: CarlShulman 09 September 2009 06:26:37PM 3 points [-]

This was my take after going through a similar analysis (with apples, not paperclips) at the SIAI summer intern program.

Comment author: Wei_Dai 09 September 2009 07:39:02PM 2 points [-]

It seems promising that several people are converging on the same "updateless" idea. But sometimes I wonder why it took so long, if it's really the right idea, given the amount of brainpower spent on this issue. (Take a look at http://www.anthropic-principle.com/profiles.html and consider that Nick Bostrom wrote "Investigations into the Doomsday Argument" in 1996 and then did his whole Ph.D. on anthropic reasoning, culminating in a book published in 2002.)

BTW, weren't the SIAI summer interns supposed to try to write one LessWrong post a week (or was it a month)? What happened to that plan?

Comment author: Eliezer_Yudkowsky 09 September 2009 08:46:37PM 4 points [-]

> But sometimes I wonder why it took so long, if it's really the right idea, given the amount of brainpower spent on this issue.

People are crazy, the world is mad. Also inventing basic math is a hell of a lot harder than reading it in a textbook afterward.

Comment author: Wei_Dai 09 September 2009 09:59:29PM 2 points [-]

> People are crazy, the world is mad.

I suppose you're referring to the fact that we are "designed" by evolution. But why did evolution create a species that invented the number field sieve (to give a random piece of non-basic math) before UDT? It doesn't make any sense.

> Also inventing basic math is a hell of a lot harder than reading it in a textbook afterward.

In what sense is it "hard"? I don't think it's hard in a computational sense, like NP-hard. Or is it? I guess it goes back to the question of "what algorithm are we using to solve these types of problems?"

Comment author: Eliezer_Yudkowsky 10 September 2009 08:14:48AM 3 points [-]

No, I'm referring to the fact that people are crazy and the world is mad. You don't need to reach so hard for an explanation of why no one's invented UDT yet when many-worlds wasn't invented for thirty years.

Comment author: CarlShulman 16 September 2009 02:57:56AM 4 points [-]

I also don't think general madness is enough of an explanation. Both are counterintuitive ideas in areas without well-established methods to verify progress, e.g. building a working machine or standard mathematical proof techniques.

Comment author: CarlShulman 09 September 2009 08:20:24PM *  1 point [-]

The OB/LW/SL4/TOElist/polymathlist group is one intellectual community drawing on similar prior work that hasn't been broadly disseminated.

The same arguments apply with much greater force to the causal decision theory vs evidential decision theory debate.

The interns wound up more focused on their group projects. As it happens, I had told Katja Grace that I was going to write up a post showing the difference between UDT and SIA (using my apples example which is isomorphic with the example above), but in light of this post it seems needless.

Comment author: Vladimir_Nesov 10 September 2009 12:35:28PM 4 points [-]

UDT is basically the bare definition of reflective consistency: it is a non-solution, just a statement of the problem in constructive form. UDT says that you should think exactly the same way as the "original" you thinks, which guarantees that the original you won't be disappointed in your decisions (reflective consistency). It only looks good in comparison to other theories that fail this particular requirement, but otherwise are much more meaningful in their domains of application.

TDT fails reflective consistency in general, but offers a correct solution in a domain that is larger than those of other practically useful decision theories, while retaining their expressivity/efficiency (i.e. updating on graphical models).

Comment author: Wei_Dai 09 September 2009 09:27:46PM 2 points [-]

The OB/LW/SL4/TOElist/polymathlist group is one intellectual community drawing on similar prior work that hasn't been broadly disseminated.

What prior work are you referring to, that hasn't been broadly disseminated?

The same arguments apply with much greater force to the causal decision theory vs evidential decision theory debate.

I think much less brainpower has been spent on CDT vs EDT, since that's thought of as more of a technical issue that only professional decision theorists are interested in. Likewise, Newcomb's problem is usually seen as an intellectual curiosity of little practical use. (At least that's what I thought until I saw Eliezer's posts about the potential link between it and AI cooperation.)

Anthropic reasoning, on the other hand, is widely known and discussed (I remember the Doomsday Argument brought up during a casual lunch-time conversation at Microsoft), and thought to be both interesting in itself and having important applications in physics.

The interns wound up more focused on their group projects.

I miss the articles they would have written. :) Maybe post the topic ideas here and let others have a shot at them?

Comment author: CarlShulman 13 September 2009 04:28:08AM *  1 point [-]

"What prior work are you referring to, that hasn't been broadly disseminated?"

I'm thinking of the corpus of past posts on those lists, which bring certain tools and concepts (Solomonoff Induction, anthropic reasoning, Pearl, etc) jointly to readers' attention. When those tools are combined and focused on the same problem, different forum participants will tend to use them in similar ways.

Comment author: CarlShulman 09 September 2009 11:51:33PM 1 point [-]

You might think that more top-notch economists and game theorists would have addressed Newcomb/TDT/Hofstadter superrationality given their interest in the Prisoner's Dilemma.

Looking at the actual literature on the Doomsday argument, there are some physicists involved (just as some economists and others have tried their hands at Newcomb), but it seems like more philosophers. And anthropics doesn't seem core to professional success, e.g. Tegmark can indulge in it a bit thanks to showing his stuff in 'hard' areas of cosmology.

Comment author: Wei_Dai 10 September 2009 07:15:07AM 2 points [-]

I just realized/remembered that one reason that others haven't found the TDT/UDT solutions to Newcomb/anthropic reasoning may be that they were assuming a fixed human nature, whereas we're assuming an AI capable of self-modification. For example, economists are certainly more interested in answering "What would human beings do in PD?" than "What should AIs do in PD assuming they know each others' source code?" And perhaps some of the anthropic thinkers (in the list I linked to earlier) did invent something like UDT, but then thought "Human beings can never practice this, I need to keep looking."

Comment author: KatjaGrace 14 September 2009 04:18:42AM 0 points [-]

This post is an argument against voting on your updated probability when there is a selection effect such as this. It applies to any evidence (marbles, existence etc), but only in a specific situation, so has little to do with SIA, which is about whether you update on your own existence to begin with in any situation. Do you have arguments against that?

Comment author: CarlShulman 14 September 2009 01:12:27PM 0 points [-]

It's for situations in which different hypotheses all predict that there will be beings subjectively indistinguishable from you, which covers the most interesting anthropic problems in my view. I'll make some posts distinguishing SIA, SSA, UDT, and exploring their relationships when I'm a bit less busy.

Comment author: KatjaGrace 15 September 2009 05:04:26AM 0 points [-]

Are you saying this problem arises in all situations where multiple beings in multiple hypotheses make the same observations? That would suggest we can't update on evidence most of the time. I think I must be misunderstanding you. Subjectively indistinguishable beings arise in virtually all probabilistic reasoning. If there were only one hypothesis with one creature like you, then all would be certain.

The only interesting problem in anthropics I know of is whether to update on your own existence or not. I haven't heard a good argument for not (though I still have a few promising papers to read), so I am very interested if you have one. Will 'exploring their relationships' include this?

Comment author: CarlShulman 15 September 2009 01:32:21PM *  0 points [-]

You can judge for yourself at the time.

Comment author: LauraABJ 12 October 2009 12:17:46AM 4 points [-]

"I've made sacrifices! You don't know what it cost me to climb into that machine every night, not knowing if I'd be the man in the box or in the prestige!"

sorry- couldn't help myself.

Comment author: pjeby 12 October 2009 02:13:30AM *  0 points [-]

"I've made sacrifices! You don't know what it cost me to climb into that machine every night, not knowing if I'd be the man in the box or in the prestige!"

You know, I never could make sense out of that line. If you assume the machine creates "copies" (and that's strongly implied by the story up to that point), then that means every time he gets on stage, he's going to wind up in the box. (And even if the copies are error-free and absolutely interchangeable, one copy will still end up in the box.)

(Edit to add: of course, if you view it from the quantum suicide POV, "he" never ends up in the box, since otherwise "he" would not be there to try again the next night.)

Comment author: Yvain 10 September 2009 05:02:50PM *  4 points [-]

More thinking out loud:

It really is in your best interest to accept the offer after you're in a green room. It really is in your best interest to accept the offer conditional on being in a green room before you're assigned. Maybe part of the problem arises because you think your decision will influence the decision of others, ie because you're acting like a timeless decision agent. Replace "me" with "anyone with my platonic computation", and "I should accept the offer conditional on being in a green room" with "anyone with my platonic computation should accept the offer, conditional on anyone with my platonic computation being in a green room." But the chances of someone with my platonic computation being in a green room is 100%. Or, to put it another way, the Platonic Computation is wondering "Should I accept the offer conditional on any one of my instantiations being in a green room?". But the Platonic Computation knows that at least one of its instantiations will be in a green room, so it declines the offer. If the Platonic Computation was really a single organism, its best option would be to single out one of its instantiations before-hand and decide "I will accept the offer, given that Instantiation 6 is in a green room" - but since most instantiations of the computation can't know the status of Instantiation 6 when they decide, it doesn't have this option.

Comment author: byrnema 10 September 2009 05:44:35PM *  1 point [-]

Yes, exactly.

If you are in a green room and someone asks you if you will bet that a head was flipped, you should say "yes".

However, if that same person asks you if they should bet that heads was flipped, you should answer no if you ascertain that they asked you on the precondition that you were in a green room.

  • the probability of heads | you are in a green room = 90%

  • the probability of you betting on heads | you are in a green room = 100% = no information about the coin flip

Comment author: Jonathan_Lee 10 September 2009 10:14:58PM 0 points [-]

Your first claim needs qualifications: You should only bet if you're being drawn randomly from everyone. If it is known that one random person in a green room will be asked to bet, then if you wake up in a green room and are asked to bet you should refuse.

P(Heads | you are in a green room) = 0.9

P(Being asked | Heads and Green) = 1/18

P(Being asked | Tails and Green) = 1/2

Hence P(Heads | you are asked in a green room) = 0.5
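The two-step update here can be checked mechanically. What follows is only a minimal sketch, assuming a fair prior on the coin and that exactly one green-room occupant is picked uniformly at random to be asked; the helper name `posterior_heads` is made up for illustration.

```python
from fractions import Fraction

def posterior_heads(prior_heads, like_heads, like_tails):
    """Bayes update: P(Heads | evidence) from a prior and two likelihoods."""
    num = prior_heads * like_heads
    return num / (num + (1 - prior_heads) * like_tails)

# Step 1: waking in a green room (18 of 20 rooms if heads, 2 of 20 if tails).
p_green = posterior_heads(Fraction(1, 2), Fraction(18, 20), Fraction(2, 20))

# Step 2: being the single green-room occupant who is asked
# (chance 1/18 if heads, 1/2 if tails).
p_asked = posterior_heads(p_green, Fraction(1, 18), Fraction(1, 2))

print(p_green, p_asked)  # 9/10 1/2
```

The second update has a likelihood ratio of 1:9, which undoes the 9:1 odds from the first.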

Of course the OP doesn't choose a random individual to ask, or even a random individual in a green room. The OP asks all people in green rooms in this world.

If there is confusion about when your decision algorithm "chooses", then TDT/UDT can try to make the latter two cases equivalent, by thinking about the "other choices I force". Of course the fact that this asserts some variety of choice for a special individual and not for others, when the situation is symmetric, suggests something is being missed.

What is being missed, to my mind, is a distinction between the distribution of (random individuals | data is observed), and the distribution of (random worlds | data is observed).

In the OP, the latter distribution isn't altered by the update as the observed data occurs somewhere with probability 1 in both cases. The former is because it cares about the number of copies in the two cases.

Comment author: Jonathan_Lee 10 September 2009 03:34:02AM *  4 points [-]

I've been watching for a while, but have never commented, so this may be horribly flawed, opaque or otherwise unhelpful.

I think the problem is entirely caused by the use of the wrong sets of belief, and that anything holding to Eliezer's 1-line summary of TDT or alternatively UDT should get this right.

Suppose that you're a rational agent. Since you are instantiated in multiple identical circumstances (green rooms) and asked identical questions, your answers should be identical. Hence if you wake up in a green room and you're asked to steal from the red rooms and give to the green rooms, you either commit a group of 2 of you to a loss of 52 or commit a group of 18 of you to a gain of 12.

This committal is what you wish to optimise over from TDT/UDT, and clearly this requires knowledge about the likelihood of different decision making groups. The distribution of sizes of random groups is not the same as the distribution of sizes of groups that a random individual is in. The probabilities of being in a group are upweighted by the size of the group and normalised. This is why Bostrom's suggested 1/n split of responsibility works; it reverses the belief about where a random individual is in a set of decision making groups to a belief about the size of a random decision making group.

By the construction of the problem the probability that a random (group of all the people in green rooms) has size 18 is 0.5, and similarly for 2 the probability is 0.5. Hence the expected utility is (0.5 × 12) + (0.5 × -52) = -20.

If you're asked to accept a bet on there being 18 people in green rooms, and you're told that only you're being offered it, then the decision commits exactly one instance of you to a specific loss or gain, regardless of the group you're in. Hence you can't do better than the 0.9 and 0.1 beliefs.

If you're told that the bet is being offered to everyone in a green room, then you are committing to n times the outcome in any group of n people. In this case gains are conditional on group size, and so you have to use the 0.5-0.5 belief about the distribution of groups. It doesn't matter because the larger groups have the larger multiplier and thus shutting up and multiplying yields the same answers as a single-shot bet.

ETA: At some level this is just choosing an optimal output for your calculation of what to do, given that the result is used variably widely.

Comment author: Christian_Szegedy 10 September 2009 03:56:24AM *  1 point [-]

This committal is what you wish to optimise over from TDT/UDT, and clearly this requires knowledge about the likelihood of different decision making groups.

I was influenced by the OP and used to think that way. However, I now think that this is not the root problem.

What if the agents get more complicated decision problems: for example, rewards depending on the parity of the agents voting certain way, etc.?

I think what is essential is that the agents have to think globally (categorical imperative, hmmm?)

Practically: if the agent recognizes that there is a collective decision, then it should model all conceivable protocols (making sure a priori that all cooperating agents perform the same or a compatible analysis, if they can't communicate) and then choose the protocol with the best overall total gain. In the case of the OP: the second calculation in the OP. (Not messing around with correction factors based on responsibilities, etc.)

Special considerations based on group sizes etc. may be incidentally correct in certain situations, but this is just not general enough. The crux is that the ultimate test is simply the expected value computation for the protocol of the whole group.

Comment author: Jonathan_Lee 10 September 2009 11:52:00AM 1 point [-]

Between non communicating copies of your decision algorithm, it's forced that every instance comes to the same answers/distributions to all questions, as otherwise Eliezer can make money betting between different instances of the algorithm. It's not really a categorical imperative, beyond demanding consistency.

The crux of the OP is asking for a probability assessment of the world, not whether the DT functions.

I'm not postulating 1/n allocation of responsibility; I'm stating that the source of the confusion is conflating

P(A random individual is in a world of class Ai | Data)

with

P(A random world is of class Ai | Data)

and that these are not equal if the number of individuals with access to Data differs between classes of world.

Hence in this case, there are 2 classes of world, A1 with 18 Green rooms and 2 Reds, and A2 with 2 Green rooms and 18 Reds.

P(Random individual is in the A1 class | Woke up in a green room) = 0.9 by anthropic update.

P(Random world is in the A1 class | Some individual woke up in a green room) = 0.5

Why? Because in A1 18/20 individuals fit the description "Woke up in a green room", but in A2 only 2/20 do.
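To make the distinction between the two distributions concrete, here is a small sketch. It assumes a 50/50 prior over the two world classes and 20 rooms per world; the dictionary layout and function names are illustrative only.

```python
from fractions import Fraction

TOTAL_ROOMS = 20
# Hypothetical encoding: world class -> (prior, number of green rooms).
worlds = {"A1": (Fraction(1, 2), 18), "A2": (Fraction(1, 2), 2)}

def individual_posterior(worlds):
    """P(a random individual is in class w | that individual woke in a green room)."""
    weights = {w: prior * Fraction(green, TOTAL_ROOMS)
               for w, (prior, green) in worlds.items()}
    z = sum(weights.values())
    return {w: v / z for w, v in weights.items()}

def world_posterior(worlds):
    """P(a random world is of class w | some individual in it woke in a green room)."""
    weights = {w: prior * (1 if green > 0 else 0)
               for w, (prior, green) in worlds.items()}
    z = sum(weights.values())
    return {w: v / z for w, v in weights.items()}

print(individual_posterior(worlds)["A1"])  # 9/10
print(world_posterior(worlds)["A1"])       # 1/2
```

The first function weights each class by how many individuals see the data; the second only asks whether anyone in that class sees it, which happens with probability 1 in both.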

The crux of the OP is that neither a 90/10 nor 50/50 split seem acceptable, if betting on "Which world-class an individual in a Green room is in" and "Which world-class the (set of all individuals in Green rooms which contains this individual) is in" are identical. I assert that they are not. The first case is 0.9/0.1 A1/A2, the second is 0.5/0.5 A1/A2.

Consider a similar question where a random Green room will be asked. If you're in that room, you update both on (Green walls) and (I'm being asked) and recover the 0.5/0.5, correctly. This is close to the OP as if we wildly assert that you and only you have free will and force the others, then you are special. Equally in cases where everyone is asked and plays separately, you have 18 or 2 times the benefits depending on whether you're in A1 or A2.

If each individual Green room played separately, then you update on (Green walls), but P(I'm being asked|Green) = 1 in either case. This is betting on whether there are 18 people in green rooms or 2, and you get the correct 0.9/0.1 split. To reproduce the OP the offers would need to be +1/18 to Greens and -3/18 from Reds in A1, and +1/2 to Greens and -3/2 from Reds in A2, and then you'd refuse to play, correctly.

Comment deleted 10 September 2009 09:11:55AM *  [-]
Comment author: Christian_Szegedy 10 September 2009 06:00:51PM 0 points [-]

It's not about complexity, it is just expected total gain. Simply the second calculation of the OP.

I just argued, that the second calculation is right and that is what the agents should do in general. (unless they are completely egoistic for their special copies)

Comment author: CarlShulman 10 September 2009 06:16:19AM 0 points [-]

"Hence if you wake up in a green room and you're asked to steal from the red rooms and give to the green rooms, you either commit a group of 2 of you to a loss of 52 or commit a group of 18 of you to a gain of 12."

In the example you care equally about the red room and green room dwellers.

Comment author: Jonathan_Lee 10 September 2009 10:38:17AM 0 points [-]

Hence if there are 2 instances of your decision algorithm in Green rooms, there are 2 runs of your decision algorithm, and if they vote to steal there is a loss of 3 from each red and a gain of 1 for each green, for a total gain of 1×2 - 3×18 = -52.

If there are 18 instances in Green rooms, there are 18 runs of your decision algorithm, and if they vote to steal there is a loss of 3 from each red and a gain of 1 for each green, for a total gain of 1×18 - 2×3 = 12.

The "committal of a group of" is noting that there are 2 or 18 runs of your decision algorithm that are logically forced by the decision made by this specific instance of the decision algorithm in a green room.
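The arithmetic for the two groups can be written out as a short sketch. The function name and parameters are made up for illustration; the payoffs ($1 per green room, $3 per red room) are the ones from the OP, and each world class is assumed equally likely.

```python
def total_gain(greens, reds, pay_green=1, take_red=3):
    """Net change over all copies if the green-room group votes 'Yes'."""
    return pay_green * greens - take_red * reds

gain_if_18_green = total_gain(greens=18, reds=2)   # 12
gain_if_2_green = total_gain(greens=2, reds=18)    # -52

# Each group size is equally likely for a random world.
expected_gain = 0.5 * gain_if_18_green + 0.5 * gain_if_2_green

print(gain_if_18_green, gain_if_2_green, expected_gain)  # 12 -52 -20.0
```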

Comment author: Yvain 09 September 2009 08:07:55PM *  4 points [-]

Curses on this problem; I spent the whole day worrying about it, and am now so much of a wreck that the following may or may not make sense. For better or worse, I came to a conclusion similar to Psy-Kosh's: that this could work in less anthropic problems. Here's the equivalent I was using:

Imagine Omega has a coin biased so that it comes up the same way nine out of ten times. You know this, but you don't know which way it's biased. Omega allows you to flip the coin once, and asks for your probability that it's biased in favor of heads. The coin comes up heads. You give your probability as 9/10.

Now Omega takes 20 people and puts them in the same situation as in the original problem. It lets each of them flip their coins. Then it goes to each of the people who got tails, and offers $1 to charity for each coin that came up tails, but threatens to steal $3 from charity for each coin that came up heads.

This nonanthropic problem works the same way as the original anthropic problem. If the coin is really biased heads, 18 people will get heads and 2 people will get tails. In this case, the correct subjective probability to assign is definitely 9/10 in favor of whatever result you got; after all, this is the correct probability when you're the only person in the experiment, and just knowing that 19 other people are also participating in the experiment shouldn't change matters.

I don't have a formal answer for why this happens, but I can think of one more example that might throw a little light on it. In another thread, someone mentioned that lottery winners have excellent evidence that they are brains-in-a-vat and that the rest of the world is an illusion being put on by the Dark Lord of the Matrix for their entertainment. After all, if this was true, it wouldn't be too unlikely for them to win the lottery, so for a sufficiently large lottery, the chance of winning it this way exceeds the chance of winning it through luck.

Suppose Bob has won the lottery and so believes himself to be a brain in a vat. And suppose that the evidence for the simulation argument is poor enough that there is no other good reason to believe yourself to be a brain in a vat. Omega goes up to Bob and asks him to take a bet on whether he is a brain in a vat. Bob says he is, he loses, and Omega laughs at him. What did he do wrong? Nothing. Omega was just being mean by specifically asking the one person whom ve knew would get the answer wrong.

Omega's little prank would still work if ve announced ver intention to perform it beforehand. Ve would say "When one of you wins the lottery, I will be asking this person to take a bet whether they are a brain in a vat or not!" Everyone would say "That lottery winner shouldn't accept Omega's bet. We know we're not brains in vats." Then someone wins the lottery, Omega asks if they're a brain in a vat, and they say yes, and Omega laughs at them (note that this also works if we consider a coin with a bias such that it lands the same way 999999 out of a million times, let a million people flip it once, and ask people what they think the coin's bias is, asking the people who get the counter-to-expectations result more often than chance.)

Omega's being equally mean in the original problem. There's a 50% chance ve will go and ask the two out of twenty people who are specifically most likely to be wrong and can't do anything about it. The best course I can think of would be for everyone to swear an oath not to take the offer before they got assigned into rooms.

Comment author: Eliezer_Yudkowsky 09 September 2009 08:53:54PM 1 point [-]

Then someone wins the lottery, Omega asks if they're a brain in a vat, and they say yes, and Omega laughs at them

By assumption, if the person is right to believe they're in a sim, then most of the lottery winners are in sims, so while Omega laughs at them in our world, they win the bet with Omega in most of their worlds.

wrong and can't do anything about it

should have been your clue to check further.

Comment author: Yvain 10 September 2009 02:30:44PM *  1 point [-]

This is a feature of the original problem, isn't it?

Let's say there are 1000 brains in vats, each in their own little world, and a "real" world of a billion people. The chance of a vat-brain winning the lottery is 1, and the chance of a real person winning the lottery is 1 in a million. There are 1000 real lottery winners and 1000 vat lottery winners, so if you win the lottery your chance of being in a vat is 50-50. However, if you look at any particular world, the chances of this week's single lottery winner being a brain in a vat is 1000/1001.

Assume the original problem is run multiple times in multiple worlds, and that the value of pi somehow differs in those worlds (probably you used pi precisely so people couldn't do this, but bear with me). Of all the people who wake up in green rooms, 18/20 of them will be right to take your bet. However, in each particular world, the chances of the green room people being right to take the bet is 1/2.

In this situation there is no paradox. Most of the people in the green rooms come out happy that they took the bet. It's only when you limit it to one universe that it becomes a problem. The same is true of the lottery example. When restricted to a single (real, non-vat) universe, it becomes more troublesome.

Comment author: Eliezer_Yudkowsky 09 September 2009 08:52:26PM 1 point [-]

Now Omega takes 20 people and puts them in the same situation as in the original problem. It lets each of them flip their coins. Then it goes to each of the people who got tails, and offers $1 to charity for each coin that came up tails, but threatens to steal $3 from charity for each coin that came up heads.

It's worth noting that if everyone got to make this choice separately - Omega doing it once for each person who responds - then it would indeed be wise for everyone to take the bet! This is evidence in favor of either Bostrom's division-of-responsibility principle, or byrnema's pointer-based viewpoint, if indeed those two views are nonequivalent.

Comment author: byrnema 09 September 2009 11:59:02PM *  0 points [-]

Bostrom's calculation is correct, but I believe it is an example of multiplying by the right coefficients for the wrong reasons.

I did exactly the same thing -- multiplied by the right coefficients for the wrong reasons -- in my deleted comment. I realized that the justification of these coefficients required a quite different problem (in my case, I modeled that all the green roomers decided to evenly divide the spoils of the whole group) and the only reason it worked was because multiplying the first term by 1/18 and the next term by 1/2 meant you were effectively canceling away the factors that represented your initial 90% posterior, and thus ultimately just applying the 50/50 probability of the non-anthropic solution.

Anthropic calculation:

18/20(12)+2/20(-52) = 5.6

Bostrom-modified calculation for responsibility per person:

[18/20(12)/18+2/20(-52)/2] / 2 = -1

Non-anthropic calculation for EV per person:

[1/2(12)+1/2(-52)] /20 = -1
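The three calculations can be reproduced with exact fractions. This is only a sketch, taking the group payoffs 12 and -52 as given; the variable names are illustrative. The last two lines show the cancellation described above: the 1/18 and 1/2 factors turn the 18/20 and 2/20 weights into equal weights.

```python
from fractions import Fraction as F

GAIN_HEADS, GAIN_TAILS = 12, -52  # group payoffs if the offer is accepted

# Anthropic calculation: weight the group payoffs by the 90/10 posterior.
anthropic = F(18, 20) * GAIN_HEADS + F(2, 20) * GAIN_TAILS

# Bostrom-modified: divide each payoff by the number of deciders, then by 2.
bostrom = (F(18, 20) * GAIN_HEADS / 18 + F(2, 20) * GAIN_TAILS / 2) / 2

# Non-anthropic: 50/50 over worlds, spread over all 20 people.
non_anthropic = (F(1, 2) * GAIN_HEADS + F(1, 2) * GAIN_TAILS) / 20

# The responsibility factors cancel the anthropic weights exactly.
weight_heads = F(18, 20) * F(1, 18)
weight_tails = F(2, 20) * F(1, 2)

print(anthropic, bostrom, non_anthropic)  # 28/5 -1 -1
print(weight_heads == weight_tails)       # True
```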

My pointer-based viewpoint, in contrast, is not a calculation but a rationale for why you must use the 50/50 probability rather than the 90/10 one. The argument is that each green roomer cannot use the information that they were in a green room because this information was preselected (a biased sample). With effectively no information about what color room they're in, each green roomer must resort to the non-anthropic calculation that the probability of flipping heads is 50%.

Comment author: Christian_Szegedy 10 September 2009 02:52:39AM *  1 point [-]

I can very much relate to Eliezer's original gut reaction: I agree that Nick's calculation is very ad hoc and hardly justifiable.

However, I also think that, although you are right about the pointer bias, your explanation is still incomplete.

I think Psy-Kosh made an important step with his reformulation. Especially eliminating the copy procedure for the agents was essential. If you follow through the math from the point of view of one of the agents, the nature of the problem becomes clear:

Trying to write down the payoff matrix from the viewpoint of one of the agents, it becomes clear that you can't fill out any of the reward entries, since the outcome never depends on that agent's decision alone. If he got a green marble, it still depends on the other agents' decisions, and if he drew a red one, it will depend only on the other agents' decisions.

This makes it completely clear that the only solution is for the agents to agree on a predetermined protocol, and therefore the second calculation of the OP is the only correct one so far.

However, this protocol does not imply anything about P(head|being in green room). It is simply irrelevant for the expected value of any agreed-upon protocol. One could create a protocol that depends on P(head|being in a green room) for some of the agents, but you would have to analyze the expected value of the protocol from a global point of view, not just from the point of view of the agent, for you can't complete the decision matrix if the outcome depends on other agents' decisions as well.

Of course a predetermined protocol does not mean that the agents must explicitly agree on a narrow protocol before the action. If we assume that the agents get all the information once they find themselves in the room, they could still create a mental model of the whole global situation and base their decision on the second calculation of the OP.

Comment author: byrnema 10 September 2009 04:39:19AM *  1 point [-]

I agree with you that the reason why you can't use the 90/10 prior is because the decision never depends on a person in a red room.

In Eliezer's description of the problem above, he tells each green roomer that he asks all the green roomers if they want him to go ahead with a money distribution scheme, and they must be unanimous or there is a penalty.

I think this is a nice pedagogical component that helps a person understand the dilemma, but I would like to emphasize here (even if you're aware of it) that it is completely superfluous to the mechanics of the problem. It doesn't make any difference if Eliezer bases his action on the answer of one green roomer or all of them.

For one thing, all green roomer answers will be unanimous because they all have the same information and are asked the same complicated question.

And, more to the point, even if just one green roomer is asked, the dilemma still exists that he can't use his prior that heads was probably flipped.

Comment author: Christian_Szegedy 10 September 2009 04:42:23AM *  2 points [-]

Agreed 100%.

[EDIT:] Although I would be a bit more general: regardless of red rooms, if you have several actors, even if they necessarily make the same decision, they have to analyze the global picture. The only situation in which the agent should be allowed to make the simplified subjective Bayesian decision table analysis is if he is the only actor (no copies, etc.). It is easy to construct simple decision problems without "red rooms", where each of the actors has some control over the outcome and none of them can make the analysis for itself only but has to build a model of the whole situation to make the globally optimal decision.

However, I did not imply in any way that the penalty matters (at least, as long as the agents are sane and don't start to flip non-logical coins). The global analysis of the payoff may clearly disregard the penalty case if it's impossible for that specific protocol. The only requirement is that the expected value calculation must be made on a protocol-by-protocol basis.

Comment author: AllanCrossman 08 September 2009 08:11:21PM *  4 points [-]

I think I'm with Bostrom.

The problem seems to come about because the good effects of 18 people being correct are more than wiped out by the bad effects of 2 people being wrong.

I'm sure this imbalance in the power of the agents has something to do with it.

Comment author: JGWeissman 09 September 2009 04:13:44AM 1 point [-]

What if, instead of requiring agreement of all copies in a green room, one copy in a green room was chosen at random to make the choice?

Comment author: JGWeissman 09 September 2009 05:00:09AM 5 points [-]

In this case the chosen copy in the green room should update on the anthropic evidence of being chosen to make the choice. That copy had a 1/18 probability of being chosen if the coin flip came up heads, and a 1/2 probability of being chosen if the coin flip came up tails, so the odds of heads:tails should be updated from 9:1 to 1:1. This exactly cancels the anthropic evidence of being in a green room.

Comment deleted 09 September 2009 04:29:25AM *  [-]
Comment author: JGWeissman 09 September 2009 04:52:39AM 1 point [-]

They are not equivalent. If one green room copy is chosen at random, then the game will be played exactly once whether the coin flip resulted in heads or tails. But if every green room copy plays, then the game will be played 18 times if the coin came up heads and 2 times if the coin came up tails.
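A rough sketch of why the number of plays matters. It assumes (this is an interpretation, not something stated above) that each separate play applies the original $1/$3 transfer to all twenty rooms, so one accepted game nets 12 on heads and -52 on tails; the names are illustrative.

```python
from fractions import Fraction

HALF = Fraction(1, 2)
GAIN = {"heads": 12, "tails": -52}    # net payoff of one accepted game
GREEN_ROOMS = {"heads": 18, "tails": 2}

def expected_total(plays_per_outcome):
    """Expected total payoff given how many games are played under each outcome."""
    return sum(HALF * plays_per_outcome[c] * GAIN[c] for c in GAIN)

# One randomly chosen green copy decides: one game either way.
one_chosen = expected_total({"heads": 1, "tails": 1})

# Every green copy plays its own game: 18 games on heads, 2 on tails.
every_green_plays = expected_total(GREEN_ROOMS)

print(one_chosen, every_green_plays)  # -20 56
```

Under this assumption, accepting loses on average when the game is played once, and wins on average when it is played once per green copy.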

Comment author: timtyler 09 September 2009 08:58:05AM 4 points [-]

You can't reject the conclusion that you are a Boltzmann brain - but if you are, it doesn't matter what you do, so the idea doesn't seem to have much impact on decision theory.

Comment author: nshepperd 18 November 2012 04:02:19AM *  2 points [-]

Huh. Reading this again, together with byrnema's pointer discussion and Psy-Kosh's non-anthropic reformulation...

It seems like the problem is that whether each person gets to make a decision depends on the evidence they think they have, in such a way to make that evidence meaningless. To construct an extreme example: The Antecedent Mugger gathers a billion people in a room together, and says:

"I challenge you to a game of wits! In this jar is a variable amount of coins, between $0 and $10,000. I will allow each of you to weigh the jar using this set of extremely imprecise scales. Then I will ask each of you whether to accept my offer: to as a group buy the jar off me for $5000, the money to be distributed equally among you. Note: although I will ask all of you, the only response I will consider is the one given by the person with the greatest subjective expected utility from saying 'yes'."

In this case, even if the jar always contains $0, there will always be someone who receives enough information from the scales to think the jar contains >$5000 with high probability, and therefore to say yes. Since that person's response is the one that is taken for the whole group, the group always pays out $5000, resulting in a money pump in favour of the Mugger.

The problem is that, from an outside perspective, the observations of the one who gets to make the choice are almost completely uncorrelated from the actual contents of the jar, due to the Mugger's selection process. For any general strategy Observations → Response, the Mugger can always summon enough people to find someone who has seen the observations that will produce the response he wants, unless the strategy is a constant function.
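The selection effect can be seen in a small simulation. This is only a sketch under assumptions that are not in the original setup: the scales add Gaussian noise, each person naively says "yes" whenever their own reading exceeds the price, and the Mugger's "greatest subjective expected utility" rule is approximated by picking the highest reading. The function name `mugger_round` and all of the numbers (1000 people rather than a billion, a noise level of $3000) are hypothetical.

```python
import random

def mugger_round(true_value, n_people, noise_sd, price, rng):
    """One round: everyone weighs the jar; the most optimistic reading decides."""
    # Assumption: each imprecise scale reports the true value plus Gaussian noise.
    estimates = [true_value + rng.gauss(0, noise_sd) for _ in range(n_people)]
    # Assumption: the person with the highest reading has the greatest
    # subjective expected utility from saying "yes", so the Mugger uses them.
    deciding_estimate = max(estimates)
    # Naive strategy: accept whenever your own reading exceeds the price.
    group_buys = deciding_estimate > price
    payoff = (true_value - price) if group_buys else 0
    return deciding_estimate, group_buys, payoff

rng = random.Random(0)  # fixed seed so the run is repeatable
deciding_estimate, group_buys, payoff = mugger_round(
    true_value=0, n_people=1000, noise_sd=3000, price=5000, rng=rng
)
print(round(deciding_estimate), group_buys, payoff)
```

Even with an empty jar, the deciding reading lands far above the price, so the naive group pays. A constant strategy ("always say no") is unaffected by how many people the Mugger summons.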

Similarly, in the problem with the marbles, only the people with the observation Green get any influence, so the observations of "people who get to make a decision" are uncorrelated with the actual contents of the buckets (even though observations of the participants in general are correlated with the buckets).

Comment author: Kindly 18 November 2012 04:28:37AM 2 points [-]

The problem here is that your billion people are for some reason giving the answer most likely to be correct rather than the answer most likely to actually be profitable. If they were a little more savvy, they could reason as follows:

"The scales tell me that there's $6000 worth of coins in the jar, so it seems like a good idea to buy the jar. However, if I did not receive the largest weight estimate from the scales, my decision is irrelevant; and if I did receive the largest weight estimate, then conditioned on that it seems overwhelmingly likely that there are many fewer coins in the jar than I'd think based on that estimate -- and in that case, I ought to say no."

Comment author: nshepperd 18 November 2012 05:34:55AM 0 points [-]

Ooh, and we can apply similar reasoning to the marble problem if we change it, in a seemingly isomorphic way, so that instead of making the trade based on all the responses of the people who saw a green marble, Psy-Kosh selects one of the green-marble-observers at random and considers that person's response (this should make no difference to the outcomes, assuming that the green-marblers can't give different responses due to no-spontaneous-symmetry-breaking and all that).

Then, conditioning on drawing a green marble, person A infers a 9/10 probability that the bucket contained 18 green and 2 red marbles. However, if the bucket contains 18 green marbles, person A has a 1/18 chance of being randomly selected given that she drew a green marble, whereas if the bucket contains 2 green marbles, she has a 1/2 chance of being selected. So, conditioning on her response being the one that matters as well as the green marble itself, she infers a (9:1) * (1/18)/(1/2) = (9:9) odds ratio, that is probability 1/2 the bucket contains 18 green marbles.
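The two updates can be checked with exact fractions. This is a minimal sketch; the variable names are illustrative, and the 1:1 prior is the fair logical coin from the original problem.

```python
from fractions import Fraction

def odds_to_probability(odds):
    """Convert odds in favour (as a single ratio) to a probability."""
    return odds / (1 + odds)

# Prior odds that the bucket is the 18-green one, before seeing anything.
prior_odds = Fraction(1, 1)

# Update 1: person A drew a green marble.
# P(green | 18 green of 20) / P(green | 2 green of 20) = 9
green_ratio = Fraction(18, 20) / Fraction(2, 20)
odds_after_green = prior_odds * green_ratio

# Update 2: A's response is the one randomly selected to count.
# P(selected | 18 green observers) / P(selected | 2 green observers) = 1/9
selection_ratio = Fraction(1, 18) / Fraction(1, 2)
odds_after_selection = odds_after_green * selection_ratio

print(odds_to_probability(odds_after_green))      # 9/10
print(odds_to_probability(odds_after_selection))  # 1/2
```

The second likelihood ratio is exactly the reciprocal of the first, which is why the selection step cancels the green-marble evidence.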

Which leaves us back at a kind of anthropic updating, except that this time it resolves the problem instead of introducing it!

Comment author: tim 10 September 2009 12:06:15AM *  2 points [-]

isn't this a problem with the frequency you are presented with the opportunity to take the wager? [no, see edit]

the equation: (50% * ((18 * +$1) + (2 * -$3))) + (50% * ((18 * -$3) + (2 * +$1))) = -$20 neglects to take into account that you will be offered this wager nine times more often in conditions where you win than when you lose.

for example, the wager: "i will flip a fair coin and pay you $1 when it is heads and pay you -$2 when it is tails" is -EV in nature. however if a conditional is added where you will be asked if you want to take the bet 90% of the time given the coin is heads (10% of the time you are 'in a red room') and 10% of the time given the coin is tails (90% of the time you are 'in a red room'), your EV changes from (.5)(1) + (.5)(-2) = -.5 to (.5)(.9)($1) + (.5)(.1)(-$2) = $.35 representing the shift from "odds the coin comes up heads" to "odds the coin comes up heads and i am asked if i want to take the bet"

it seems like the same principle would apply to the green room scenario and your pre-copied self would have to conclude that though the two outcomes are +$12 or -$52, they do not occur with 50-50 frequency and given you are offered the bet, you have a 90% chance of winning. (.9)($12) + (.1)(-$52) = $5.6

EDIT: okay, after thinking about it, i am wrong. the reason i was having trouble with this was the fact that when the coin comes up tails and 90% of the time i am in a red room, even though "i" am not being specifically asked to wager, my two copies in the green rooms are - and they are making the wrong choice because of my precommitment to taking the wager given i am in a green room. this makes my final EV calculation wrong as it ignores trials where "i" appear in a red room even though the wager still takes place.

it's interesting that this paradox exists because of entities other than yourself (copies of you, paperclip maximizers, etc) making the "incorrect" choice the 90% of the time you are stuck in a red room with no say.
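the two calculations above can be reproduced with exact arithmetic. this is a sketch only: the function and variable names are made up for illustration, and the payoffs are the +$1 per green room and -$3 per red room from the post.

```python
from fractions import Fraction

PAY_GREEN = 1   # dollars paid to each green-room copy
TAKE_RED = -3   # dollars taken from each red-room copy

def group_payoff(n_green, n_red):
    """Total payoff across all 20 copies when the green rooms say yes."""
    return n_green * PAY_GREEN + n_red * TAKE_RED

heads_total = group_payoff(18, 2)   # +12
tails_total = group_payoff(2, 18)   # -52

# Policy-level view: the wager happens on every coinflip, so weight by 50/50.
ev_policy = Fraction(1, 2) * heads_total + Fraction(1, 2) * tails_total

# Naive view from inside a green room: weight by the 90% posterior.
ev_green_room = Fraction(9, 10) * heads_total + Fraction(1, 10) * tails_total

print(ev_policy)      # -20
print(ev_green_room)  # 28/5, i.e. $5.60
```

the sign flips between the two views because the second one leaves out the trials where "i" am in a red room but the wager still takes place.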

Comment author: tim 11 September 2009 08:28:48PM *  0 points [-]

some other thoughts. the paradox exists because you cannot precommit yourself to taking the wager given you are in a green room as this commits you to taking the wager on 100% of coinflips which is terrible for you.

when you find yourself in a green room, the right play IS to take the wager. however, you can't make the right play without committing yourself to making the wrong play in every universe where the coin comes up tails. you are basically screwing your parallel selves over because half of them exist in a 'tails' reality. it seems like factoring in your parallel expectation cancels out the ev shift of adjusting your prior (50%) probability to 90%.

and if you don't care about your parallel selves, you can just think of them as the components that average to your true expectation in any given situation. if the overall effect across all possible universes was negative, it was a bad play even if it helped you in this universe. metaphysical hindsight.

Comment author: Psy-Kosh 08 September 2009 07:58:33PM 2 points [-]

I think I'll have to sit and reread this a couple times, but my INITIAL thought is "Isn't the apparent inconsistency here qualitatively similar to the situation with a counterfactual mugging?"

Comment author: Nisan 08 September 2009 09:25:36PM 2 points [-]

This is my reaction too. This is a decision involving Omega in which the right thing to do is not update based on new information. In decisions not involving Omega, you do want to update. It doesn't matter whether the new information is of an anthropic nature or not.

Comment author: Psy-Kosh 08 September 2009 09:33:03PM 0 points [-]

Yeah, thought about it a bit more, and still seems to be more akin to "paradox of counterfactual mugging" than "paradox of anthropic reasoning"

To me, confusing bits of anthropic reasoning would more come into play via stuff like "aumann agreement theorem vs anthropic reasoning"

Comment author: Jack 08 September 2009 07:56:00PM 2 points [-]

If the many worlds interpretation of quantum mechanics is true, isn't anthropic reasoning involved in making predictions about the future of quantum systems? There exists some world in which, from the moment this comment is posted onward, all attempts to detect quantum indeterminacy fail, all two-slit experiments yield two distinct lines instead of a wave pattern etc. Without anthropic reasoning we have no reason to find this result at all surprising. So either we need to reject anthropic reasoning or we need to reject the predictive value of quantum mechanics under the many worlds interpretation. Right?

(Apologies if this has been covered, I'm playing catch-up and just trying to hash things out for myself. Also should I expect to be declared a prophet in the world in which quantum indeterminacy disappears from here on out?)

Comment author: Douglas_Knight 09 September 2009 01:59:54AM *  2 points [-]

If the many worlds interpretation of quantum mechanics is true, isn't anthropic reasoning involved in making predictions about the future of quantum systems?

Basic QM seems to say that probability is ontologically basic. In a collapse point of view, it's what we usually think of as probability that shows up in decision theory. In MWI, both events happen. But you could talk about usual probability either way. ("classical probability is a degenerate form of quantum probability" with or without collapse)

Anthropics is about the interaction of probability with the number of observers.

Replacing usual probability with QM doesn't seem to me to make a difference. Quantum suicide is a kind of anthropics, but it's not clear to me in what sense it's really quantum. It's mainly about rejecting the claim that the Born probabilities are ontologically basic, that they measure how real an outcome is.

Comment author: Jack 09 September 2009 03:23:29AM 2 points [-]

But in MWI isn't the observed probability of some quantum state just the fraction of worlds in which an observer would detect that quantum state? As such, doesn't keeping the probabilities of quantum events as QM predicts require that "one should reason as if one were a random sample from the set of all observers in one’s reference class" (from a Nick Bostrom piece). The reason we think our theory of QM is right is that we think our branch in the multi-verse didn't get cursed with an unrepresentative set of observed phenomena.

Wouldn't a branch in the multi-verse that observed quantum events in which values were systematically distorted (by random chance) come up with slightly different equations to describe quantum mechanics? If so, what reason do we have to think that our equations are correct if we don't consider our observations to be similar to the observations made in other possible worlds?

Comment author: Psy-Kosh 09 September 2009 03:34:05AM 2 points [-]

It's not just world counting... (Although Robin Hanson's Mangled Worlds idea does suggest a way that it may turn out to amount to world counting after all)

essentially one has to integrate the squared modulus of quantum amplitude over a world. This is proportional to the subjective probability of experiencing that world.

Yes... that it isn't simple world counting does seem to be a problem. This is something that we, or at least I, am confused about.

Comment author: Jack 09 September 2009 04:13:00AM 1 point [-]

Thanks. Good to know. I don't suppose you can explain why it works that way?

Comment author: Psy-Kosh 09 September 2009 04:48:24AM 1 point [-]

As I said, that's something I'm confused about, and apparently others are as well.

We've got the linear rules for how quantum amplitude flows over configuration space, then we've got this "oh, by the way, the subjective probability of experiencing any chunk of reality is proportional to the square of the absolute value" rule.

There're a few ideas out there, but...
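To make the contrast with world counting concrete, here is a toy sketch of the squared-modulus rule. The two "worlds" and their amplitudes (0.6 and 0.8i) are made up for illustration and do not come from any real system.

```python
# Hypothetical amplitudes for two branches; chosen so |a|^2 sums to 1.
amplitudes = {
    "world_a": complex(0.6, 0.0),
    "world_b": complex(0.0, 0.8),
}

# Squared-modulus rule: weight each branch by |amplitude|^2, then normalize.
weights = {name: abs(a) ** 2 for name, a in amplitudes.items()}
total = sum(weights.values())
born_probability = {name: w / total for name, w in weights.items()}

# Naive world counting: every branch counts equally.
counting_probability = {name: 1 / len(amplitudes) for name in amplitudes}

print(born_probability)      # roughly 0.36 and 0.64
print(counting_probability)  # 0.5 each
```

The two rules disagree whenever the branches have unequal amplitudes, which is the part that still needs explaining.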

Comment author: Johnicholas 08 September 2009 10:30:37PM 0 points [-]

Would you expand and sharpen your point? Woit comes to mind.

At one point you claim, possibly based on MWI, that "there is some world in which ...<some unlikely scenario>". As far as I can tell, the specifics of the scenario shouldn't have anything to do with the correctness of your argument.

This is how I would paraphrase your comment:

  1. According to MWI, there exists some world in which unlikely things happen.
  2. We find this surprising.
  3. Anthropic reasoning is necessary to conclude 2.
  4. Anthropic reasoning is involved in making predictions about quantum systems.

In step 2: Who is the "we"? What is the "this"? Why do we find it surprising? In step 3: What do you mean by "anthropic reasoning"? In general, it is pretty hard metareasoning to conclude that a reasoning step or maneuver is necessary for a conclusion.

Comment author: Jack 09 September 2009 04:07:28AM 4 points [-]

We don't need anthropic reasoning under MWI in order to be surprised when finding ourselves in worlds in which unlikely things happen so much as we need anthropic reasoning to conclude that an unlikely thing has happened. And our ability to conclude that an unlikely thing has happened is needed to accept quantum mechanics as a successful scientific theory.

"We" is the set of observers in the worlds where events declared to be unlikely by quantum mechanics actually happen. An observer is any physical system with a particular kind of causal relation to quantum states such that the physical system can record information about quantum states and use the information to come up with methods of predicting the probability of previously unobserved quantum processes (or something, but if we can't come up with a definition of observer then we shouldn't be talking about anthropic reasoning anyway).

  1. According to MWI, the (quantum) probability of a quantum state is defined as the fraction of worlds in which that state occurs.
  2. The only way an observer somewhere in the multi-verse can trust the observations that confirm quantum mechanics' probabilistic interpretation is if they reason as if they were a random sample from the set of all observers in the multi-verse (one articulation of anthropic reasoning), because if they can't do that then they have no reason to think their observations aren't wrong in a systematic way.
  3. An observer's reason for believing the standard model of QM to be true in the first place is that they can predict atomic and subatomic particles behaving according to a probabilistic wave-function.
  4. Observers lose their reason for trusting QM in the first place if they accept the MWI AND are prohibited from reasoning anthropically.

In other words, if MWI is likely, then QM is likely iff AR is acceptable.

I think one could write a different version of this argument by referencing expected surprise at discovering sudden changes in quantum probabilities (which I was conflating with the first argument in my first comment) but the above version is probably easier to follow.

Comment author: Johnicholas 12 September 2009 01:23:53AM 3 points [-]

Can I paraphrase what you just said as:

"If many-worlds is true, then all evidence is anthropic evidence"

Comment author: Jack 12 September 2009 06:34:00PM 0 points [-]

I hadn't come to that conclusion until you said it... but yes, that is about right. I'm not sure I would say all evidence is anthropic- I would prefer saying that all updating involves a step of anthropic reasoning. I make that hedge just because I don't know that direct sensory information is anthropic evidence, just that making good updates with that sensory information is going to involve (implicit) anthropic reasoning.

Comment author: Vladimir_Nesov 08 September 2009 08:57:15PM 4 points [-]

Again: how can you talk about concluding that you are a Boltzmann brain? To conclude means to update, and here you refuse updating.

Comment author: dfranke 08 September 2009 07:19:15PM 4 points [-]

I read this and told myself that it only takes five minutes to have an insight. Five minutes later, here's what I'm thinking:

Anthropic reasoning is confusing because it treats consciousness as a primitive. By doing so, we're committing LW's ultimate no-no: assuming an ontologically fundamental mental state. We need to find a way to reformulate anthropic reasoning in terms of Solomonoff induction. If we can successfully do so, the paradox will dissolve.

Comment author: timtyler 09 September 2009 09:16:50AM 1 point [-]

Anthropic reasoning is confusing - probably because we are not used to doing it much in our ancestral environment.

I don't think you can argue it treats consciousness as a primitive, though. Anthropic reasoning is challenging - but not so tricky that machines can't do it.

Comment author: CarlShulman 09 September 2009 06:32:54PM 0 points [-]

It involves calculating a 'correct measure' of how many partial duplicates of a computation exist:

www.nickbostrom.com/papers/experience.pdf

Anthropics does involve magical categories.

Comment author: timtyler 09 September 2009 06:43:23PM -2 points [-]

Right - but that's "Arthur C. Clarke-style magic" - stuff that is complicated and difficult - not the type of magic associated with mystical mumbo-jumbo.

We can live with some of the former type of magic - and it might even spice things up a bit.

Comment author: SforSingularity 08 September 2009 08:46:44PM 0 points [-]

need to find a way to reformulate anthropic reasoning in terms of Solomonoff induction

I fail to see how Solomonoff can reduce ontologically basic mental states.

Comment author: RobinHanson 10 September 2009 03:09:11PM 3 points [-]

There are lots of ordinary examples in game theory of time inconsistent choices. Once you know how to resolve them, then if you can't use those approaches to resolve this I might be convinced that anthropic updating is at fault. But until then I think you are making a huge leap to blame anthropic updating for the time inconsistent choices.

Comment author: Wei_Dai 10 September 2009 04:24:59PM 5 points [-]

Robin, you're jumping into the middle of a big extended discussion. We're not only blaming anthropic updating, we're blaming Bayesian updating in general, and proposing a decision theory without it (Updateless Decision Theory, or UDT). The application to anthropic reasoning is just that, an application.

UDT seems to solve all cases of time inconsistency in decision problems with one agent. What UDT agents do in multi-player games is still an open problem that we're working on. There was an extensive discussion about it in the previous threads if you want to see some of the issues involved. But the key ingredient that is missing is a theory of logical uncertainty, that tells us how different agents (or more generally, computational processes) are logically correlated to each other.

Comment author: RobinHanson 14 September 2009 12:00:36AM 2 points [-]

The ordinary time inconsistencies in game theory are all regarding multiple agents. Seems odd to suggest you've solved the problem except for those cases.

Comment author: Wei_Dai 14 September 2009 01:44:18AM 2 points [-]

I was referring to problems like Newcomb's Problem, Counterfactual Mugging, Sleeping Beauty, and Absentminded Driver.

Comment author: Eliezer_Yudkowsky 10 September 2009 06:33:46PM 1 point [-]

Not exactly the way I would phrase it, but Timeless Decision Theory and Updateless Decision Theory between them have already killed off a sufficiently large number of time inconsistencies that treating any remaining ones as a Problem seems well justified. Yes, we have solved all ordinary dynamic inconsistencies of conventional game theory already!

Comment author: RobinHanson 14 September 2009 12:04:08AM 2 points [-]

Let's take the simple case of time inconsistency regarding punishment. There is a two stage game with two players. First A decides whether to cheat B for some gain. Then B decides whether to punish A at some cost. Before the game B would like to commit to punishing A if A cheats, but once A has already cheated, B would rather not punish.

Comment author: Wei_Dai 14 September 2009 01:44:07AM *  1 point [-]

In UDT, we blame this time inconsistency on B's updating on A having cheated (i.e. treating it as a fact that can no longer be altered). Suppose it's common knowledge that A can simulate or accurately predict B, then B should reason that by deciding to punish, it increases the probability that A would have predicted that B would punish and thus decreases the probability that A would have cheated.

But the problem is not fully solved, because A could reason the same way, and decide to cheat no matter what it predicts that B does, in the expectation that B would predict this and see that it's pointless to punish.

So UDT seems to eliminate time-inconsistency, but at the cost of increasing the number of possible outcomes, essentially turning games with sequential moves into games with simultaneous moves, with the attendant increase in the number of Nash equilibria. We're trying to work out what to do about this.

Comment author: Benja 17 November 2012 11:43:41PM *  1 point [-]

So UDT seems to eliminate time-inconsistency, but at the cost of increasing the number of possible outcomes, essentially turning games with sequential moves into games with simultaneous moves, with the attendant increase in the number of Nash equilibria. We're trying to work out what to do about this.

Er, turning games with sequential moves into games with simultaneous moves is standard in game theory, and "never cheat, always punish cheating" and "always cheat, never punish" are what are considered the Nash equilibria of that game in standard parlance. [ETA: Well, "never cheat, punish x% of the time" will also be a NE for large enough x.] It is subgame perfect equilibrium that rules out "never cheat, always punish cheating" (the set of all SPE of a sequential game is a subset of the set of all NE of that game).
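A short sketch makes the NE/SPE distinction concrete for Robin's game. The payoff numbers below are assumptions chosen only to respect the stated structure (cheating gains A something, punishing costs B something, and a punished cheater ends up with a net loss); the names `payoffs`, `is_nash`, "honest" and "forgive" are illustrative.

```python
from itertools import product

# Hypothetical payoff parameters, not taken from the thread.
GAIN = 2      # what A gains by cheating
LOSS = 2      # what B loses when cheated
PENALTY = 5   # what punishment costs A (PENALTY > GAIN: net loss for A)
COST = 1      # what punishing costs B

A_MOVES = ("honest", "cheat")
B_PLANS = ("punish", "forgive")   # B's plan for the case where A cheats

def payoffs(a_move, b_plan):
    """Return (A's payoff, B's payoff) for a pair of pure strategies."""
    if a_move == "honest":
        return (0, 0)             # B's plan is never triggered
    if b_plan == "punish":
        return (GAIN - PENALTY, -LOSS - COST)
    return (GAIN, -LOSS)

def is_nash(a_move, b_plan):
    """Neither player can gain by unilaterally switching strategy."""
    a_pay, b_pay = payoffs(a_move, b_plan)
    a_ok = all(payoffs(alt, b_plan)[0] <= a_pay for alt in A_MOVES)
    b_ok = all(payoffs(a_move, alt)[1] <= b_pay for alt in B_PLANS)
    return a_ok and b_ok

# Normal-form (simultaneous-move) view: enumerate pure Nash equilibria.
nash_equilibria = [p for p in product(A_MOVES, B_PLANS) if is_nash(*p)]

# Backward induction: B best-responds after a cheat, then A anticipates it.
b_best = max(B_PLANS, key=lambda plan: payoffs("cheat", plan)[1])
a_best = max(A_MOVES, key=lambda move: payoffs(move, b_best)[0])
subgame_perfect = (a_best, b_best)

print(nash_equilibria)   # [('honest', 'punish'), ('cheat', 'forgive')]
print(subgame_perfect)   # ('cheat', 'forgive')
```

Both pure equilibria survive in the normal form, but backward induction keeps only the one where the punishment threat is never carried out. This sketch covers pure strategies only, so the mixed "punish x% of the time" equilibria are not enumerated.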

Comment author: Wei_Dai 18 November 2012 01:09:32AM 2 points [-]

Yeah, I used the wrong terminology in the grandparent comment. I guess the right way to put it is that SPE/backwards induction no longer seems reasonable under UDT and it's unclear what can take its place, as far as reducing the number of possible solutions to a given game.

Comment author: Manfred 18 November 2012 12:17:59AM *  1 point [-]

It is subgame perfect equilibrium that rules out "never cheat, always punish cheating" (the set of all SPE of a sequential game is a subset of the set of all NE of that game).

How strictly do you (or the standard approach) mean to rule out options that aren't good on all parts of the game? It seems like sometimes you do want to do things that are subgame suboptimal.

Edit: or at least be known to do things, which unfortunately can require actually being prepared to do the things.

Comment author: Benja 18 November 2012 11:32:28AM *  1 point [-]

Well, the classical game theorist would reply that they're studying one-off games, in which the game you're currently playing doesn't affect any payoff you get outside that game (otherwise that should be made part of the game), so you can't be doing the punishment because you want to be known to be a punisher, or the game that Robin specified doesn't model the situation you're in. The classical game theorist assumes you can't look into people's heads, so whatever you say or do before the cheating, you're always free to not punish during the punishment round (as you're undoubtedly aware, mutual checking of source code is prohibited by antitrust laws in over 185 countries).

The classical game theorist would further point out that if you do want to model that punishment helps you be known as a punisher, then you should use their theory of repeated games, where they have some folk theorems for you saying that lots and lots of things can be Nash equilibria e.g. in a game where after each round there is a fixed probability of another round; for example, cooperation in the prisoner's dilemma, but also all sorts of suboptimal outcomes (which become Nash equilibria because any deviator gets punished as badly as the other players can punish them).

I should point out that not all classical game theorists think that SPE makes particularly good predictions, though; I've read someone say, I think Binmore, that you expect to virtually always see a NE in the laboratory after a learning period, but not an SPE, and that the original inventor of SPE actually came up with it as an example of what you would not expect to see in the lab, or something to that tune. (Sorry, I should really chase down that reference, but I don't have time right now. I'll try to remember to do that later. ETA: Ok, Binmore and Shaked, 2010: Experimental Economics: Where Next? Journal of Economic Behavior & Organization, 73: 87-100. See the stuff about backward induction, starting at the bottom on p.88. The inventor of SPE is Reinhard Selten, and the claim is that he didn't believe it would predict what you see in the lab and "[i]t was to demonstrate this fact that he encouraged Werner Güth (...) to carry out the very first experiment on the Ultimatum game", not that he invented SPE for this purpose.)

Comment author: Manfred 18 November 2012 11:51:29PM *  1 point [-]

so whatever you say or do before the cheating, you're always free to not punish during the punishment round

Interesting. This idea, used as an argument for SPE, seems to be the free will debate intruding into decision theory. "Only some of these algorithms have freedom, and others don't, and humans are free, so they should behave like the free algorithms." This either ignores, or accepts, the fact that the "free" algorithms are just as deterministic as the "unfree" algorithms. (And it depends on other stuff, but that's not the fun bit)

(as you're undoubtedly aware, mutual checking of source code is prohibited by antitrust laws in over 185 countries).

:D

Comment author: Benja 25 November 2012 09:03:16PM 0 points [-]

Hm, I may not quite have gotten the point across: I think you may be thinking of the argument that humans have free will, so they can't force future versions of themselves to do something that would be against that future version's interest given its information, but that isn't the argument I was trying to explain. The idea I was referring to works precisely the same way with deterministic algorithms, as long as the players only get to observe each others' actions, not each others' source (though of course its proponents don't think in those terms). The point is that if the other player looks at you severely and suggestively taps their baseball bat and tells you about how they've beaten up people who have defected in the past, that still doesn't mean that they're actually going to beat you up -- since if such threats were effective on you, then making them would be the smart thing to do even if the other player has no intention of actually beating you up (and risk going to jail) if for some reason you end up defecting. (Compare AI-in-the-box...) (Of course, this argument only works if you're reasonably sure that the other player is a classical game theorist; if you think you might be playing against someone who will, "irrationally", actually punish you, like a timeless decision theorist, then you should not defect, and they won't have to punish you...)

Now, if you had actual information about what this player had done in similar situations in the past, like police reports of beaten-up defectors, this argument wouldn't work, but then (the standard argument continues) you have the wrong game-theoretical model; the correct model includes all of the punisher's previous interactions, and in that game, it might well be a SPE to punish. (Though only if the exact number of "rounds" is not certain, for the same reason as in the finitely iterated Prisoner's Dilemma: in the last round the punisher has no more reason to punish because there are no future targets to impress, so you defect no matter what they did in previous rounds, so they have no reason to punish in the second-to-last round, etc.)

(BTW: reference added to grandparent.)

Comment author: Eliezer_Yudkowsky 14 September 2009 06:15:24PM *  0 points [-]

(The difficulty arises if UDT B reasons logically that there should not logically exist any copies of its current decision process finding themselves in worlds where A is dependent on its own decision process, and yet A defects. I'm starting to think that this resembles the problem I talked about earlier, where you have to use Omega's probability distribution in order to agree to be Counterfactually Mugged on problems that Omega expects to have a high payoff. Namely, you may have to use A's logical uncertainty, rather than your own logical uncertainty, in order to perceive a copy of yourself inside A's counterfactual. This is a complicated issue and I may have to post about it in order to explain it properly.)

Comment author: Eliezer_Yudkowsky 14 September 2009 01:45:03AM 0 points [-]

Drescher-Nesov-Dai UDT solves this (that is, goes ahead and punishes the cheater, making the same decision at both times).

TDT can handle Parfit's Hitchhiker - pay for the ride, make the same decision at both times, because it forms the counterfactual "If I did not pay, I would not have gotten the ride". But TDT has difficulty with this particular case, since it implies that B's original belief that A would not cheat if punished, was wrong; and after updating on this new information, B may no longer have a motive to punish. (UDT of course does not update.) Since B's payoff can depend on B's complete strategy tree including decisions that would be made under other conditions, instead of just depending on the actual decision made under real conditions, this scenario is outside the realm where TDT is guaranteed to maximize.

Comment author: CarlShulman 14 September 2009 01:08:36AM *  0 points [-]

The case is underspecified:

  • How transparent/translucent are the agents? I.e. can A examine B's sourcecode, or use observational and other data to assess B's decision procedure? If not, what is A's prior probability distribution for decision procedures B might be using?
  • Are both A and B using the same decision theory, TDT/UDT? Or is A using CDT and B using TDT/UDT or vice versa?

Comment author: Eliezer_Yudkowsky 14 September 2009 01:50:15AM 1 point [-]

Clearly B has mistaken beliefs about either A or its own dispositions; otherwise B would not have dealt with A in the interaction where A ended up cheating. If B uses UDT (and hence will carry through punishments), and A uses any DT that correctly forecasts B's response to cheating, then A should not in fact cheat. If A cheats anyway, though, B still punishes.

Actually, on further reflection, it's possible that B would reason that it is logically impossible for A to have the specified dependency on B's decision, and yet for A to still end up defecting, in which case even UDT might end up in trouble - it would be a transparent logical impossibility for A to defect if B's beliefs about A are true, so it's not clear that B would handle the event correctly. I'll have to think about this.

Comment author: Vladimir_Nesov 14 September 2009 07:01:19AM *  0 points [-]

If there is some probability of A cheating even if B precommits to punishment, but with odds in B's favor, the situation where B needs to implement punishment is quite possible (expected). Likewise, if B precommitting to punish A is predicted to lead to an even worse outcome than not punishing (because of punishment expenses), UDT B won't punish A. Furthermore, a probability of cheating and not-punishment of cheating (mixed strategies, possibly on logical uncertainty to defy the laws of the game if pure strategies are required) is a mechanism through which the players can (consensually) bargain with each other in the resulting parallel game, an issue Wei Dai mentioned in the other reply. B doesn't need absolute certainty at any stage, in both cases.

Also, in UDT there are no logical certainties, as it doesn't update on logical conclusions as well.

Comment author: Eliezer_Yudkowsky 14 September 2009 06:12:07PM 0 points [-]

If there is some probability of A cheating even if B precommits to punishment

Sure, but that's the convenient setup. What if for A to cheat means that you are necessarily just mistaken about which algorithm A runs?

Also, in UDT there are no logical certainties, as it doesn't update on logical conclusions as well.

UDT will be logically certain about some things but not others. If UDT B "doesn't update" on its computation about what A will do in response to B, it's going to be in trouble.

Comment author: Vladimir_Nesov 20 September 2009 02:09:50PM *  0 points [-]

What if for A to cheat means that you are necessarily just mistaken about which algorithm A runs?

A decision algorithm should never be mistaken, only uncertain.

UDT will be logically certain about some things but not others. If UDT B "doesn't update" on its computation about what A will do in response to B, it's going to be in trouble.

"Doesn't update" doesn't mean that it doesn't use the info (but you know that, so what do you mean?). A logical conclusion can be a parameter in a strategy, without making the algorithm unable to reason about what it would be like if the conclusion was different, that is basically about uncertainty of same algorithm in other states of knowledge.

Comment author: Alicorn 14 September 2009 12:12:54AM 0 points [-]

Am I correct in assuming that if A cheats and is punished, A suffers a net loss?

Comment author: Johnicholas 14 September 2009 12:43:24AM 0 points [-]

Yes.

Comment author: Wei_Dai 10 September 2009 07:06:30PM 0 points [-]

What is the remaining Problem that you're referring to? Why can't we apply the formalism of UDT1 to the various examples people seem to be puzzled about and just get the answers out? Or is cousin_it right about the focus having shifted to how human beings ought to reason about these problems?

Comment author: Eliezer_Yudkowsky 12 September 2009 08:03:45PM 2 points [-]

The anthropic problem was a remaining problem for TDT, although not UDT.

UDT has its own problems, possibly. For example, in the Counterfactual Mugging, it seems that you want to be counterfactually mugged whenever Omega has a well-calibrated distribution and has a systematic policy of offering high-payoff CMs according to that distribution, even if your own prior has a different distribution. In other words, the key to the CM isn't your own distribution, it's Omega's. And it's not possible to interpret UDT as epistemic advice, which leaves anthropic questions open. So I haven't yet shifted to UDT outright.

(The reason I did not answer your question earlier was that it seemed to require a response at greater length than the above.)

Comment author: Wei_Dai 26 September 2009 09:47:04AM 2 points [-]

Hi, this is the 2-week reminder that you haven't posted your longer response yet. :)

Comment author: Wei_Dai 14 September 2009 02:05:56AM 1 point [-]

Well, you're right in the sense that I can't understand the example you gave. (I waited a couple of days to see if it would become clear, but it didn't) But the rest of the response is helpful.

Comment author: Benja 03 December 2012 08:00:07PM *  1 point [-]

Did he ever get around to explaining this in more detail? I don't remember reading a reply to this, but I think I've just figured out the idea: Suppose you get word that Omega is coming to the neighbourhood and going to offer counterfactual muggings. What sort of algorithm do you want to self-modify into? You don't know what CMs Omega is going to offer; all you know is that it will offer odds according to its well-calibrated prior. Thus, it has higher expected utility to be a CM-accepter than a CM-rejecter, and even a CDT agent would want to self-modify.

I don't think that's a problem for UDT, though. What UDT will compute when asked to pay is the expected utility under its prior of paying up when Omega asks it to; thus, the condition for UDT to pay up is NOT

prior probability of heads * Omega's offered payoff > prior of tails * Omega's price

but

prior of (heads and Omega offers a CM for this coin) * payoff > prior of (tails and CM) * price.

In other words, UDT takes the quality of Omega's predictions into account and acts as if updating on them (the same way you would update if Omega told you who it expects to win the next election, at 98% probability).
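
To make the contrast between the two conditions concrete, here is a minimal sketch. The function names and all numbers are illustrative assumptions, not part of any formal statement of UDT. The numbers are chosen so that the agent's own prior makes heads very unlikely, while Omega only offers CMs on coins it considers roughly even.

```python
def pays_naive(p_heads, p_tails, payoff, price):
    """First condition: uses the agent's prior over the coin alone."""
    return p_heads * payoff > p_tails * price


def pays_udt(p_heads_and_offer, p_tails_and_offer, payoff, price):
    """Second condition: uses the prior of (coin outcome AND Omega offers a CM)."""
    return p_heads_and_offer * payoff > p_tails_and_offer * price


# Hypothetical numbers. The joint probabilities are consistent with the
# marginals: P(heads and offer) = 0.0005 <= P(heads) = 0.001.
payoff, price = 10_000, 100
print(pays_naive(0.001, 0.999, payoff, price))    # compares 10 with 99.9
print(pays_udt(0.0005, 0.0005, payoff, price))    # compares 5 with 0.05
```

With these made-up inputs the first condition says to refuse and the second says to pay, which is the sense in which the agent "acts as if updating" on Omega's choice of coin.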

CDT agents, as usual, will actually want to self-modify into a UDT agent whose prior equals the CDT agent's posterior [ETA: wait, sorry, no, they won't act as if they can acausally control other instances of the same program, but they will self-modify so as to make future instances of themselves (which obviously they control causally) act in a way that maximizes EU according to the agent's present posterior, and that's what we need here], and will use the second formula above accordingly -- they don't want to be a general CM-rejecter, but they think that they can do even better than being a general CM-accepter if they refuse to pay up if at the time of self-modification they assigned low probability to tails, even conditional on Omega offering them a CM.

Comment author: Wei_Dai 04 December 2012 01:52:16PM 0 points [-]

He never explained further, and actually I still don't quite understand the example even given your explanation. Maybe you can reply directly to Eliezer's comment so he can see it in his inbox, and let us know if he still thinks it's a problem for UDT?

Comment author: Vladimir_Nesov 10 September 2009 04:52:03PM 0 points [-]

But the key ingredient that is missing is a theory of logical uncertainty, that tells us how different agents (or more generally, computational processes) are logically correlated to each other.

I'd look for it as logical theory of concurrency and interaction: "uncertainty" fuzzifies the question.

Comment author: Wei_Dai 11 September 2009 07:10:30PM *  0 points [-]

I'd look for it as logical theory of concurrency and interaction: "uncertainty" fuzzifies the question.

Why? For me, how different agents are logically correlated to each other seems to be the same type of question as "what probability (if any) should I assign to P!=NP?" Wouldn't the answer fall out of a general theory of logical uncertainty? (ETA: Or at least be illuminated by such a theory?)

Comment author: Vladimir_Nesov 11 September 2009 08:43:48PM *  1 point [-]

Logic is already in some sense about uncertainty (e.g. you could interpret predicates as states of knowledge). When you add one more "uncertainty" of some breed, it leads to perversion of logic, usually of applied character and barren meaning.

The concept of "probability" is suspect, I don't expect it to have foundational significance.

Comment author: Wei_Dai 11 September 2009 09:12:25PM 1 point [-]

So what would you call a field that deals with how one ought to make bets involving P!=NP (i.e., mathematical statements that we can't prove to be true or false), if not "logical uncertainty"? Just "logic"? Wouldn't that cause confusion in others, since today it's usually understood that such questions are outside the realm of logic?

Comment author: Vladimir_Nesov 11 September 2009 09:19:34PM 0 points [-]

I don't understand how to make such bets, except in a way it's one of the kinds of human decision-making that can be explicated in terms of priors and utilities. The logic of this problem is in the process that works with the statement, which is in the domain of proof theory.

Comment author: cousin_it 10 September 2009 07:02:41AM *  3 points [-]

I waited to comment on this, to see what others would say. Right now Psy-Kosh seems to be right about anthropics; Wei Dai seems to be right about UDT; timtyler seems to be right about Boltzmann brains; byrnema seems to be mostly right about pointers; but I don't understand why nobody latched on to the "reflective consistency" part. Surely the kind of consistency under observer-splitting that you describe is too strong a requirement in general: if two copies of you play a game, the correct behavior for both of them would be to try to win, regardless of what overall outcome you'd prefer before the copying. The paperclip formulation works around this problem, so the correct way to analyze this would be in terms of multiplayer game theory with chance moves, as Psy-Kosh outlined.

Comment author: Wei_Dai 10 September 2009 07:28:25AM 3 points [-]

if two copies of you play a game, the correct behavior for both of them would be to try to win, regardless of what overall outcome you'd prefer before the copying

That doesn't make sense to me, unless you're assuming that the player isn't capable of self-modification. If it was, wouldn't it modify itself so that its copies won't try to win individually, but cooperate to obtain the outcome that it prefers before the copying?

Comment author: cousin_it 10 September 2009 08:24:15AM *  2 points [-]

Yes, that's right. I've shifted focus from correct program behavior to correct human behavior, because that's what everyone else here seems to be talking about. If the problem is about programs, there's no room for all this confusion in the first place. Just specify the inputs, outputs and goal function, then work out the optimal algorithm.

Comment author: [deleted] 07 March 2014 01:24:19PM *  1 point [-]

The notion of "I am a Boltzmann brain" goes away when you conclude that conscious experience is a Tegmark-4 thing, and that equivalent conscious experiences are mathematically equal and therefore there is no difference and you are at the same time a human being and a Boltzmann brain, at least until they diverge.

Thus, anthropic reasoning is right out.

Comment author: Kawoomba 07 March 2014 06:40:17PM 0 points [-]

Well, by the same token "What I experience represents what I think it does / I am not a Boltzmann brain which may dwindle out of existence in an instant" would go right out, just the same. This kind of reasoning reduces to something similar to quantum suicide. The point at which your conscious experience is expected to diverge, even if you take that perspective, does kind of matter. The different paths and their probabilistic weights which govern the divergence alter your expected experience, after all. Or am I misunderstanding?

Comment author: [deleted] 10 March 2014 09:09:07PM 0 points [-]

I am not sure.

Let me try to clarify.

By virtue of existential quantification in a ZF equivalent set theory, we can have anything.

In an arbitrary encoding format, I now by existential quantification select a set which is the momentary subjective experience of being me as I write this post, e.g. memory sensations, existential sensations, sensory input, etc.

It is a mathematical object. I can choose its representation format independent of any computational medium I might use to implement it.

It just so happens that there is a brain in the universe we are in, which is implementing this mathematical object.

Brains are computers that compute conscious experiences.

They no more have bearing on the mathematical objects they implement than a modern computer has on the definition of Conway's Game of Life.

Does that clarify it?

Comment author: Kawoomba 10 March 2014 09:30:27PM -1 points [-]

It just so happens that there is a brain in the universe we are in, which is implementing this mathematical object.

Which is why we're still highly invested in the question whether (whatever it is that generates our conscious experience) will "stay around" and continue with our pattern in an expected manner.

Let's say we identify with only the mathematical object, not the representation format at all. That doesn't excuse us from anthropic reasoning, or from a personal investment in reasoning about the implementing "hardware". We'd still be highly invested in the question, even as 'mathematical objects'. We probably still care about being continually instantiated.

The shift in perspective you suggest doesn't take away from that (and adds what could be construed as a flavor of dualism).

Comment author: [deleted] 11 March 2014 01:58:46PM 0 points [-]

Hmmm.

I will have to mull on that, but let me leave with a mote of explanation:

The reasoning strategy I used to arrive at this conclusion was similar to the one used in concluding that "every possible human exists in parallel universes, so we need not make more humans, but more humans feeling good."

Comment author: Jiro 11 March 2014 02:53:03PM 1 point [-]

Doesn't every possible human-feeling-good also exist in parallel universes?

(And if you argue that although they exist you can increase their measure, that applies to the every-possible-human version as well.)

Comment author: [deleted] 11 March 2014 08:36:07PM *  0 points [-]

Sure, but I will quote Karkat Vantas on time-travel shenanigans from Andrew Hussie's Homestuck:

CCG: EVERYBODY, DID YOU HEAR THAT?? SUPERFUTURE VRISKA HAS AN IMPORTANT LIFE LESSON FOR US ALL.
CCG: WE DON'T HAVE TO WORRY ABOUT OUR PRESENT RESPONSIBILITIES AND OBLIGATIONS!
CCG: BECAUSE AS IT TURNS OUT, IN THE FUTURE ALL THAT STUFF ALREADY HAPPENED. WE'RE OFF THE FUCKING HOOK!

Comment author: gelisam 09 September 2009 04:01:12AM 1 point [-]

The reason we shouldn't update on the "room color" evidence has nothing to do with the fact that it constitutes anthropic evidence. The reason we shouldn't update is that we're told, albeit indirectly, that we shouldn't update (because if we do then some of our copies will update differently and we will be penalized for our disagreement).

In the real world, there is no incentive for all the copies of ourselves in all universes to agree, so it's all right to update on anthropic evidence.

Comment author: twanvl 08 September 2009 10:28:24PM 1 point [-]

Before the experiment, you calculate the general utility of the conditional strategy "Reply 'Yes' to the question if you wake up in a green room" as (50% * ((18 * +$1) + (2 * -$3))) + (50% * ((18 * -$3) + (2 * +$1))) = -$20

This assumes that the question is asked only once, but then, to which of the 20 copies will it be asked?

If all 20 copies get asked the same question (or equivalently if a single copy chosen at random is) then the utility is (50% * 18/20 * ((18 * +$1) + (2 * -$3))) + (50% * 2/20 * ((18 * -$3) + (2 * +$1))) = 2.8$ = 50% * 5.6$.
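
A quick sketch of the two calculations side by side, using exact fractions. The helper name `total_payoff` is just an illustrative choice.

```python
from fractions import Fraction


def total_payoff(n_green, n_red, gain=1, loss=3):
    """Summed payoff over all copies if the bet goes ahead."""
    return n_green * gain - n_red * loss


heads = total_payoff(18, 2)    # 18 - 6 = 12
tails = total_payoff(2, 18)    # 2 - 54 = -52
half = Fraction(1, 2)

# Weight each world by the coin's prior only (the quoted calculation).
prior_eu = half * heads + half * tails

# Also weight each world by the fraction of copies who wake in green rooms
# (the calculation in this comment).
green_weighted_eu = half * Fraction(18, 20) * heads + half * Fraction(2, 20) * tails

print(prior_eu)            # -20
print(green_weighted_eu)   # 14/5, i.e. 2.8
```

The only difference between the two lines is the extra 18/20 and 2/20 weights, so the disagreement is entirely about whether those weights belong in the sum.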


Consider the similar thought experiments:

  • I flip a fair coin to determine whether to switch to my headdy coin or my tailly coin, which have a 90% and 10% probability of heads respectively.
  • Now I flip this biased coin. If it comes up heads then I paint the room green, if it comes up tails I paint it red.
  • You then find yourself in a green room.
  • Then I flip the biased coin again, and repaint the room.
  • Before this second flip, I offer you the bet of +1$ if the room stays green and -3$ if it becomes red.

The prior expected utility before the experiment is:

E(util|headdy) = 90% * 1$ + 10% * -3$ = 0.6$
E(util|tailly) = 10% * 1$ + 90% * -3$ = -2.6$
E(util) = 50% * E(util|headdy) + 50% * E(util|tailly) = -1$

Given that you find yourself in a green room after the first flip, you can determine the probability that the headdy coin is used:

P(green) = 0.5
P(green|headdy) = 0.9
P(headdy|green) = 0.9

Which gives a posterior utility:

E(util|green) = 0.9 * E(util|headdy) + 0.1 * E(util|tailly) = 0.28$
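
The same numbers can be checked mechanically. This is a small sketch with exact fractions; the names `P_HEADS`, `PRIOR` and `bet_value` are illustrative.

```python
from fractions import Fraction

P_HEADS = {"headdy": Fraction(9, 10), "tailly": Fraction(1, 10)}
PRIOR = {"headdy": Fraction(1, 2), "tailly": Fraction(1, 2)}


def bet_value(p_green, win=1, lose=-3):
    """Expected value of the bet given the chance the room is painted green."""
    return p_green * win + (1 - p_green) * lose


eu = {coin: bet_value(p) for coin, p in P_HEADS.items()}
prior_eu = sum(PRIOR[c] * eu[c] for c in eu)

# Bayes: P(coin | green) = P(green | coin) * P(coin) / P(green)
p_green = sum(PRIOR[c] * P_HEADS[c] for c in PRIOR)
posterior = {c: PRIOR[c] * P_HEADS[c] / p_green for c in PRIOR}
posterior_eu = sum(posterior[c] * eu[c] for c in eu)

print(eu["headdy"], eu["tailly"])   # 3/5 -13/5
print(prior_eu)                     # -1
print(posterior["headdy"])          # 9/10
print(posterior_eu)                 # 7/25, i.e. 0.28
```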

Comment author: DanArmak 08 September 2009 10:41:50PM 3 points [-]

This assumes that the question is asked only once, but then, to which of the 20 copies will it be asked?

Every copy that is in a green room is asked the question (so either 2 or 18 copies total are asked). If all answer Play, we play. If all answer Don't Play, we don't. In any other case we fine all 20 copies some huge amount; this is intended to make them agree beforehand on what answer to give. (This is reworded from the OP.)
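
The rule can be written out as a small function. This is only a sketch: the answer labels and the `fine` value are hypothetical (the "huge amount" is not specified here; 50 is borrowed from the $50 example in the post).

```python
def resolve(answers, n_green, n_red, fine=50):
    """Total payoff over all copies, given the green-room copies' answers.

    answers: one "play" or "dont" per green-room copy.
    fine: per-copy penalty for disagreement (hypothetical value).
    """
    if len(answers) != n_green:
        raise ValueError("exactly the green-room copies are asked")
    if all(a == "play" for a in answers):
        return n_green * 1 - n_red * 3      # +$1 per green copy, -$3 per red copy
    if all(a == "dont" for a in answers):
        return 0
    return -fine * (n_green + n_red)        # every copy is fined


print(resolve(["play"] * 18, 18, 2))        # 12
print(resolve(["play"] * 2, 2, 18))         # -52
print(resolve(["play", "dont"], 2, 18))     # -1000
```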

For your other thought experiment - if there aren't actual N copies being asked the question, then there's no dilemma; you (the only copy) simply update on the evidence available (that the room is green). So yes, the original problem requires copies being asked in parallel to introduce the possibility that you're hurting other copies of yourself by giving a self-serving answer. Whereas if you're the only copy, you always give a self-serving answer, i.e. play only if the room is green.

Comment author: wedrifid 09 September 2009 04:59:00PM 1 point [-]

Timeless decision agents reply as if controlling all similar decision processes, including all copies of themselves. Classical causal decision agents, to reply "Yes" as a group, will need to somehow work out that other copies of themselves reply "Yes", and then reply "Yes" themselves. We can try to help out the causal decision agents on their coordination problem by supplying rules such as "If conflicting answers are delivered, everyone loses $50". If causal decision agents can win on the problem "If everyone says 'Yes' you all get $10, if everyone says 'No' you all lose $5, if there are conflicting answers you all lose $50" then they can presumably handle this. If not, then ultimately, I decline to be responsible for the stupidity of causal decision agents.

The coordination hack to work around some of the stupidity of causal decision agents doesn't appear to be necessary here.

"Somehow working out that the other copies of themselves reply 'yes'" should be trivial for an agent focussed on causality when the copies are identical, have no incentive to randomise and have identical inputs. If the payoff for others disagreeing is identical to the payoff for 'no' they can be ignored. The conflict penalty makes the coordination problem more difficult for the causal agent in this context, not less.

Comment author: byrnema 09 September 2009 01:36:00AM *  1 point [-]

[comment deleted]

Oops... my usual mistake of equivocating different things and evolving the problem until it barely resembles the original. I will update my "solution" later if it still works for the original.

... Sigh. Won't work. My previous "solution" recovered the correct answer of -20 because I bent the rules enough to have each of my green-room-deciders make a global rather than anthropic calculation.

Comment author: byrnema 09 September 2009 04:56:26PM *  2 points [-]

Thinking about how all the green-room people come to the wrong conclusion makes my brain hurt. But I suppose, finally, it is true. They cannot base their decision on their subjective experience, and here I'll outline some thoughts I've had as to under what conditions they should know they cannot do so.

Suppose there are 20 people (Amy, Benny, Carrie, Donny, ...) and this experiment is done as described. If we always ask Tony (the 20th person) whether or not to say "yes", and he bases his decision on whether or not he is in a green room, then the expected value of his decision really is $5.6. Tony here is a special, singled out "decider". One way of looking at this situation is that the 'yes' depends on some information in the system (that is, whether or not Tony was in a green room.)

If instead we say that the decider can be anyone, and in fact we choose the decider after the assortment into rooms as someone in a green room, then we are not really given any information about the system.

It is the difference between (a) picking a person, and seeing if they wake up in a green room, and (b) picking a person that is in a green room. (I know you are well aware of this difference, but it helps to spell it out.)

You can't pick the deciders from a set with a prespecified outcome. It's a pointer problem: You can learn about the system from the change of state from Tony to Tony* (Tony: no room -->Tony: green room), but you can't assign the star after the assignment (pick someone in a green room and ask them).

When a person wakes in a green room and is asked, they should say 'yes' if they are randomly chosen to be asked independently of their room color. If they were chosen after the assignment, because they awoke in a green room, they should recognize this as the “unfixed pointer problem” (a special kind of selection bias).

Avoiding the pointer problem is straight-forward. The people who wake in red rooms have a posterior probability of heads as 10%. The people who wake in green rooms have a posterior probability of heads as 90%. Your posterior probability is meaningful only if your posterior probability could have been either way. Since Eliezer only asks people who woke in green rooms, and never asks people who woke in red rooms, the posterior probabilities are not meaningful.

Comment author: Eliezer_Yudkowsky 09 September 2009 06:26:54PM *  0 points [-]

The people who wake in red rooms have a posterior probability of heads as 10%. The people who wake in green rooms have a posterior probability of heads as 90%. Your posterior probability is meaningful only if your posterior probability could have been either way. Since Eliezer only asks people who woke in green rooms, and never asks people who woke in red rooms, the posterior probabilities are not meaningful.

The rest of your reply makes sense to me, but can I ask you to amplify on this? Maybe I'm being naive, but to me, a 90% probability is a 90% probability and I use it in all my strategic choices. At least that's what I started out thinking.

Now you've just shown that a decision process won't want to strategically condition on this "90% probability", because it always ends up as "90% probability" regardless of the true state of affairs, and so is not strategically informative to green agents - even if the probability seems well-calibrated in the sense that, looking over impossible possible worlds, green agents who say "90%" are correct 9 times out of 10. This seems like a conflict between an anthropic sense of probability (relative frequency in a population of observers) and a strategic sense of probability (summarizing information that is to be used to make decisions), or something along those lines. Is this where you're pointing toward by saying that a posterior probability is meaningful at some times but not others?
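
A small sketch of the two senses, under the setup in the post. The variable names are illustrative, and treating "strategic informativeness" as "does the green-room trigger fire in only one world" is a simplifying assumption made for this example.

```python
from fractions import Fraction

prior = {"heads": Fraction(1, 2), "tails": Fraction(1, 2)}
green_copies = {"heads": 18, "tails": 2}

# Anthropic sense: weight each world by its prior times its number of
# green-room observers, then normalize.
weight = {w: prior[w] * green_copies[w] for w in prior}
p_heads_given_green = weight["heads"] / sum(weight.values())

# Strategic sense: does "some copy wakes in a green room" distinguish the
# worlds? It happens in both, so a policy keyed on it fires either way.
policy_fires = {w: green_copies[w] > 0 for w in prior}

print(p_heads_given_green)   # 9/10
print(policy_fires)          # {'heads': True, 'tails': True}
```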

Comment author: byrnema 09 September 2009 08:28:43PM *  3 points [-]

a decision process won't want to strategically condition on this "90% probability", because it always ends up as "90% probability" regardless of the true state of affairs, and so is not strategically informative to green agents

The 90% probability is generally strategically informative to green agents. They may legitimately point to themselves for information about the world, but in this specific case, there is confusion about who is doing the pointing.

When you think about a problem anthropically, you yourself are the pointer (the thing you are observing before and after to make an observation) and you assign yourself as the pointer. This is going to be strategically sound in all cases in which you don't change as the pointer before and after an observation. (A pretty normal condition. Exceptions would be experiments in which you try to determine the probability that a certain activity is fatal to yourself -- you will never be able to figure out the probability that you will die of your shrimp allergy by repeated trials of consuming shrimp, as it will become increasingly skewed towards lower and lower values.)

Likewise, if I am in the experiment described in the post and I awaken in a green room I should answer "yes" to your question if I determine that you asked me randomly. That is, that you would have asked me even if I woke in a red room. In which case my anthropic observation that there is a 90% probability that heads was flipped is quite sound, as usual.

On the other hand, if you ask me only if I wake in a green room, then you wouldn’t have asked “me” if I awoke in a red room. (So I must realize this isn’t really about me assigning myself as a pointer, because “me” doesn’t change depending on what room I wake up in.) It's strange and requires some mental gymnastics for me to understand that you Eliezer are picking the pointer in this case, even though you are asking me about my anthropic observation, for which I would usually expect to assign myself as the pointer.

So for me this is a pointer/biased-observation problem. But the anthropic problem is related, because we as humans cannot ask about the probability of currently observed events based on the frequency of observations which, had they been otherwise, would not have permitted ourselves to ask the question.

Comment author: Eliezer_Yudkowsky 09 September 2009 08:49:33PM 1 point [-]

On the other hand, if you ask me only if I wake in a green room, then you wouldn’t have asked “me” if I awoke in a red room. (So I must realize this isn’t really about me assigning myself as a pointer, because “me” doesn’t change depending on what room I wake up in.)

Huh. Very interesting again. So in other words, the probability that I would use for myself, is not the probability that I should be using to answer questions from this decision process, because the decision process is using a different kind of pointer than my me-ness?

How would one formalize this? Bostrom's division-of-responsibility principle?

Comment author: byrnema 09 September 2009 09:15:21PM *  1 point [-]

I haven't had time to read this, but it looks possibly relevant (it talks about the importance of whether an observation point is fixed in advance or not) and also possibly interesting, as it compares Bayesian and frequentist views.

I will read it when I have time later... or anyone else is welcome to if they have time/interest.

Comment author: byrnema 11 September 2009 09:58:31AM *  0 points [-]

What I got out of the article above, since I skipped all the technical math, was that frequentists consider "the pointer problem" (i.e., just your usual selection bias) as something that needs correction while Bayesians don't correct in these cases. The author concludes (I trust, via some kind of argument) that Bayesians don't need to correct if they choose the posteriors carefully enough.

I now see that I was being entirely consistent with my role as the resident frequentist when I identified this as a "pointer problem" problem (which it is) but that doesn't mean the problem can't be pushed through without correction* -- the Bayesian way -- by carefully considering the priors.

*"Requiring correction" then might be a euphemism for time-dependent, while a preference for an updateless decision theory is a good Bayesian quality. A quality, by the way, a frequentist can appreciate as well, so this might be a point of contact on which to win frequentists over.

Comment author: James_Miller 08 September 2009 09:30:29PM -2 points [-]

Is there any version of this post that doesn't involve technologies that we don't have? If not, then might the resolution to this paradox be that the copying technology assumed to exist can't exist because if it did it would give rise to a logical inconsistency.

Comment author: Johnicholas 08 September 2009 11:38:28PM 2 points [-]

Cute.

You may be able to translate into the language of "wake, query, induce amnesia" - many copies would correspond to many wakings.

Comment author: DanArmak 08 September 2009 11:51:38PM 1 point [-]

No, the dilemma depends on having many copies. You're trying to optimize the outcome averaged over all copies (before the copies are made), because you don't know which copy "you" will "be".

In the no-copies / amnesia version, the updateless approach is clearly correct. You have no data to update on - awakening in a green room tells you nothing about the coin tosses because either way you'd wake up in a green room at least once (and you forget about it, so you don't know how many times it happened). Therefore you will always refuse to play.

Comment author: James_Miller 09 September 2009 12:06:22AM -2 points [-]

But we don't have the type of amnesia drugs required to manifest the Sleeping Beauty problem, and perhaps there is something about consciousness that would prevent them from ever being created. (Isn't there some law of physics that precludes the total destruction of information?)

Comment author: Johnicholas 09 September 2009 02:25:16AM *  2 points [-]

I don't understand - what type of amnesia drug is required? For example, this lab:

http://memory.psy.cmu.edu/

apparently routinely does experiments that induce temporary amnesia using a drug called midazolam. In general, I was under the impression that a wide variety of drugs have side effects of various degrees and kinds of amnesia, including both anterograde and retrograde.

Your proposal that consciousness might be conserved, and moreover that this might be proved by armchair reasoning seems a bit farfetched. Are you:

  1. just speculating idly?
  2. seriously pursuing this hypothesis as the best avenue towards resolving EY's puzzle?
  3. pursuing some crypto-religious (i.e. "consciousness conserved"=>"eternal life") agenda?

Comment author: James_Miller 09 September 2009 04:37:43AM 0 points [-]

My first comment was (2) the second (1).

If DanArmak's comment is correct then it isn't important for my original comment whether there exist amnesia drugs.

If your post is correct then my second comment is incorrect.

Comment author: timtyler 09 September 2009 09:20:27AM -2 points [-]

Microscopic reversibility prohibits any destruction of the information necessary to run things backwards - and that's all the information in the universe as far as we know.

Comment author: mamert 16 May 2016 10:20:27AM *  0 points [-]

I keep having trouble thinking of probabilities when I'm to be copied and >=1 of "me" will see red and >=1 of "me" will see green. My thought is that it is 100% likely that "I" will see red and know there are others, once-mes, who see green, and 100% likely vice-versa. Waking up to see red (green) is exactly the expected result.

I do not know what to make of this opinion of mine. It's as if my definition of self - or choice of body - is in superposition. Am I committing an error here? Suggestions for further reading would be appreciated.

Comment author: Angela 06 August 2014 10:22:24PM 0 points [-]

I remain convinced that the probability is 90%.

The confusion is over whether you want to maximize the expectation of the number of utilons there will be if you wake up in a green room or the expectation of the number of utilons you will observe if you wake up in a green room.

Comment author: byrnema 11 September 2009 08:57:59AM *  0 points [-]

Whoohoo! I just figured out the correct way to handle this problem, that renders the global and egocentric/internal reflections consistent.

We will see if my solution makes sense in the morning, but the upshot is that there was/is nothing wrong with the green roomer's posterior, as many people have been correctly defending. The green roomer who computed an EV of $5.60 modeled the money pay-off scheme wrong.

In the incorrect calculation that yields $5.6 EV, the green roomer models himself as winning (getting the favorable +$12) when he is right and losing (paying the -$52) when he is wrong. But no, not exactly. The green roomer doesn't win every time he's right -- even though certainly he's right every time he's right.

The green roomer wins 1 out of every 18 times that he's right, because 17 copies of himself that were also right do not get their own independent winnings, and he loses 1 out of every 2 times he's wrong, because there are 2 of him that are wrong in the room that pays $52.

So it is Bostrom's division-of-responsibility, with the justification. It is probably more apt to name it division-of-reward.

Here is the correct green roomer calculation:

EV = P(heads)(payoff given heads)(rate of payoff given heads)+ P(tails)(payoff given tails)(rate of payoff given tails)

=.9($12)(1/18)+.1(-$52)(1/2) = -2
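
Written out with exact fractions, the two calculations look like this. It is only a sketch of the arithmetic above, with illustrative variable names.

```python
from fractions import Fraction

p_heads, p_tails = Fraction(9, 10), Fraction(1, 10)   # green-roomer's posterior
win, lose = 12, -52                                   # total payoff in each world

# Treating every correct green-roomer as collecting the full payoff.
naive_ev = p_heads * win + p_tails * lose

# Sharing each world's payoff among the green-roomers in that world.
shared_ev = p_heads * win * Fraction(1, 18) + p_tails * lose * Fraction(1, 2)

# Expected number of green-roomers before the coin is examined.
expected_green = Fraction(1, 2) * 18 + Fraction(1, 2) * 2

print(naive_ev)                    # 28/5, i.e. 5.6
print(shared_ev)                   # -2
print(expected_green * shared_ev)  # -20
```

As a cross-check that is not part of the calculation above: the expected number of green-roomers is 10, and 10 times -2 gives -20, the same figure as the pre-experiment calculation in the post.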

(By the way, this doesn't modify what I said about pointers, but I must admit I don't understand at the moment how the two perspectives are related. Yet; some thoughts.)

Comment author: byrnema 11 September 2009 05:43:26PM *  3 points [-]

This is my attempt at a pedagogical exposition of “the solution”. It’s overly long, and I've lost perspective completely about what is understood by the group here and what isn't. But since I've written up this solution for myself, I'll go ahead and share it.

The cases I'm describing below are altered from the OP so that they are completely non-metaphysical, in the sense that you could implement them in real life with real people. Thus there is an objective reality regarding whether money is collectively lost or won, so there is finally no ambiguity about what the correct calculation actually is.

Suppose that there are twenty different graduate students {Amy, Betty, Cindy, ..., Tony} and two hotels connected by a breezeway. Hotel Green has 18 green rooms and 2 red rooms. Hotel Red has 18 red rooms and 2 green rooms. Every night for many years, students will be assigned a room in either Hotel Green or Hotel Red depending on a coin flip (heads --> Hotel Green for the night, tails --> Hotel Red for the night). Students won’t know what hotel they are in but can see their own room color only. If a student sees a green room, that student correctly deduces they are in Hotel Green with 90% probability.
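The 90% figure follows from Bayes' rule. A minimal sketch, assuming (as the setup implies) a fair coin and a uniformly random room assignment within the hotel; the variable names are made up for illustration:

```python
from fractions import Fraction

# Prior over hotels: decided by a fair coin flip.
prior_hotel_green = Fraction(1, 2)
prior_hotel_red = Fraction(1, 2)

# Chance of landing in a green room, given each hotel (20 rooms each).
p_green_room_given_hotel_green = Fraction(18, 20)
p_green_room_given_hotel_red = Fraction(2, 20)

# Total probability of waking up in a green room.
p_green_room = (prior_hotel_green * p_green_room_given_hotel_green
                + prior_hotel_red * p_green_room_given_hotel_red)

# Bayes' rule: P(Hotel Green | green room).
posterior_hotel_green = (prior_hotel_green * p_green_room_given_hotel_green
                         / p_green_room)

print(posterior_hotel_green)  # 9/10
```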

Case 1: Suppose that every morning, Tony is allowed to bet that he is in Hotel Green. If he bets ‘yes’ and is correct, he pockets $12. If he bets ‘yes’ and is wrong, he has to pay $52. (In other words, his payoff for a correct vote is $12, the payoff for a wrong vote is -$52.) What is the expected value of his betting if he always says ‘yes’ whenever he is in a green room?

For every 20 times that Tony says ‘yes’, he wins 18 times (wins $12x18) and he loses twice (loses $52x2), consistent with his posterior. On average he wins $5.60 per bet, or $2.80 per night. (He says “yes” to the bet 1 out of every 2 nights, because that is the frequency with which he finds himself in a green room.) This is a steady money pump in the student’s favor.

The correct calculation for Case 1 is:

average payoff per bet = (probability of being right)x(payoff if right)+ (probability of being wrong)x(payoff if wrong) = .9x12+.1x(-52) = 5.6.
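Since Case 1 can be run with real people, the figures of $5.60 per bet and $2.80 per night can also be checked by simulation. The following is a rough Monte Carlo sketch, not a definitive model: the function name `simulate_case_1`, the seed, and the assumption that Tony's room is drawn uniformly from the 20 rooms are all illustrative.

```python
import random

def simulate_case_1(nights, seed=0):
    """Return (average winnings per bet, average winnings per night) for Tony."""
    rng = random.Random(seed)
    total = 0
    bets = 0
    for _ in range(nights):
        in_hotel_green = rng.random() < 0.5        # fair coin picks the hotel
        green_rooms = 18 if in_hotel_green else 2  # green rooms out of 20
        in_green_room = rng.randrange(20) < green_rooms
        if in_green_room:                          # Tony only bets in a green room
            bets += 1
            total += 12 if in_hotel_green else -52
    return total / bets, total / nights

per_bet, per_night = simulate_case_1(200_000)
print(round(per_bet, 2), round(per_night, 2))
```

With a few hundred thousand simulated nights the two averages should land near 5.6 and 2.8, though the exact values depend on the seed.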

Case 2: Suppose that Tony doesn’t pocket the money, but instead the money is placed in a tip jar in the breezeway. Tony’s betting contributes $2.80 per night on average to the tip jar.

Case 3: Suppose there is nothing special about Tony, and all the students get to make bets. They will all make bets when they wake in green rooms, and add $2.80 per night to the tip jar on average. Collectively, the students add $56 per night to the tip jar on average. (If you think about it a minute, you will see that they add $216 to the tip jar on nights that they are assigned to Hotel Green and lose $104 on nights that they are assigned to Hotel Red.) If the money is distributed back to the students, they each are making $2.80 per night, the same steady money pump in their favor that Tony took advantage of in Case 1.
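The tip-jar totals in Case 3 are easy to tally. A small sketch with illustrative names, assuming every student in a green room bets and each bet pays $12 or -$52 independently:

```python
from fractions import Fraction

STUDENTS = 20
WIN, LOSS = 12, -52  # payoff per individual bet

# Hotel Green night: 18 students wake in green rooms, bet, and are right.
green_night_total = 18 * WIN   # 216
# Hotel Red night: 2 students wake in green rooms, bet, and are wrong.
red_night_total = 2 * LOSS     # -104

# Each kind of night happens half the time.
average_per_night = Fraction(green_night_total + red_night_total, 2)
average_per_student = average_per_night / STUDENTS

print(average_per_night)           # 56
print(float(average_per_student))  # 2.8
```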

Case 4: Now consider the case described in the OP. We already understand that the students will vote “yes” if they wake in a green room and that they expect to make money doing so. Now the rules are going to change, however, so that when all the green roomers unanimously vote “yes”, $12 are added to the tip jar if they are correct and $52 are subtracted if they are wrong. Since the students are assigned to Hotel Green half the time and to Hotel Red half the time, on average the tip jar loses $20 every night. Suddenly, the students are losing $1 a night!

Each time a student votes correctly, it is because they are all in Hotel Green, as per the initial set up of the problem in the OP. So all 18 green roomer votes are correct and collectively earn $12 for that night. The payoff is $12/18 per correct vote. Likewise, the payoff per wrong vote is -$52/2.

So the correct calculation for Case 4 is as follows:

average payoff per bet = (probability of being right)x(payoff if right)+ (probability of being wrong)x(payoff if wrong) = .9x(12/18)+.1x(-52/2) = -2.
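One way to see that the per-vote figure agrees with the tip-jar figure is to multiply it by the average number of votes cast per night. A hedged sketch of that bookkeeping, using the per-vote payoffs of $12/18 and -$52/2 stated for Case 4 (variable names are assumptions made for illustration):

```python
from fractions import Fraction

half = Fraction(1, 2)

# Tip-jar view: one shared payoff per night, 50/50 over hotels.
avg_jar_change = half * 12 + half * (-52)   # -20 per night
per_student = avg_jar_change / 20           # -1 per student per night

# Per-vote view: the shared payoff is split among the green-room voters.
payoff_per_correct_vote = Fraction(12, 18)
payoff_per_wrong_vote = Fraction(-52, 2)
ev_per_vote = (Fraction(9, 10) * payoff_per_correct_vote
               + Fraction(1, 10) * payoff_per_wrong_vote)  # -2

# Average number of green-room votes per night: 18 or 2, half the time each.
avg_votes_per_night = half * 18 + half * 2  # 10

print(ev_per_vote)                          # -2
print(ev_per_vote * avg_votes_per_night)    # -20
```

The two views agree: -2 per vote times 10 votes per night is the same -20 per night that the tip jar loses.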

So in conclusion, in the OP problem, the green roomer must recognize that he is dealing with Case #4 and not Case #1, in which the payoff is different (but not the posterior).

Comment author: mendel 22 May 2011 09:15:09AM 0 points [-]

I believe both of your computations are correct, and the fallacy lies in mixing up the payoff for the group with the payoff for the individual - which the frame of the problem as posed does suggest, with multiple identities that are actually the same person. More precisely, the probabilities for the individual are 90/10, but the probabilities for the groups are 50/50, and if you compute payoffs for the group (+$12/-$52), you need to use the group probabilities. (It would be different if the narrator ("I") offered the guinea pig ("you") the $12/$52 odds individually.)

byrnema looked at the result from the group viewpoint; you get the same result when you approach it from the individual viewpoint, if done correctly, as follows:

For a single person, the correct payoff is not $12 vs. -$52, but rather ($1 minus $6/18 to reimburse the reds, making $0.67) * 90% and ($1 minus $54/2 = -$26) * 10%, so each of the copies of the guinea pig is going to be out of pocket by 2/3* 0.9 + (-26) * 0.1 = 0.6 - 2.6 = -2, on average.

The fallacy of Eliezer's guinea pigs is that each of them thinks they get the $18 each time, which means that the 18 goes into his computation twice (squared) for their winnings (18 * 18/20). This is not a problem with anthropic reasoning, but with statistics.

A distrustful individual would ask themselves, "what is the narrator getting out of it", and realize that the narrator will see the -$12 / + $52 outcome, not the guinea pig - and that to the narrator, the 50/50 probability applies. Don't mix them up!

Comment author: byrnema 11 September 2009 09:13:39AM 0 points [-]

It was 3:30 in the morning just a short while ago, and I woke up with a bunch of nonsensical ideas about the properties of this problem, and then while I was trying to get back to sleep I realized that one of the ideas made sense. Evidence that understanding this problem for myself required a right-brain reboot.

I'm not surprised about the reboot: I've been thinking about this problem a lot, which signals to my brain that it's important, and it literally hurt my brain to think about why the green roomers were losing for the group when they thought they were winning, strongly suggesting I was hitting my apologist limit.

Comment author: Emile 09 September 2009 08:23:15AM 0 points [-]

Let the dilemma be, "I will ask all people who wake up in green rooms if they are willing to take the bet 'Create 1 paperclip if the logical coinflip came up heads, destroy 3 paperclips if the logical coinflip came up tails'. (Should they disagree on their answers, I will destroy 5 paperclips.)" Then a paperclip maximizer, before the experiment, wants the paperclip maximizers who wake up in green rooms to refuse the bet. But a conscious paperclip maximizer who updates on anthropic evidence, who wakes up in a green room, will want to take the bet, with expected utility ((90% * +1 paperclip) + (10% * -3 paperclips)) = +0.6 paperclips.

That last calculation doesn't look right to me: the paperclip maximizer in the green room still knows that there are other paperclip maximizers in red rooms who will refuse the bet whether or not they rely on anthropic evidence. So the expected utility of taking the bet would be 100% * -5 paperclips.

Or did I misunderstand something?

Comment author: wedrifid 09 September 2009 05:00:10PM *  3 points [-]

Or did I misunderstand something?

Red Clippy doesn't get a vote.

Comment author: PlaidX 09 September 2009 06:59:51AM 0 points [-]

Can someone come up with a situation of the same general form as this one where anthropic reasoning results in optimal actions and nonanthropic reasoning results in suboptimal actions?

Comment author: PlaidX 09 September 2009 07:13:25AM *  8 points [-]

How about if the wager is that anybody in any room can guess the outcome of the coinflip, and if they get it right they win 1$ and if they get it wrong they lose 2$?

If you still think it's 50% after waking up in a green room, you won't take the bet, and you'll win 0$; if you think it's 90%, you'll take the bet and come out 14$ ahead on balance, with two of you losing 2$ each and 18 of you getting 1$.

Doesn't this show anthropic reasoning is right as much as the OP shows it's wrong?
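The numbers in this wager can be sketched as follows. This is only an illustration: the helper names are made up, and it assumes that anthropic updaters in green rooms guess heads while those in red rooms guess tails.

```python
from fractions import Fraction

WIN, LOSS = 1, -2  # dollars for a right / wrong guess in this wager

def ev_of_guessing(credence):
    """Expected value of betting on a guess held with the given credence."""
    return credence * WIN + (1 - credence) * LOSS

ev_at_50 = ev_of_guessing(Fraction(1, 2))    # negative, so a 50% believer declines
ev_at_90 = ev_of_guessing(Fraction(9, 10))   # positive, so a 90% believer bets

def group_total(heads):
    """Total winnings if greens guess heads and reds guess tails (assumed)."""
    greens, reds = (18, 2) if heads else (2, 18)
    right = greens if heads else reds
    wrong = reds if heads else greens
    return right * WIN + wrong * LOSS

print(ev_at_50, ev_at_90)                     # -1/2 7/10
print(group_total(True), group_total(False))  # 14 14
```

Here each copy has its own independent bet, so the individual expected value (7/10 per copy, times 20 copies) matches the group total of 14.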

Comment author: Dagon 08 September 2009 11:22:30PM *  0 points [-]

I think you're missing a term in your second calculation. And why are anthropism and copies of you necessary for this puzzle? I suspect the answer will indicate something I'm completely missing about this series.

Take this for straight-up probability:

I have two jars of marbles, one with 18 green and 2 red, the other with 18 red and two green. Pick one jar at random, then look at one marble from that jar at random.

If you pick green, what's the chance that your jar is mostly green? I say 90%, by fairly straightforward application of Bayes' rule.

I offer a wager: you get $1 per green and lose $3 per red marble in the jar you chose.

After seeing a green marble, I think your EV is $5.60. After seeing a red marble, I think your EV is $0 (you decline the bet). If you are forced to make the wager before seeing anything, conditional on drawing green, I think your EV is $2.80. I calculate it thus: 50% to get the mostly-green jar, and 90% of the time you will then see green and take the bet, which is worth +$1*18 - $3*2 in this case. 50% to get the mostly-red jar, and 10% of the time you will then draw green, worth +$1*2 - $3*18. 0.5 * 0.9 * (1 * 18 - 3 * 2) + 0.5 * 0.1 * (1 * 2 - 3 * 18) = 2.80, which is consistent: half the time you pick green, with EV of 5.60.
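The marble-jar figures can be reproduced by enumerating both jars exactly. A minimal sketch with hypothetical names (`jars`, `wager_value`), assuming each jar is equally likely and the marble is drawn uniformly from the chosen jar:

```python
from fractions import Fraction

# (green marbles, red marbles) in each jar
jars = {"mostly_green": (18, 2), "mostly_red": (2, 18)}

def wager_value(green, red):
    """Payout of the wager: +$1 per green and -$3 per red marble in the jar."""
    return 1 * green - 3 * red

p_green_draw = Fraction(0)
ev_precommitted = Fraction(0)  # EV of "take the bet only if I draw green"
for green, red in jars.values():
    p_jar = Fraction(1, 2)
    p_green_given_jar = Fraction(green, green + red)
    p_green_draw += p_jar * p_green_given_jar
    ev_precommitted += p_jar * p_green_given_jar * wager_value(green, red)

# Conditioning on having drawn green rescales by P(green draw).
ev_after_green = ev_precommitted / p_green_draw

print(float(ev_precommitted))  # 2.8
print(float(ev_after_green))   # 5.6
```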

I think you left out the probability that you'll get green and take the bet in each of your 0.5 probabilities for the conditional strategy. Multiply 0.9 into the first term and 0.1 into the second, and everything gets consistent.

Comment author: Eliezer_Yudkowsky 09 September 2009 12:28:30AM 0 points [-]

The problem is that we aren't asking one randomly selected person, we're asking all of the green ones (they have to agree unanimously for the Yes vote to go through).

Comment author: Dagon 09 September 2009 03:16:34PM 0 points [-]

Ah, I see. You're asking all the green ones, but only paying each pod once. This feels like reverse-weighting the payout, so it should still be -EV even after waking up, but I haven't quite worked out a way to include that in the numbers...

Comment author: Christian_Szegedy 08 September 2009 10:51:05PM *  0 points [-]

Assume that each agent has his own game (that is, one game for each agent). That is, there are overall 18 (or 2) games (depending on the result of the coin flip).

Then the first calculation would be correct in every respect, and it makes sense to say yes from a global point of view. (And also with any other reward matrix, the dynamic update would be consistent with the a priori decision all the time.)

This shows that the error made by the agent was to implicitly assume that he has his own game.

Comment author: wedrifid 09 September 2009 05:31:35PM 0 points [-]

In personal conversation, Nick Bostrom suggested that a division-of-responsibility principle might cancel out the anthropic update - i.e., the paperclip maximizer would have to reason, "If the logical coin came up heads then I am 1/18th responsible for adding +1 paperclip, if the logical coin came up tails then I am 1/2 responsible for destroying 3 paperclips." I confess that my initial reaction to this suggestion was "Ewwww", but I'm not exactly comfortable concluding I'm a Boltzmann brain, either.

I would perhaps prefer to use different language in the description but this seems to be roughly the answer to the apparent inconsistency. When reasoning anthropically you must decide anthropically. Unfortunately it is hard to describe such decision making without sounding either unscientific or outright incomprehensible.

I'm rather looking forward to another Eliezer post on this topic once he has finished dissolving his confusion. I've gained plenty from absorbing the posts and discussions and more from mentally reducing the concepts myself. But this stuff is rather complicated and to be perfectly honest, I don't trust myself to not have missed something.

Comment author: rwallace 08 September 2009 11:20:45PM 0 points [-]

In this comment:

http://lesswrong.com/lw/17d/forcing_anthropics_boltzmann_brains/138u

I put forward my view that the best solution is to just maximize total utility, which correctly handles the forcing anthropics case, and expressed curiosity as to whether it would handle the outlawing anthropics case.

It now seems my solution does correctly handle the outlawing anthropics case, which would seem to be a data point in its favor.

Comment author: CarlShulman 09 September 2009 06:37:42PM 1 point [-]

Maximizing total hedonic utility fails the outlawing anthropics case: substitute hedons for paperclips.

Comment author: rwallace 09 September 2009 11:02:03PM 0 points [-]

I don't think I understand your claim here. We agree that my solution works if you measure utility in paperclips? Why do you think it fails if you measure utility in hedons?

Comment author: lavalamp 08 September 2009 09:45:38PM *  0 points [-]

How about give all of your potential clones a vote, even though you can't communicate?

So, in one case, 18 of you would say "Yes, take the bet!" and 2 would say "No, let me keep my money." In the other case, 18 would say no and two would say yes. In either case, of course, you're one of the ones who would vote yes. OK, that leaves us tied. So why not let everyone's vote be proportional to what they stand to gain/lose? That leaves us with 20 * -3 vs. 20 * 1. Don't take the bet.
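The stake-weighted tally described here can be written out explicitly. This is a sketch of one reading of the proposal, with illustrative names; it pools the votes of both hypothetical worlds, as the comment does.

```python
GAIN_PER_GREEN = 1  # dollars each green-room copy gains if the bet is taken
LOSS_PER_RED = 3    # dollars each red-room copy loses if the bet is taken

# (green copies, red copies) in each hypothetical world
worlds = {"heads": (18, 2), "tails": (2, 18)}

# Unweighted vote: greens say yes, reds say no.
yes_votes = sum(greens for greens, _ in worlds.values())
no_votes = sum(reds for _, reds in worlds.values())

# Weighted vote: each copy's vote counts in proportion to its stake.
yes_weight = sum(greens * GAIN_PER_GREEN for greens, _ in worlds.values())
no_weight = sum(reds * LOSS_PER_RED for _, reds in worlds.values())

decision = "take the bet" if yes_weight > no_weight else "don't take the bet"

print(yes_votes, no_votes)    # 20 20
print(yes_weight, no_weight)  # 20 60
print(decision)               # don't take the bet
```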

(Yes, I realize half the people that just voted above don't exist. We just don't know which half...)

Comment author: ArthurB 10 September 2009 06:25:43PM -1 points [-]

As it's been pointed out, this is not an anthropic problem; however, there still is a paradox. I may be stating the obvious, but the root of the problem is that you're doing something fishy when you say that the other people will think the same way and that your decision will be theirs.

The proper way to make a decision is to have a probability distribution on the code of the other agents (which will include their prior on your code). From this I believe (but can't prove) that you will take the correct course of action.

Newcomb-like problems fall in the same category; the trick is that there is always a belief about someone's decision making hidden in the problem.

Comment deleted 08 September 2009 08:21:13PM *  [-]
Comment author: orthonormal 08 September 2009 08:49:23PM 3 points [-]

The copies in red rooms don't get to vote in this setup.

Comment author: Christian_Szegedy 08 September 2009 10:38:33PM *  0 points [-]

Thanks for pointing that out. Now I understand the problem.

However, I still think that the mistake made by the agent is the implicit assumption that he is the only one influencing the outcome.

Since all of the copies assume that they solely decide the outcome, they overestimate the reward after the anthropic update (each of the copies claims the whole reward for his decision, although the decision is collective and each vote is necessary).

Comment author: orthonormal 15 September 2009 06:35:00AM 1 point [-]

By the way, please don't delete a comment if you change your mind or realize an error; it makes the conversation difficult for others to read. You can always put in an edit (and mark it as such) if you want.

I'd only delete one of my comments if I felt that its presence actually harmed readers, and that there was no disclaimer I could add that would prevent that harm.

Comment author: Christian_Szegedy 15 September 2009 06:41:47AM *  0 points [-]

OK, sorry. (In this special case, I remember thinking that your remark was perfectly understandable even without the context.)

Comment author: SforSingularity 08 September 2009 08:47:39PM -2 points [-]

Perhaps we should look at Dresher's Cartesian Camcorder as a way of reducing consciousness, and thereby eliminate this paradox.

Or, to turn it around, this paradox is a litmus test for theories of consciousness.