This is the eighth post in my series on Anthropics. The previous one is Lessons from Failed Attempts to Model Sleeping Beauty Problem. The next one is Beauty and the Bets.
Introduction
Suppose we take the insights from the previous post, and directly try to construct a model for the Sleeping Beauty problem based on them.
We expect a halfer model, so:

P(Heads) = P(Tails) = 1/2

On the other hand, in order not to repeat Lewis' Model's mistakes:

P(Heads|Monday) = P(Tails|Monday) = 1/2

But both of these statements can only be true if

P(Monday) = 1

And, therefore, P(Tuesday), apparently, has to be zero, which sounds obviously wrong. Surely the Beauty can be awakened on Tuesday!
At this point, I think, you won't be surprised if I tell you that there are philosophers who are eager to bite this bullet and claim that the Beauty should, indeed, reason as if she can't possibly be awoken on Tuesday. I applaud their dedication to brute-forcing the problem and leave them to it, preferring to follow less bizarre approaches.
As I mentioned in the previous post, we need to find the core flawed assumption shared by all three models and fix it before we can construct the correct model. That's what we are going to do now.

On the other hand, this dead end may create an impression that my analysis from the previous post is wrong - that this is exactly why my insights lead nowhere, and that one of the three models actually is sound for the Sleeping Beauty problem, despite all the issues mentioned previously.

Thankfully, there is a way to kill two birds with one stone.
Statistical Analysis
Let's simulate the Sleeping Beauty experiment multiple times and write down the day and the coin side on every awakening. Here is an implementation in Python:
import random

n = 100000  # number of simulated experiments

def sleepingBeauty():
    days = ['Monday']
    if random.random() >= 0.5:  # result of the coin toss
        coin = 'Tails'
        days += ['Tuesday']
    else:
        coin = 'Heads'
    return days, coin

ListSB = []
for i in range(n):
    days, coin = sleepingBeauty()
    for day in days:
        ListSB.append(coin + '&' + day)
And then do the same with all the models.
def modelElga():
    # samples one of the three awakenings with probability 1/3 each
    rand = random.random()
    if rand <= 1/3:
        return 'Heads&Monday'
    elif rand <= 2/3:
        return 'Tails&Monday'
    else:
        return 'Tails&Tuesday'

ListE = []
for i in range(int(1.5 * n)):  # matches the expected length of ListSB
    outcome = modelElga()
    ListE.append(outcome)
def modelLewis():
    if random.random() >= 0.5:  # result of the coin toss
        coin = 'Tails'
        day = 'Monday' if random.random() >= 0.5 else 'Tuesday'
    else:
        coin = 'Heads'
        day = 'Monday'
    return coin + '&' + day

ListL = []
for i in range(int(1.5 * n)):
    outcome = modelLewis()
    ListL.append(outcome)
def modelUpdating():
    day = 'Monday' if random.random() >= 0.5 else 'Tuesday'
    coin = 'Heads' if random.random() >= 0.5 else 'Tails'
    outcome = coin + '&' + day
    if outcome != 'Heads&Tuesday':
        return outcome
    # implicitly returns None, rejecting the impossible Heads&Tuesday outcome

ListU = []
for i in range(int(1.5 * n)):
    outcome = modelUpdating()
    if outcome:
        ListU.append(outcome)
As a result, we've got four lists of outcomes: ListSB, ListE, ListL and ListU.
If one of the models accurately represents the Sleeping Beauty problem, we shouldn't be able to easily distinguish between ListSB and the list produced by that model. They wouldn't be identical, of course, since we are talking about random events, but their statistical properties have to be similar, so that if a person were given either of them, they couldn't guess which was which better than chance.
It's fairly easy to see that ListL immediately fails this test. The frequency of both Tails&Monday and Tails&Tuesday in ListSB is around 1/3, while in ListL it's 1/4 for each of them. But ListE and ListU are not much better. Yes, their frequencies for individual outcomes are correct. And yet these lists are still easily distinguishable from the target.

In ListE and ListU, all three outcomes - Heads&Monday, Tails&Monday, Tails&Tuesday - are spread randomly. But in ListSB, the outcome Tails&Tuesday is always preceded by Tails&Monday. This is because all three models assume that Beauty's previous and next awakenings happen independently of each other, while in the experiment itself there is an order between them.

These models are, of course, applicable to different problems, where the condition of independence is satisfied. And now we can see the flawed assumption shared by all three models. They treat the current awakening as somehow sampled, assuming that Monday and Tuesday, and therefore Heads&Monday, Tails&Monday and Tails&Tuesday, are mutually exclusive outcomes from which a sample space can be constructed. However, they are not: on Tails, both the Monday and the Tuesday awakenings happen in the same iteration of the experiment, one after the other, so they are not mutually exclusive.
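This distinguishing test is easy to run. Below is a minimal, self-contained sketch (function names and list sizes are mine): it measures how often a Tails&Tuesday entry is immediately preceded by Tails&Monday, which is always the case in the true simulation but only sometimes in an Elga-style list:

```python
import random

def sleeping_beauty_list(n):
    # n experiments; on Tails two ordered awakenings are recorded
    out = []
    for _ in range(n):
        if random.random() >= 0.5:  # result of the coin toss
            out += ['Tails&Monday', 'Tails&Tuesday']
        else:
            out += ['Heads&Monday']
    return out

def elga_list(n):
    # n awakenings sampled independently, with probability 1/3 each
    out = []
    for _ in range(n):
        r = random.random()
        out.append('Heads&Monday' if r < 1/3 else
                   'Tails&Monday' if r < 2/3 else 'Tails&Tuesday')
    return out

def ordered_fraction(lst):
    # fraction of Tails&Tuesday entries immediately preceded by Tails&Monday
    tt = [i for i, x in enumerate(lst) if x == 'Tails&Tuesday']
    return sum(i > 0 and lst[i - 1] == 'Tails&Monday' for i in tt) / len(tt)

random.seed(0)
print(ordered_fraction(sleeping_beauty_list(10000)))  # 1.0: always ordered
print(ordered_fraction(elga_list(15000)))             # around 1/3: no order
```

Any statistic sensitive to the order of entries separates the two lists, even though their per-outcome frequencies match.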
Probability and Time
It may not be immediately clear why Monday and Tuesday are not mutually exclusive in Sleeping Beauty. It intuitively feels that they have to be exclusive because they are separated by time. But this is a confusion between a layman understanding of mutual exclusivity and a probability theoretic one. And the core difference is in how our intuition and probability theory treat the concept of time.
For us, time is a continuous stream, where the next moment depends on the previous one due to causality.
For probability theory, time is discrete and every moment is independent of the others. It's a purely logical entity without any physical connotations. "Next moment in time" simply means "next iteration of the experiment".
Mutually exclusive events, according to probability theory, are events that can't happen in the same iteration of the experiment - exclusive in "logical time" and not "physical time".
Consider the Single-Awakening problem from the previous post. We can reformulate it so that instead of Monday and Tuesday awakenings there were awakenings in two different rooms:
If the coin comes Heads, Sleeping Beauty is awakened in Room 1. If the coin comes Tails, Sleeping Beauty is either awakened in Room 1 with 50% chance, or in Room 2, otherwise.
Even if we make sure to awaken the Beauty at the same physical time on Heads and on Tails, the outcomes Heads&Room1, Tails&Room1 and Tails&Room2 are still mutually exclusive for the purposes of probability theory. Uncertainty about physical time is treated by probability theory exactly the same way as uncertainty about physical space. What really matters is that only one particular outcome happens in a particular iteration of the experiment.
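A quick simulation of this room-based reformulation (a sketch with my own outcome labels) shows it has exactly the same outcome frequencies as the days-based Single-Awakening problem:

```python
import random
from collections import Counter

def singleAwakeningRooms():
    # exactly one awakening per iteration: the outcomes are mutually exclusive
    if random.random() < 0.5:
        return 'Heads&Room1'
    return 'Tails&Room1' if random.random() < 0.5 else 'Tails&Room2'

random.seed(0)
trials = 100000
counts = Counter(singleAwakeningRooms() for _ in range(trials))
for outcome in sorted(counts):
    print(outcome, round(counts[outcome] / trials, 2))
# Heads&Room1 ~ 0.5, Tails&Room1 ~ 0.25, Tails&Room2 ~ 0.25,
# just like Heads&Monday, Tails&Monday, Tails&Tuesday in the days version
```

Renaming the outcomes changes nothing: the same mathematical model covers both.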
Likewise, consider this elementary case:
If the coin comes Heads, you get five dollars on the next day.
Is getting five dollars mutually exclusive with the coin being Heads? Of course not! On the contrary, it's a direct consequence of it. The time delay is obviously irrelevant from a probability-theoretic standpoint.

So why the confusion with the Sleeping Beauty problem? Why do we suddenly decide that Tails&Monday and Tails&Tuesday are mutually exclusive?
Effects of Amnesia
Some level of confusion is definitely added by the fact that Sleeping Beauty experiences amnesia between awakenings. I hope that no one seriously assumes that it somehow changes the statistical properties of the setting. But it may seem that the Beauty has to reason as if Monday and Tuesday are exclusive anyway, based on the knowledge available to her.

After all, the Beauty doesn't know which awakening is which; she can't distinguish them better than chance. And probability is in the mind - it's about modelling uncertainty. Lacking any information, we default to an equiprobable prior.

Such reasoning would be correct if the Beauty actually had no information about her awakenings except the fact that there are three distinct, though indistinguishable to her, possibilities. Then the situation would be isomorphic to the No-Coin-Toss problem, and using Elga's Model to describe Beauty's knowledge state would be appropriate.

But in our case, the Beauty is well aware of the setting of the experiment. She knows that her awakening routine is determined by the coin toss. She knows that the Tails&Monday and Tails&Tuesday awakenings follow sequentially. In spite of the memory loss, this information is fully available to her. And as a good Bayesian, she has to use all the relevant available information and not delude herself into thinking that her current awakening happens independently of the next/previous one.
Failure of Centred Possible Worlds
How come philosophers have been missing all this for decades? Well, they weren't, exactly. As always, they figured out a clever-sounding reason to disregard these concerns. Citing a footnote from Bayesian Beauty by Silvia Milano:
In a sense, speaking about refining the state space may seem suspicious: after all, ws1 and ws2 both happen (sequentially) if the result of the coin toss is Heads. So, from an atemporal point of view, they are not mutually exclusive. However, here we are not interested in the atemporal viewpoint, but in the temporally located viewpoint that Beauty occupies at the time that she considers the problem. From this temporally located perspective, ws1 and ws2 are indeed mutually exclusive.
"Suspicious" is an understatement. The jump to such a conclusion immediately raises several serious questions.
How come our interest in something else is affecting the statistical properties of events? Why do we think that Beauty has to consider the problem from a "temporally located viewpoint"? Where does all this talk about temporal and atemporal points even come from, considering that probability theory doesn't have any special way to deal with physical time? Thankfully, the answers can be found in the next footnote of the same paper.
The indexical states within the state space Ω′ can be interpreted as centred worlds. Using centred worlds to capture the content of self-locating propositions is a standard move in the philosophical literature Lewis (1979).
As far as I can tell, David Lewis' Attitudes De Dicto and De Se, which Bayesian Beauty cites, is the source of all our troubles. There is a lot I could say about this paper, arguing against its individual points one by one, and maybe someday I will; after all, I can definitely see how many of the confusions in anthropics originate from it.
But whatever its general philosophical merits, right now we are interested in something very specific: the justifications of the probability theory manipulations that since became a "standard move in the philosophical literature".
And there is not much. If, just like me, you've hoped to see at least some math in the paper, you'd be disappointed. David Lewis' reasoning on this matter can be abridged to a simple "Why not?":
We can have beliefs whereby we locate ourselves in logical space. Why not also beliefs whereby we locate ourselves in ordinary time and space? We can self-ascribe properties of the sort that correspond to propositions. Why not also properties of the sort that don't correspond to propositions? We can identify ourselves as members of subpopulations whose boundaries follow the borders of the worlds. Why not also as members of subpopulations whose boundaries don't follow the borders of the worlds?
Why not? No reason! We can and we do have beliefs whereby we locate ourselves in ordinary time and space; whereby we self-ascribe properties that don't correspond to propositions; and whereby we identify ourselves as members of subpopulations whose boundaries don't follow the borders of the worlds.
Don't get me wrong, that's a fair question, that deserves a fair answer. Which at this point should be clear - by doing so you may end up contradicting probability theory.
And if you want to somehow expand probability theory, to lawfully be able to do this new exciting thing, then you're expected to engage with the math, state new axioms, prove theorems, and generally be more substantial than asking "Why not?"
But this wasn't the goal of David Lewis. He was writing a philosophy paper, not a math one. Maybe he expected that the formal mathematical justifications would be easily made later. Maybe he didn't even understand that he was contradicting probability theory as it is. The closest to reasoning about mathematics is this:
Then it is interesting to ask what happens to decision theory if we take all attitudes as de se. Answer: very little. We replace the space of worlds by the space of centered worlds, or by the space of all inhabitants of worlds. All else is just as before. Whatever the points of the space of possibilities may be, we have probability distributions over the space and assignments of utility values to the points. For any rational agent at any time there is a pair of a probability distribution and a utility assignment. The probabilities change under the impact of his perception; the probabilities and utilities jointly govern his action. His degrees of belief at a time are got by taking the total probability of regions of the space; his degrees of desirability are got by integrating the point-by-point utilities, weighted by probability, over regions of the space. But since the space of possibilities is no longer the space of worlds, its regions to which degrees of belief and desirability attach are no longer propositions. Instead they are properties.
There is some ambiguity here.
Probability theory isn't based on the philosophical concept of "possible worlds" and their metaphysical reality that philosophers speculate about. It is based on a triplet (Ω, F, P), where Ω is the sample space of elementary outcomes, F is some sigma-algebra of it, and P is the measure function, the domain of which is F.

As long as the outcomes in Ω are mutually exclusive, everything goes. Once again, remember the two versions of the Single-Awakening problem: the one where the awakenings happen on different days and the one where they happen in different rooms.
The same mathematical model describes both of them. Philosophers may say that the first type of uncertainty is "temporal" while the second is "spatial". But for the sake of probability theory such semantics doesn't matter in the slightest. Math stays correct regardless of what you call your variables.
So, if by "replace the space of worlds by the space of centered worlds" Lewis meant being able to talk about such cases of "temporal uncertainty" then of course nothing really changes. In this case no rewriting of probability theory is required and everything is quite trivial.
But consider these problems instead:
You are to awake on Monday and on Tuesday. You are currently awake. What is the probability that it's Monday?
You are to be put in Room 1 and then Room 2. You are currently in a room. What is the probability that it's Room 1?
Do you see the difference? Here both outcomes happen, not only one of them, so they are not mutually exclusive. In order to apply probability theory, we need to specify what "currently" means. Are we talking about the first moment in time? The second one? A random one? One selected according to some specific rule? The answer differs depending on how we remove this ambiguity, and there is no answer until we do.
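The ambiguity can be made concrete with a short sketch (the three disambiguation rules are my illustrative assumptions): the probability of "currently being in Room 1" depends entirely on which rule we pick:

```python
import random

rooms = ['Room1', 'Room2']  # you are put in Room 1 and then in Room 2

def currentRoom(rule):
    # "currently" must be disambiguated before probability theory applies
    if rule == 'first':
        return rooms[0]
    if rule == 'second':
        return rooms[1]
    return random.choice(rooms)  # rule == 'random'

random.seed(0)
trials = 100000
for rule in ['first', 'second', 'random']:
    p = sum(currentRoom(rule) == 'Room1' for _ in range(trials)) / trials
    print(rule, p)
# first: 1.0, second: 0.0, random: ~0.5 - there is no single
# answer until the meaning of "currently" is fixed
```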
Another way to interpret "replace the space of worlds by the space of centered worlds" - the kind of interpretation we can observe in all the failed attempts to model the Sleeping Beauty problem - is to replace the sample space Ω with a space Ω′ which consists of non-mutually-exclusive outcomes, and yet treat them as if they were exclusive. That is, to claim that in cases of "temporal uncertainty" one always has to assume that the current time moment means a random time moment.
I'm not quite sure that philosophers generally understand the difference between these two cases. When you talk about them in philosophical terms, both are just "temporal uncertainty", so it's very easy to confuse them and decide that they are one and the same. And if we can use probability theory in one case, then indeed, why not in the other?
Well, here is why not: because in the latter case you are making baseless assumptions and contradicting the axioms of probability theory, even if you say such magic words as "centred possible worlds", "de se attitudes" and "temporal perspective". And when this happens, you are bound to encounter paradoxes. Which is exactly what happened.
In a saner world, people would treat the Sleeping Beauty paradox as a clear demonstration that Lewis' ideas do not work. A classic case of proof by contradiction. As soon as we accept the premise, we can observe that a fair coin becomes unfair, or we get an update for the probability estimate despite receiving no new information, or an ability to predict the outcome of a fair coin better than chance.
In our world, people just can't stop begging the question. They accept the framework and then assume that Sleeping Beauty is the kind of problem that has to be solved using it - that the Beauty has to reason from a "temporal perspective" - despite there being no justification for it, and despite all the absurdity that follows. Some claim that the Beauty has to treat her Monday and Tuesday awakenings as two different people. Some go as far as to claim that we should reason as if our every observer-moment is randomly sampled!
Math vs Intuition
Why are people so eager to trust that we can use probability theory with such a temporal perspective? Once again, because it fits our naive intuitions.
If I forget what the current day of the week is in my regular life, it's only natural to start from a 1/7 prior per day and work from there. I can do it because the causal process that leads to my forgetting such information can be roughly modeled as a low-probability occurrence which can happen on any day.

It wouldn't be the case if I were guaranteed to also forget the current day of the week on the next six days after forgetting it on the first one. That would be a different causal process, with different properties - causation between the forgettings - and it has to be modeled differently. But we do not actually encounter such situations in everyday life, and so our intuition is caught completely flat-footed by them.
Consider the assumption that on an awakening Sleeping Beauty learns that "she is awoken today". What does it actually mean? A natural interpretation is that the Beauty is awoken on Monday xor Tuesday. It's easy to see why it's true for the Single-Awakening and No-Coin-Toss problems: in every iteration of the experiment, if the Beauty is awakened on Monday she is not awakened on Tuesday, and vice versa.

But it doesn't hold for the Sleeping Beauty problem, where individual awakenings do not happen independently. On Tails, both the Monday and the Tuesday awakenings happen, so the Beauty can't possibly learn that she is awoken on Monday xor Tuesday - this statement is wrong in 50% of cases. What the Beauty actually learns is that "she is awoken at least once": on Monday and (Tuesday or not Tuesday).
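This is easy to verify with a small sketch (assuming the standard fair coin): the statement "awoken on Monday xor Tuesday" fails in about half of the experiments, while "awoken at least once" always holds:

```python
import random

def awakenings():
    # the Monday awakening always happens; Tuesday is added on Tails
    return ['Monday', 'Tuesday'] if random.random() >= 0.5 else ['Monday']

random.seed(0)
trials = 100000
runs = [awakenings() for _ in range(trials)]
xor_holds = sum(len(days) == 1 for days in runs) / trials
at_least_once = sum(len(days) >= 1 for days in runs) / trials
print(round(xor_holds, 1))  # ~0.5: "Monday xor Tuesday" fails half the time
print(at_least_once)        # 1.0: "awoken at least once" holds in every run
```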
And yet the assumption that the Beauty should be able to reason about her individual awakenings feels so natural, so intuitive! It doesn't even seem right to question it. We are so used to situations where different physical moments in time can be lawfully modeled by probability theory that we assume it always has to be the case. That math has to care about our identity, our first-person perspective, or our feeling of the passage of time.
But math doesn't care. It's just a truth preserving mechanism and nothing more. The moment you've bent the rules, even a tiny bit, even when it really feels like the rules have to be different, you are not preserving the truth anymore. Not arriving at the correct conclusion. There is no arguing around that. Just give up, find the mistake and fix it.
Correct Model for the Sleeping Beauty Problem
Now, with all these things considered, we are finally ready to construct the correct model.
As previous and future awakenings are not independent, we can't use awakenings as elementary outcomes. But we can talk about the experiment as a whole. And as soon as we do it - everything adds up to normality.
Previously we were confused how P(Monday) can be equal to 1. We thought that it means that P(Tuesday) = 0 - an obviously wrong statement.

But now we understand that Monday and Tuesday are not mutually exclusive events. More specifically, events "this awakening happens during Monday/Tuesday" are ill-defined, but events "in this experiment the Monday/Tuesday awakening happens" have coherent probabilities:

P(Monday) = 1, P(Tuesday) = 1/2

We can talk about conditional probabilities and intersections between the Monday and Tuesday events:

P(Monday|Tuesday) = 1, P(Tuesday|Monday) = 1/2, P(Monday&Tuesday) = P(Tuesday) = 1/2
Meanwhile, the sample space for the coin toss is obvious:

Ω_coin = {Heads, Tails}, P(Heads) = P(Tails) = 1/2

We can combine it with the Monday event. The Monday awakening happens regardless of the result of the coin toss, so:

P(Heads&Monday) = P(Heads) = 1/2
P(Tails&Monday) = P(Tails) = 1/2

Likewise we can combine the coin toss and the Tuesday event. The Tuesday awakening happening in the experiment means that the coin is definitely Tails, and vice versa:

P(Tails&Tuesday) = P(Tails) = P(Tuesday) = 1/2
P(Heads&Tuesday) = 0
P(Tails|Tuesday) = P(Tuesday|Tails) = 1

Which, by the way, is the reason why Elga's proof for thirdism doesn't work.

By combining all the sample spaces together:

Ω = {Heads&Monday&¬Tuesday, Tails&Monday&Tuesday}, with probability 1/2 for each outcome

Which is equal to the sample space of the coin toss, just with different names for the outcomes. And no wonder: the outcome of the coin is the only unknown element in the setting.

Heads, Heads&Monday, Heads&¬Tuesday and ¬Tuesday are all the exact same outcome. Likewise, Tails and Tails&Monday&Tuesday are the same.
On every awakening, the Beauty always learns that she is awake and that the Monday awakening has happened, just as she initially expected.

Whether Monday is today or was yesterday is irrelevant: it is the same outcome anyway. There is no new information, and so the Beauty doesn't update her probability estimate for the coin toss outcome.
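The no-update claim can also be checked numerically. In this sketch (names are mine), conditioning on "the Monday awakening happened in this experiment" leaves the frequency of Heads at the unconditional 1/2:

```python
import random

def experiment():
    # returns the coin and the set of awakenings that happen in this run
    if random.random() >= 0.5:
        return 'Tails', {'Monday', 'Tuesday'}
    return 'Heads', {'Monday'}

random.seed(0)
trials = 100000
runs = [experiment() for _ in range(trials)]
monday_runs = [coin for coin, days in runs if 'Monday' in days]
print(len(monday_runs) == trials)  # True: P(Monday) = 1
p_heads = sum(coin == 'Heads' for coin in monday_runs) / len(monday_runs)
print(round(p_heads, 1))  # 0.5: learning "Monday happened" moves nothing
```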
Once again, as with every "anthropic problem", as long as we actually follow probability theory faithfully, without trying to manipulate it in favor of our flawed intuitions, everything is absolutely obvious. No fair coin suddenly becoming unfair. No precognitive powers. No contradiction of conservation of expected evidence. No mystery to write philosophical papers about for decades.

And no two valid answers either. We can now strictly say that a per-experiment scoring rule produces valid probabilities for the Sleeping Beauty setting, while a per-awakening one does not, because it counts the same elementary outcome twice.

The correct model represents the only lawful way to reason about the Sleeping Beauty problem without smuggling in the unjustified assumption that the current awakening is randomly sampled. It passes the statistical test with flying colors, as it essentially reimplements the sleepingBeauty() function:
def modelCorrect():
    if random.random() >= 0.5:  # result of the coin toss
        return ['Heads&Monday']
    else:
        return ['Tails&Monday', 'Tails&Tuesday']

ListC = []
for i in range(n):
    outcome = modelCorrect()
    ListC += outcome
It also correctly deals with every betting scheme, be it per day or per awakening. And in the next post we will specifically explore the topic of betting in regards to the Sleeping Beauty problem.
The next post in the series is Beauty and the Bets.
This is the crux of our disagreement.
The Beauty doesn't know only about one pass - she knows about their relation as well. And because of it she can't reason as if they happen at random. You need to address this point before we can move on, because all your further reasoning is based on the incorrect premise that the Beauty knows less than she actually knows.

She absolutely has this ability: as long as she knows the procedure - that TT and TH follow in pairs - she can make such conditional statements: "if the coins are currently TT, then they either will be TH tomorrow or were TH yesterday". It's very different from not knowing anything whatsoever about the state of the coin on the next day. The fact that you for some reason feel that it should not matter is irrelevant. It's still clearly more than no information whatsoever and, therefore, she can't justifiably reason as if she doesn't have any.

On the other hand, if the memory wipe removed this knowledge from her head as well - if the only thing she truly knew was that she is currently awakened in one of three possible states, TH, HT or TT, and had no idea of the relationship between them - then, and only then, she would be justified to reason as you claim she should.
No, I treat it as an event that the Beauty doesn't expect to observe, and therefore she doesn't update when she indeed doesn't observe it, according to the law of conservation of expected evidence. We are talking about Beauty's perspective after all, not some outside view.
Suppose an absolutely trustworthy source tells you that the coin is Heads side up. Then you go and look at the coin and indeed it's Heads side up. What should have been your probability that the coin is Tails side up before you looked at it?

It should be zero. You already knew the state of the coin before you looked at it; you got no new information. Does it mean that the Tails side of the coin doesn't exist? No, of course not! It's just that you didn't expect that the coin could possibly be Tails in this particular case, based on your knowledge state.
I was going to post a generalized way of reasoning under amnesia in a future post, but here is some of it: getting your memory of some evidence erased just brings you back to the state where you didn't have this particular evidence. And an expected memory wipe can only make you less confident in your probability estimate, not more.
In this dice-rolling case, initially my P(6) = 1/6; then, after you tell me that the result is odd, P(6|Odd) = 0; and then, when my memory is wiped, I'm back to P(6) = 1/6, and the knowledge that you've already told me whether the result is even or odd doesn't help: P(6|Even or Odd) = 1/6.

Likewise, in Sleeping Beauty I initially have P(Heads) = 1/2. Then I awaken exactly as I expected in the experiment and still have P(Heads|Awake) = 1/2. Now suppose that I'm awakened once more. If there were no memory wipe, I'd learn that I'm awake a second time, which would bring me to P(Heads|Two Awakenings) = 0. But I do not get this evidence, due to the memory wipe. So when I'm awakened the second time, I once again learn only that I'm awake and still have P(Heads|Awake) = 1/2.
What you are implicitly claiming, however, is that getting memory wiped, or even just the possibility of it, makes the Beauty more confident in one outcome over the other! Which is quite bizarre, as if knowing less gives you more knowledge. Moreover, you assume that a person who knows that their memory was or may be erased just has to act as if they do not know it.

Suppose a coin is tossed and you received some circumstantial evidence about its state. As a result you are currently at 2/3 in favor of Heads. And then I tell you: "What odds are you ready to bet on? By the way, I have erased from your memory some crucial evidence in favor of Tails". Do you really think that you are supposed to agree to 1:2 odds even though you now know that the state of the evidence you currently have may not be trustworthy?