Morendil comments on Conditioning on Observers - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (118)
At this point, it is just an assertion that it's not a probability. I have reasons for believing it's not one, or at least not the probability that people think it is. I've explained some of that reasoning.
I think it's reasonable to look at a large sample ratio of counts (or ratio of expected counts). The best way to do that, in my opinion, is with independent replications of awakenings (that reflect all possibilities at an awakening). I probably haven't worded this well, but consider the following two approaches. For simplicity, let's say we wanted to do this (I'm being vague here) 1000 times.
1. Replicate the entire experiment 1000 times. That is, there will be 1000 independent tosses of the coin. This will lead to between 1000 and 2000 awakenings, with an expected value of 1500 awakenings. But... whatever the total number of awakenings is, they are not independent. For example, on the first awakening it could be either heads or tails. On the second awakening, it could only be heads if it was heads on the first awakening. So, Beauty's options on awakening #2 are (possibly) different from her options on awakening #1. We do not have 2 replicates of the same situation. This approach will give you the correct ratio of counts in the long run (for example, we do expect the # of heads & Monday to equal the # of tails & Monday and the # of tails & Tuesday).
2. Replicate her awakening-state 1000 times. Because her epistemic state is always the same on an awakening, from her perspective it could be Monday or Tuesday, and it could be heads or tails. She knows that it was a fair coin. She knows that if she's awake it's definitely Monday if heads, and could be either Monday or Tuesday if tails. She knows that 50% of coin tosses would end up heads, so we assign 0.5 to Monday&heads. She knows that 50% of coin tosses would end up tails, so we assign 0.5 to tails, which implies 0.25 to tails&Monday and 0.25 to tails&Tuesday. If we generate observations from this 1000 times, we'll get 1000 awakenings. We'll end up with heads 50% of the time.
The distinction between 1 and 2 is that, in 2, we are trying to repeatedly sample from the joint probability distributions that she should have on an awakening. In 1, we are replicating the entire experiment, with the double counting on tails.
In 1, people are using these ratios of expected counts to get the 1/3 answer. 1/3 is the correct answer to the question about the long-run frequency of awakenings preceded by heads relative to all awakenings. But I do not think it is the answer to the question about her credence of heads on an awakening.
In 2, the joint probabilities are determined ahead of time based on what we know about the experiment.
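The two approaches can be compared with a small simulation. This is only a sketch: the function names are made up for illustration, and approach 2 simply takes the 0.5 / 0.25 / 0.25 joint distribution described above as given rather than deriving it.

```python
import random

STATES = [("heads", "Monday"), ("tails", "Monday"), ("tails", "Tuesday")]

def replicate_experiment(n_tosses, rng):
    """Approach 1: rerun the whole experiment; each tails toss yields two awakenings."""
    counts = {s: 0 for s in STATES}
    for _ in range(n_tosses):
        if rng.random() < 0.5:
            counts[("heads", "Monday")] += 1
        else:
            counts[("tails", "Monday")] += 1
            counts[("tails", "Tuesday")] += 1
    return counts

def replicate_awakening(n_awakenings, rng):
    """Approach 2: draw each awakening independently from the assumed joint
    distribution 0.5 / 0.25 / 0.25 over the three awake states."""
    counts = {s: 0 for s in STATES}
    for s in rng.choices(STATES, weights=[0.5, 0.25, 0.25], k=n_awakenings):
        counts[s] += 1
    return counts

def heads_fraction(counts):
    """Fraction of counted awakenings that were preceded by heads."""
    return counts[("heads", "Monday")] / sum(counts.values())

rng = random.Random(0)
c1 = replicate_experiment(100_000, rng)
c2 = replicate_awakening(100_000, rng)
print(heads_fraction(c1))  # close to 1/3
print(heads_fraction(c2))  # close to 1/2
```

Note that in approach 1 the tails&Monday and tails&Tuesday counts are equal by construction, which is the dependence between awakenings described above. The simulation does not settle which number deserves to be called her credence; it only shows that the two sampling schemes answer different questions.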
Let n2 and n3 be the counts, in repeated trials, of tails&Monday and tails&Tuesday, respectively. You will of course see that n2=n3. They are the same random variable. tails&Monday and tails&Tuesday are the same. It's like what Jack said about types and tokens. It's like Vladimir_Nesov said:
You said:
I don't think it matters if she has the knowledge before the experiment or not. What matters is if she has new information about the likelihood of heads to update on. If she did, we would expect her accuracy to improve. So, for example, if she starts out believing that heads has probability 1/2, but learns something about the coin toss, her probability might go up a little if heads and down a little if tails. Suppose, for example, she is informed of a variable X. If P(heads|X)=P(tails|X), then why is she updating at all? Meaning, why is P(heads)≠P(heads|X)? This would be unusual. It seems to me that the only reason she changes is because she knows she'd be essentially 'betting' twice on tails, but that really is distinct from credence for tails.
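The point about updating can be illustrated with Bayes' rule for two hypotheses. This is a minimal sketch; `posterior_heads` and the likelihood values are hypothetical and chosen only for illustration.

```python
from fractions import Fraction

def posterior_heads(prior_heads, lik_x_given_heads, lik_x_given_tails):
    """Bayes' rule for P(heads | X) when the only alternatives are heads and tails."""
    numerator = prior_heads * lik_x_given_heads
    denominator = numerator + (1 - prior_heads) * lik_x_given_tails
    return numerator / denominator

half = Fraction(1, 2)

# X is equally likely under heads and tails: the posterior equals the prior.
no_update = posterior_heads(half, Fraction(3, 10), Fraction(3, 10))

# X is more likely under heads (illustrative numbers): the posterior moves up.
update = posterior_heads(half, Fraction(3, 5), Fraction(2, 5))

print(no_update, update)
```

Under this reading, a move away from 1/2 requires that X be more likely under one outcome of the coin than under the other. Whether being awakened counts as such an X is exactly what is in dispute here.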
Consider the case of Sleeping Beauty with an absent-minded experimenter.
If the coin comes up Heads, there is a tiny but non-zero chance that the experimenter mixes up Monday and Tuesday.
If the coin comes up Tails, there is a tiny but non-zero chance that the experimenter mixes up Tails and Heads.
The resulting scenario is represented in a new sheet, Fuzzy two-day, of my spreadsheet document.
Under these assumptions, Beauty may no longer rule out Tuesday & Heads. She has no justification to assign all of the Heads probability mass to Monday & Heads. She is therefore constrained to conditioning on being woken in the way that the usual two-day variant suggests she should, and ends up with a credence arbitrarily close to 1/3 if we make the "absent-minded" probability tiny enough.
Why should we get a discontinuous jump to 1/2 as this becomes zero?
This sounds like the continuity argument, but I'm not quite clear on how the embedding is supposed to work, can you clarify? Instead of telling me what the experimenter rightly or wrongly believes to be the case, spell out for me how he behaves.
What does this mean operationally? Is there a nonzero chance, let's call it epsilon or e, that the experimenter will incorrectly behave as if it's Tuesday when it's Monday? I.e., with probability e, Beauty is not awoken on Monday, the experiment ends, or is awoken and sent home, and we go on to next Sunday evening without any awakenings that week? Then Heads&Tuesday still with certainty does not occur. So maybe you meant that on Monday, he doesn't awaken Beauty at all, but awakens her on Tuesday instead? Is this confusion persistent across days, or is it a random confusion that happens each time he needs to examine the state of the coin to know what he should do?
And on Tuesday
So when the coin comes up Tails, there is a nonzero probability, let's call it delta or d, that the experimenter will incorrectly behave as if it's Heads? I.e., on Tuesday morning, he will not awaken Beauty or will wake her and send her home until next Sunday? Then Tails&Tuesday is a possible nonoccurrence.
On reflection, my verbal description doesn't match the reply I wanted to give, which was: the experimenter behaves such that the probability mass is allocated as in the spreadsheet.
Make it "on any day when Beauty is scheduled to remain asleep, the experimenter has some probability of mistakenly waking her, and vice-versa".
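One way to make this operational is sketched below. The spreadsheet itself is not reproduced here, so the table is an assumption: every (coin, day) cell gets prior mass 1/4, and on any given day the experimenter flips the scheduled wake/sleep action with probability `eps`. The function names are hypothetical.

```python
from fractions import Fraction

# Scheduled behaviour in the usual two-day variant.
SCHEDULED_AWAKE = {
    ("heads", "Monday"): True,
    ("heads", "Tuesday"): False,
    ("tails", "Monday"): True,
    ("tails", "Tuesday"): True,
}

def fuzzy_table(eps):
    """Joint P(coin, day, woken) under the assumed model: uniform prior of 1/4
    per (coin, day) cell, scheduled action flipped with probability eps."""
    table = {}
    for cell, awake in SCHEDULED_AWAKE.items():
        p_woken = 1 - eps if awake else eps
        table[cell + (True,)] = Fraction(1, 4) * p_woken
        table[cell + (False,)] = Fraction(1, 4) * (1 - p_woken)
    return table

def credence_heads_given_woken(eps):
    """P(heads | woken), obtained by conditioning the joint table on woken=True."""
    table = fuzzy_table(eps)
    woken = {k: v for k, v in table.items() if k[2]}
    heads_mass = sum(v for k, v in woken.items() if k[0] == "heads")
    return heads_mass / sum(woken.values())

for eps in (Fraction(1, 10), Fraction(1, 100), Fraction(1, 1000), Fraction(0)):
    print(eps, credence_heads_given_woken(eps))
```

In this particular model the result works out to 1/(3 - 2·eps), which approaches 1/3 continuously as `eps` goes to zero. A different allocation of the error probabilities would give different intermediate values, so this should be read as one possible reading of the fuzzy variant, not as the spreadsheet's exact contents.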
This is interesting. We shouldn't get a discontinuous jump.
Consider 2 related situations:
1. If heads, she is woken up on Monday, and the experiment ends on Tuesday. If tails, she is woken up on Monday and Tuesday, and the experiment ends on Wednesday. In this case, there is no 'not awake' option.
2. If heads, she is woken up on Monday and Tuesday. On Monday she is asked her credence for heads. On Tuesday she is told "it's Tuesday and heads" (but she is not asked about her credence; that is, she is not interviewed). If tails, it's the usual: she is woken up both days and asked about her credence. The experiment ends on Wednesday.
In both of these scenarios, 50% of coin flips will end up heads. In both cases, if she's interviewed she knows it's either Monday&heads, Monday&tails or Tuesday&tails. She has no way of telling these three options apart, due to the amnesia.
I don't think we should be getting different answers in these 2 situations. Yet, I think if we use your probability distributions we do.
I think there are two basic problems. One is that Monday&tails is really not different from Tuesday&tails. They are the same variable. It's the same experience. If she could time travel and repeat the Monday waking, it would feel the same to her as the Tuesday waking. The other issue is that, in my scenario 2 above, when she is woken but before she knows if she will be interviewed, it would look like there is a 25% chance it's heads&Monday and a 25% chance it's heads&Tuesday. And that's probably a reasonable way to look at it. But that doesn't imply that, once she finds out it's an interview day, the probability of heads&Monday shifts to 1/3. That's because on 50% of coin flips she will experience heads&Monday. That's what makes this different from a usual joint probability table representing independent events.
My reasoning has been to consider scenario 1 from the perspective of an outside observer, who is uncertain about each variable: a) whether it is Monday or Tuesday, b) how the coin came up, c) what happened to Beauty on that day.
To that observer, "Tuesday and heads" is definitely a possibility, and it doesn't really matter how we label the third variable: "woken", "interviewed", whatever. If the experiment has ended, then that's a day where she hasn't been interviewed.
If the outside observer learns that Beauty hasn't been interviewed today, then they may conclude that it's Tuesday and that the coin came up heads, thus a) they have something to update on and b) that observer must assign probability mass to "Tuesday & Heads & not interviewed".
If the outside observer learns that Beauty has been interviewed, it seems to me that they would infer that it's more likely, given their prior state of knowledge, that the coin came up tails.
To the outside observer, scenario 2 isn't really distinct from scenario 1. The difference only makes a difference to Beauty herself.
However, I see no reason to treat Beauty herself differently than an outside observer, including the possibility of updating on being interviewed or on not being interviewed.
So, if my probability tables are correct for an outside observer, I'm pretty sure they're correct for Beauty.
(My confidence in the tables themselves, however, has been eroded a little by my not being able to calculate Beauty, or an observer, updating on a new piece of information in the "fuzzy" variant, e.g. using P(heads|woken) as a prior probability and updating on learning that it is in fact Tuesday. It seems to me that for the math to check out requires that this operation should recover the "absent-minded experimenter" probability for "Tuesday & heads & woken". But I'm having a busy week so far and haven't had much time to think about it.)
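A hedged sketch of how such a check might look is given below. It does not use the actual spreadsheet values: it assumes a uniform prior of 1/4 over each (coin, day) cell and an experimenter who flips the scheduled wake/sleep action with probability `eps`. The function names are hypothetical.

```python
from fractions import Fraction

def woken_cells(eps):
    """P(coin, day, woken=True) under the assumed model: prior 1/4 per cell,
    scheduled action flipped with probability eps."""
    q = Fraction(1, 4)
    return {
        ("heads", "Monday"): q * (1 - eps),
        ("heads", "Tuesday"): q * eps,
        ("tails", "Monday"): q * (1 - eps),
        ("tails", "Tuesday"): q * (1 - eps),
    }

def direct(eps):
    """P(heads | woken, Tuesday), read straight off the table."""
    c = woken_cells(eps)
    return c[("heads", "Tuesday")] / (c[("heads", "Tuesday")] + c[("tails", "Tuesday")])

def sequential(eps):
    """Same quantity, using P(heads | woken) as the prior and updating on 'Tuesday'."""
    c = woken_cells(eps)
    heads_mass = c[("heads", "Monday")] + c[("heads", "Tuesday")]
    tails_mass = c[("tails", "Monday")] + c[("tails", "Tuesday")]
    prior_heads = heads_mass / (heads_mass + tails_mass)   # P(heads | woken)
    lik_heads = c[("heads", "Tuesday")] / heads_mass        # P(Tuesday | heads, woken)
    lik_tails = c[("tails", "Tuesday")] / tails_mass        # P(Tuesday | tails, woken)
    numerator = prior_heads * lik_heads
    return numerator / (numerator + (1 - prior_heads) * lik_tails)

eps = Fraction(1, 100)
print(direct(eps), sequential(eps))
```

Under these assumptions the two routes agree, and both give P(heads | woken, Tuesday) = eps, so at least this assumed table is internally consistent under sequential updating. Whether the same holds for the spreadsheet's allocation would need to be checked against its actual numbers.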