gjm comments on Open thread, Nov. 23 - Nov. 29, 2015 - Less Wrong Discussion
I think this is not quite right, and it's not-quite-right in an important way. It really isn't true in any sense that "it's more likely that you'll alternate between heads and tails". This is a Simpson's-paradox-y thing where "the average of the averages doesn't equal the average".
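Suppose you flip a coin four times, and you do this 16 times, and happen to get each possible outcome once: TTTT TTTH TTHT TTHH THTT THTH THHT THHH HTTT HTTH HTHT HTHH HHTT HHTH HHHT HHHH. Pooled over all 16 sequences, 24 flips come immediately after a head, and exactly 12 of those are heads. But if you instead compute the proportion of heads among flips-after-heads separately for each sequence (it is undefined for TTTT and TTTH, which have no head in the first three flips) and then average those 14 proportions, you get 17/42, or about 0.405.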
What's going on here isn't any kind of tendency for heads and tails to alternate. It's that an individual head or tail "counts for more" when the denominator is smaller, i.e., when there are fewer heads in the sample.
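For anyone who wants to check the arithmetic, here is a minimal Python sketch that enumerates the 16 sequences and compares the pooled frequency of heads-after-heads with the average of the per-sequence frequencies. The helper name `heads_after_heads` is made up for this illustration; nothing here comes from the paper under discussion.

```python
from fractions import Fraction
from itertools import product


def heads_after_heads(seq):
    """Return (heads that follow a head, flips that follow a head) for one sequence."""
    followers = [b for a, b in zip(seq, seq[1:]) if a == "H"]
    return followers.count("H"), len(followers)


# All 16 length-4 sequences, each counted exactly once.
sequences = ["".join(s) for s in product("TH", repeat=4)]
counts = [heads_after_heads(s) for s in sequences]

# Pooled: add up the counts over all sequences, then divide once.
pooled = Fraction(sum(h for h, n in counts), sum(n for h, n in counts))

# Average of averages: one proportion per sequence, skipping the two
# sequences (TTTT, TTTH) that have no head among their first three flips.
per_sequence = [Fraction(h, n) for h, n in counts if n > 0]
average_of_averages = sum(per_sequence) / len(per_sequence)

print(pooled)               # 1/2
print(average_of_averages)  # 17/42
```

A sequence like HHHT contributes three flips-after-heads to the pooled count but only one proportion (2/3) to the average, while TTHT contributes a single flip and a full proportion of 0. That unequal weighting is the whole effect.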
My intuition is from the six points in Kahan's post. If the next flip is heads, then the flip after is more likely to be tails, relative to if the next flip is tails. If we have an equal number of heads and tails left, P(HT) > P(HH) for the next two flips. After the first heads, the probability for the next two might not give P(TH) > P(TT), but relative to independence it will be biased in that direction because the first T gets used up.
Is there a mistake? I haven't done any probability in a while.
No, that is not correct. Have a look at my list of 16 length-4 sequences. Exactly half of all flips-after-heads are heads, and the other half tails. Exactly half of all flips-after-tails are heads, and the other half tails.
The result of Miller and Sanjurjo is very specifically about "averages of averages". Here's a key quotation:
"The relative frequency [average #1] is expected [average #2] to be ...". M&S are not saying that in finite sequences of trials successes are actually rarer after streaks of success. They're saying that if you compute the frequency of success-after-streak separately for each of your finite sequences, then the average of those frequencies will be lower than the true success probability. These are not the same thing. If, e.g., you run a large number of those finite sequences and aggregate the counts of streaks and successes-after-streaks, the effect disappears.
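The same comparison can be sketched for streaks longer than one. The snippet below is an illustration under stated assumptions, not M&S's own procedure: it assumes a fair coin, and instead of simulating a large number of runs it enumerates every length-n sequence once, so each is weighted equally. The function name `streak_stats` and its parameters `n` (flips per sequence) and `k` (streak length) are hypothetical.

```python
from fractions import Fraction
from itertools import product


def streak_stats(n, k):
    """Exact frequency of heads right after k heads in a row, computed two ways
    over all 2**n equally likely sequences of n fair flips.

    Returns (pooled, averaged): pooled aggregates the counts across sequences
    before dividing; averaged takes the mean of per-sequence proportions,
    ignoring sequences that contain no flip preceded by k heads.
    """
    total_hits = 0
    total_chances = 0
    proportions = []
    for seq in product("TH", repeat=n):
        # Flips whose k immediately preceding flips were all heads.
        followers = [
            seq[i] for i in range(k, n)
            if all(f == "H" for f in seq[i - k:i])
        ]
        hits = followers.count("H")
        total_hits += hits
        total_chances += len(followers)
        if followers:
            proportions.append(Fraction(hits, len(followers)))
    pooled = Fraction(total_hits, total_chances)
    averaged = sum(proportions) / len(proportions)
    return pooled, averaged


pooled, averaged = streak_stats(4, 1)
print(pooled, averaged)  # 1/2 17/42

pooled, averaged = streak_stats(4, 2)
print(pooled, averaged)  # 1/2 5/12
```

In both cases the aggregated count gives exactly 1/2, and only the per-sequence average dips below it.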