There is one thing I don't understand about probabilities:

If we toss a coin, there is a 50% chance that it shows heads and a 50% chance that it shows tails. If we toss it 20 times and every toss shows heads, there is still a 50% chance that the next one shows heads, since the tosses are independent. However, we also know that a series of X tosses all showing heads becomes increasingly improbable as X grows. So, although there is a 50% chance that the next toss shows heads again, at the same time the probability that it shows heads again seems to be lower.

Why do we have to take into account one piece of information and not the other when finding the probability that the next toss will show heads or tails? Are there 2 (or more) types of probabilities and I am just mixing them up (I'm thinking of things like the reported "probabilities" that polls show about one party or another winning an election, for example)? Is the difference related to ergodicity (time vs ensemble averages)?


4 Answers

Chloe Thompson

73

Ah yes this was confusing to me for a while too, glad to be able to help someone else out with it!

The key thing for me to realise was that the probability of 21 heads in a row changes as you toss each of those 21 coins.

The sequence of 21 heads in a row does indeed have much less than 0.5 chance: to be precise, 0.5^21, which is 0.000000476837158. But it only has such a tiny probability before any of those 21 coins have been tossed. As soon as the first coin is tossed, the probability of those 21 coins all being heads changes. If the first coin is tails, the probability of all 21 coins being heads goes down to 0; if the first coin is heads, it goes up to 0.5^20. Say that by unlikely luck you keep tossing heads. Then with each additional heads in a row, the probability of all 21 coins being heads goes steadily up, until by the time you've tossed 20 heads in a row, the probability of all 21 being heads is now... 0.5, i.e. the same as the probability of a single coin toss being heads! And our apparent contradiction is gone :)

The more 'mathematical' way to express this would be: the unconditional probability of tossing 21 heads in a row is 0.5^21, i.e. 0.000000476837158, but the probability of tossing 21 heads in a row conditional on having already tossed 20 heads in a row is 0.5.
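If it helps to see the numbers move, here is a minimal Python sketch of how the probability of "all 21 heads" climbs as heads come in. It assumes a fair coin and independent tosses, and the function name `prob_all_heads` is just made up for illustration.

```python
from fractions import Fraction

def prob_all_heads(total, heads_so_far):
    """Probability that all `total` tosses are heads, given that the first
    `heads_so_far` tosses were heads (assumes a fair, independent coin)."""
    remaining = total - heads_so_far
    return Fraction(1, 2) ** remaining

# Watch the probability of "all 21 heads" rise as more heads are observed.
for k in (0, 1, 10, 20):
    p = prob_all_heads(21, k)
    print(f"after {k:2d} heads: P(all 21 heads) = {p} (about {float(p):.15f})")
```

With no tosses seen the value is 0.5^21, and with 20 heads already seen it is exactly 1/2, which is the single-toss probability.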



Let me know if any of that is still confusing.

 

I think you explain it very well!

 

So the thing is something like the following, right?: "Looking at it from the outside, a world where 21 heads showed in a row is incredibly unlikely: (if the coin is fair) I would happily bet against this world happening. However, I am already in an incredibly weird world where 20 heads have shown in a row, and another heads only makes it a bit more weird, so I don't know what to bet, heads or tails."

3noggin-scratcher
Yes, essentially. While 21 heads in a row is very unlikely (when you consider it ahead of flipping any coins), by the time you get to 20 heads in a row most of the unlikely-ness of it has already happened, with the odds of one more head remaining the same as ever.
2Chloe Thompson
Yep that's it! Glad my explanation helped. (Though if we want to be a bit pedantic about it, we'd say that a world where 21 heads in a row ever happens is actually not unlikely, if heaps and heaps of coin tosses happen across the world over time, as in our world. What is very unlikely is that any particular given sequence of 21 coin flips is all heads, judged before any of them have been flipped.)
2rur
The advice is: do not bet. Suppose you download a gambling app that bets on games where the outcome is similar to a coin flip. You start receiving emails from someone associated with the app (so they bypass your spam filters). Each day for 20 days you receive an email predicting the outcome of the game. Each of the 20 predictions is correct. What do you do? Nothing. What you are unaware of (but should suspect) is that for the first email, the sender sent out 8 million emails making a prediction (it is a popular gambling app). 4 million of those predicted that the home team wins and the other 4 million predicted that the visiting team wins. The next day the emails only go out to those who received the correct prediction. Rinse. Repeat. And you happen to be an (un)lucky recipient of the 21st email distribution. The world you live in is no weirder than the world a Powerball Lottery winner lives in.
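To get a feel for how many people end up in that position, here is a rough sketch of the arithmetic. It assumes the sender splits the predictions evenly each day, so exactly half of the recipients (rounded down) see a correct prediction; the function name is hypothetical.

```python
def recipients_with_perfect_record(initial, days):
    """Number of recipients who have seen only correct predictions after
    `days` rounds, assuming half of each mailing gets the right call."""
    remaining = initial
    for _ in range(days):
        remaining //= 2  # only the half with the correct prediction stays on the list
    return remaining

# 8 million first-day emails and 20 days of predictions, as in the story above.
survivors = recipients_with_perfect_record(8_000_000, 20)
print(survivors)
```

Under these assumptions a handful of people are left holding 20 correct predictions in a row, without anyone having predicted anything.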
1mikbp
That's a nice example. I heard about it long ago with investments instead of games. It is really something important to keep in mind!

Viliam

50

A sequence of 100 heads is only half as likely as a sequence of 99 heads, which is why the probability of the 100th coin flip being heads is exactly one half.
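The same ratio shows up empirically. The sketch below uses much shorter runs (5 and 6 heads instead of 99 and 100, so that runs actually occur in a simulation) and assumes a fair coin simulated with Python's `random` module; the function name, run length and trial count are arbitrary choices for illustration.

```python
import random

def count_runs(run_length, trials, seed=0):
    """Simulate `trials` sequences of run_length + 1 fair tosses and count
    how many start with `run_length` heads, and how many of those also
    have heads on the next toss."""
    rng = random.Random(seed)
    runs = 0            # sequences starting with `run_length` heads
    runs_plus_one = 0   # ... whose following toss is heads as well
    for _ in range(trials):
        tosses = [rng.random() < 0.5 for _ in range(run_length + 1)]
        if all(tosses[:run_length]):
            runs += 1
            if tosses[run_length]:
                runs_plus_one += 1
    return runs, runs_plus_one

runs, runs_plus_one = count_runs(run_length=5, trials=200_000)
print(runs, runs_plus_one, runs_plus_one / runs)
```

The count of 6-head runs should come out at roughly half the count of 5-head runs, so the fraction printed last should sit close to 0.5.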

Dagon

40

One way to think of this: Uncertainty (at least on this level) is in the observer, not the coin. It comes up heads or comes up tails, with 100% chance of the thing that actually happens.

Before the flip, you assign 50% to each outcome, but that’s your uncertainty, not the coin’s, and the result may as well be secretly predetermined by the universe. After you’ve seen 20 heads, that part is now probability 100% (it’s knowledge, not uncertainty, on your part), and the next flip is still 50/50 (to your knowledge, presuming you have reason to trust the coin and not update toward an unfair flipper).

JBlack

30

Are there 2 (or more) types of probabilities and I am just mixing them up

Yes, there are conditional probabilities and unconditional probabilities.

The unconditional probability of 21 heads in a row is 0.5^21[1].
The conditional probability of 21 heads in a row given that the first 20 were all heads is 0.5.

Conditional probability is just a division: the conditional probability of some event A given that B happened is just the unconditional probability of both A and B divided by the probability of B. In symbols: P(A | B) = P(A & B) / P(B).
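As a small check of that division, here is a sketch that enumerates every outcome of 3 fair coin flips (3 instead of 21 only to keep the list short) and applies P(A | B) = P(A & B) / P(B) directly. The helper names are made up for this example.

```python
from fractions import Fraction
from itertools import product

# All 8 equally likely sequences of 3 flips, e.g. ('H', 'T', 'H').
outcomes = list(product("HT", repeat=3))

def prob(event):
    """Unconditional probability of `event` under equally likely outcomes."""
    return Fraction(sum(1 for o in outcomes if event(o)), len(outcomes))

def all_heads(o):
    return all(t == "H" for t in o)

def first_two_heads(o):
    return o[0] == "H" and o[1] == "H"

p_b = prob(first_two_heads)                                      # P(B)
p_a_and_b = prob(lambda o: all_heads(o) and first_two_heads(o))  # P(A & B)
p_a_given_b = p_a_and_b / p_b                                    # P(A | B)
print(p_b, p_a_and_b, p_a_given_b)
```

Here A is "all 3 flips are heads" and B is "the first 2 flips are heads", the small-scale analogue of 21 heads given 20 heads.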

Bayes' Law comes from simple algebra on this.
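For completeness, the algebra can be sketched as follows, using the same A and B as in the definition above and assuming P(A) and P(B) are both nonzero:

```latex
\begin{align*}
P(A \mid B)\,P(B) &= P(A \,\&\, B) = P(B \mid A)\,P(A) \\
\Rightarrow\quad P(A \mid B) &= \frac{P(B \mid A)\,P(A)}{P(B)}
\end{align*}
```

The first line is the definition of conditional probability written twice, once conditioning on B and once on A; dividing by P(B) gives the usual form.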

[1] As is common, this assumes that the coin flips are independent of one another. An alternative might be that the coin was flipped "lazily" such that it more often shows the same face as the previous flip, but over the long run still flips 50% heads. A "properly" flipped coin should not depend upon the results of any or all previous flips.
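A "lazy" coin of this kind can be sketched as a simple simulation. Everything specific here is an assumption for illustration: the coin repeats its previous face with probability 0.7 (an arbitrary choice), the first flip is fair, and the function name is hypothetical.

```python
import random

def lazy_flips(n, stay_prob=0.7, seed=0):
    """Simulate n flips where each flip repeats the previous face with
    probability `stay_prob` (0.5 would be an ordinary independent coin)."""
    rng = random.Random(seed)
    flips = [rng.random() < 0.5]  # first flip is fair; True means heads
    for _ in range(n - 1):
        same = rng.random() < stay_prob
        flips.append(flips[-1] if same else not flips[-1])
    return flips

flips = lazy_flips(200_000)
overall = sum(flips) / len(flips)  # long-run share of heads

# Share of heads among flips that immediately follow a heads.
after_heads = [b for a, b in zip(flips, flips[1:]) if a]
heads_after_heads = sum(after_heads) / len(after_heads)
print(overall, heads_after_heads)
```

In this sketch the long-run share of heads stays near 50%, but the chance of heads right after a heads is near 0.7, so for such a coin a run of heads would be informative about the next flip.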