Let's say you're considering an activity with a risk of death of one in a million. If you do it twice, is your risk two in a million?
Technically, it's just under:
1 - (1 - 1/1,000,000)^2 = ~2/1,000,001
This is quite close! Approximating 1 - (1-p)^2 as p*2 was only off by 0.00005%.
On the other hand, say you roll a die twice looking for a 1:
1 - (1 - 1/6)^2 = ~31%
The approximation would have given:
1/6 * 2 = ~33%
Which is off by 8%. And if we flip a coin twice looking for a tails:
1/2 * 2 = 100%
Which is clearly wrong, since you could get heads twice in a row.
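To make the three cases easy to compare side by side, here is a minimal sketch in Python. The function names `at_least_once` and `shortcut` are made up for this illustration. It measures "off by" as the share of the shortcut's estimate that is excess, which is an assumption about how the percentages above were computed, though it does reproduce them.

```python
from fractions import Fraction

def at_least_once(p, tries=2):
    """Exact chance of at least one occurrence in `tries` independent tries."""
    return 1 - (1 - p) ** tries

def shortcut(p, tries=2):
    """The naive estimate: multiply the probability by the number of tries."""
    return p * tries

cases = [
    ("one in a million", Fraction(1, 1_000_000)),
    ("rolling a 1 on a die", Fraction(1, 6)),
    ("tails on a coin", Fraction(1, 2)),
]

for label, p in cases:
    exact = at_least_once(p)
    approx = shortcut(p)
    # Share of the estimate that is excess (assumed definition of "off by").
    overshoot = (approx - exact) / approx
    print(f"{label}: exact {float(exact):.6%}, shortcut {float(approx):.6%}, "
          f"off by {float(overshoot):.5%}")
```

Using `Fraction` keeps the arithmetic exact, which matters for the one-in-a-million case where the difference is tiny.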
It seems like this shortcut is better for small probabilities; why?
If something has probability p, then the chance of it happening at least once in two independent tries is:
1 - (1-p)^2 = 1 - (1 - 2p + p^2) = 1 - 1 + 2p - p^2 = 2p - p^2
If p is very small, then p^2 is negligible, and 2p is only a very slight overestimate. As p gets larger, however, skipping the p^2 term becomes more of a problem.
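As a quick check on the algebra, the sketch below confirms that the amount by which 2p overshoots is exactly p^2, and shows how that gap grows relative to the estimate as p gets larger. The helper name `overestimate` and the sample values of p are chosen only for illustration.

```python
from fractions import Fraction

def overestimate(p):
    """How much 2p exceeds the exact two-try probability 1 - (1-p)^2."""
    exact = 1 - (1 - p) ** 2
    return 2 * p - exact

for denom in (1_000_000, 10_000, 100, 6, 2):
    p = Fraction(1, denom)
    gap = overestimate(p)
    assert gap == p ** 2       # the dropped term is exactly p^2
    share = gap / (2 * p)      # as a share of the 2p estimate, this is p/2
    print(f"p = 1/{denom}: 2p is too high by {float(gap):.3g} "
          f"({float(share):.5%} of the estimate)")
```

Since the gap is p^2 and the estimate is 2p, the share works out to p/2: halving p halves the relative error.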
This is the calculation that people do when adding micromorts: you can't die from the same thing multiple times, but your chance of death stays low enough that the inaccuracy of naively combining these probabilities is much smaller than the margin of error on our estimates.
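Here is a rough sketch of how small that inaccuracy is when several risks are combined. The micromort values below are invented purely for illustration and are not real estimates; the function names are likewise hypothetical. It assumes the risks are independent.

```python
import math
from fractions import Fraction

# Hypothetical per-activity risks in micromorts (one micromort is a
# one-in-a-million chance of death). Made-up numbers, not real estimates.
risks = [Fraction(1), Fraction(5), Fraction(8), Fraction(1, 2), Fraction(10)]

def naive_total(micromorts):
    """Add the risks together, as people usually do."""
    return sum(micromorts)

def exact_total(micromorts):
    """Chance of dying from at least one activity, assuming independence."""
    survive_all = math.prod(1 - m / 1_000_000 for m in micromorts)
    return (1 - survive_all) * 1_000_000

naive = naive_total(risks)
exact = exact_total(risks)
print(f"naive sum:   {float(naive):.6f} micromorts")
print(f"exact total: {float(exact):.6f} micromorts")
print(f"difference:  {float(naive - exact):.6f} micromorts")
```

For risks of this size the two totals differ by far less than a thousandth of a micromort, which is well below the precision anyone claims for the underlying estimates.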
You got the wrong answer, but I do like the idea of comparing variances, and at least for this distribution, whichever has greater variance will have more weight on 0. But in this case, the variance of the 50% option is 0.5 and the variance of the 5% option is 0.95. And indeed the 5% option is preferable. (Binomial(n,p) has variance np(1−p); if the means np are the same, then whichever has lower p will have higher variance.)
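The numbers in this comment can be checked with a short sketch. The trial counts are not stated in the comment; n = 2 for the 50% option and n = 20 for the 5% option are assumptions inferred from the quoted variances (both give a mean of 1). The function names are hypothetical.

```python
from fractions import Fraction

def binomial_variance(n, p):
    """Variance of Binomial(n, p): n * p * (1 - p)."""
    return n * p * (1 - p)

def prob_zero(n, p):
    """Probability of zero successes in n independent tries."""
    return (1 - p) ** n

# Assumed trial counts, inferred from the variances quoted in the comment.
options = {
    "50% option": (2, Fraction(1, 2)),
    "5% option": (20, Fraction(1, 20)),
}

for name, (n, p) in options.items():
    print(f"{name}: mean {float(n * p):.2f}, "
          f"variance {float(binomial_variance(n, p)):.2f}, "
          f"P(0) {float(prob_zero(n, p)):.4f}")
```

Under those assumptions the option with the lower p has both the higher variance and the larger probability of zero, matching the comment's claim.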