That's crazy how close that is. (to the nearest half a percent) will be a fun fact that I remember now!
Conversely, if you don't see any success after 3n attempts you have a 95% confidence interval that 0 < p < 1/n (unless you have a strong prior)
https://en.wikipedia.org/wiki/Rule_of_three_(statistics)
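For anyone who wants to sanity-check the rule of three numerically, here is a minimal Python sketch. It assumes independent trials and a one-sided 95% bound; the function name `upper_bound_no_successes` is made up for illustration and is not from the linked article.

```python
def upper_bound_no_successes(trials: int, confidence: float = 0.95) -> float:
    """Largest p for which seeing zero successes in `trials` independent
    attempts still has probability at least 1 - confidence.

    Solves (1 - p) ** trials = 1 - confidence for p.
    """
    return 1.0 - (1.0 - confidence) ** (1.0 / trials)

n = 100  # hypothetical: we are asking whether p could be as large as 1/n
exact = upper_bound_no_successes(3 * n)
print(f"exact bound after {3 * n} failures: {exact:.5f}")
print(f"rule-of-three bound 3/(3n) = 1/n:  {1 / n:.5f}")
```

The exact bound should land slightly below 1/n, which is why the rule of three is a safe (slightly conservative) shortcut.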
In the infinite limit (or just large-ish x), the probability of at least one success, from nx attempts with 1/x odds on each attempt, will be 1 - ( 1 / e^n )
For x attempts, 1 - 1/e = 0.63212
For 2x attempts 1 - 1/e^2 = 0.86466
For 3x attempts 1 - 1/e^3 = 0.95021
And so on
Cool. Is this right? For something with a 1/n chance of success I can have a 95% chance of success by making 3n attempts, for large values of n. About what does "large" mean here?
95% is a lower bound. It's more than 95% for every n and approaches 95% as n gets bigger. If n=2 (e.g. a coin flip), then you actually have a 98.4% chance of at least one success after 3n (which is 6) attempts.
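A rough way to see what "large" means here is to tabulate a few values. This is only a sketch assuming independent attempts; `p_at_least_one` is an illustrative name.

```python
import math

def p_at_least_one(n: int, attempts: int) -> float:
    """Chance of at least one success in `attempts` independent tries,
    each with probability 1/n."""
    return 1.0 - (1.0 - 1.0 / n) ** attempts

limit = 1.0 - math.exp(-3)  # the large-n limit for 3n attempts
for n in (2, 5, 10, 100, 1000):
    print(n, round(p_at_least_one(n, 3 * n), 4), "limit:", round(limit, 4))
```

Every row should sit above the limit, with the gap shrinking as n grows.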
I mentioned this in the "What I'm not saying" section, but this limit converges rather quickly. I would consider any to be "close enough"
Ironically, the even more basic error of probabilistic thinking that people so—painfully—commonly make ("It either happens or doesn't, so it's 50/50") would get closer to the right answer.
Is that error common? I can only recall encountering one instance of it with surety, and I only know about that particular example because it was signal-boosted by people who were mocking it.
I know someone who taught math to low-ability kids, and reported finding it difficult to persuade them otherwise. I assume some number of them carried on into adulthood still doing it.
I feel that even an underachieving student can understand that the probability of winning the lottery is not 50/50. I can't imagine that many of those kids carried that fallacious thinking into adulthood.
I suspect there has to be a degree of mental disconnect, where they can see that things don't all happen (or not happen) equally as often as each other, but answering the math question of "What's the probability?" feels like a more abstract and different thing.
Maybe mixed up with some reflexive learned helplessness of not really trying to do math because of past experience that's left them thinking they just can't get it.
Possibly over generalising from early textbook probability examples involving coins and dice, where counting up and dividing by the number of possible outcomes is a workable approach.
I agree with your point about there being a 'mental disconnect'. It seems to be less of an issue with understanding the concept of two events not being equally likely to occur, but rather an issue with applying mathematical reasoning to an abstract problem. If you can't find the answer to that problem, you are likely to use the seemingly plausible but incorrect reasoning that 'it either happens or doesn't, so it's 50/50.' This fallacy could be considered a misapplication of the principle of insufficient reason.
Nice.
Similar rule of thumb I find handy is 70 divided by growth rate to get doubling time implied by a growth rate. I find it way easier to think about doubling times than growth rates.
E.g. 3% interest rate means 70/3 ≈ 23 year doubling time.
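To see how good the rule of 70 is, one can compare it against the exact doubling time. A small sketch, assuming growth compounds once per period; the function names are illustrative.

```python
import math

def doubling_time_exact(rate_percent: float) -> float:
    """Periods needed to double at a compound growth rate given in percent."""
    return math.log(2) / math.log(1 + rate_percent / 100)

def doubling_time_rule_of_70(rate_percent: float) -> float:
    """The rule-of-thumb estimate: 70 divided by the growth rate in percent."""
    return 70 / rate_percent

for r in (1, 3, 7, 10):
    print(r, round(doubling_time_exact(r), 2), round(doubling_time_rule_of_70(r), 2))
```

The shortcut works because ln(2) is about 0.693 and ln(1 + r) is close to r for small r.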
Nice. I have a suggestion how to improve the article. Put a clearly stated theorem somewhere in the middle, in its own block, like in academic math articles.
In case anyone’s wondering, if there’s a 1/n chance of something happening each time (iid), and you try n times (for large n), then it will happen m times with probability 1/(e·m!). So 0,1,2,3… hits would be 36.8%, 36.8%, 18.4%, 6.1%, 1.5%, 0.3%, … Nice how it sums to one.
(For the general formula, i.e. where the probability is not necessarily 1/n, see: Poisson distribution.)
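The percentages above can be checked by comparing the exact binomial probabilities with the Poisson limit. A minimal sketch; n = 1000 is an arbitrary stand-in for "large n", and the function names are illustrative.

```python
import math

def binomial_pmf(m: int, n: int) -> float:
    """P(exactly m successes) in n independent tries with chance 1/n each."""
    p = 1.0 / n
    return math.comb(n, m) * p**m * (1 - p) ** (n - m)

def poisson1_pmf(m: int) -> float:
    """Large-n limit: Poisson distribution with mean 1, i.e. e^-1 / m!."""
    return math.exp(-1) / math.factorial(m)

n = 1000  # assumed "large" n for the comparison
for m in range(6):
    print(m, f"{binomial_pmf(m, n):.4f}", f"{poisson1_pmf(m):.4f}")
```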
Relatedly, if you perform an experiment n times, and the probability of success is p, and the expected number of total successes np is much smaller than one, then np is a reasonable estimate of the probability of getting at least one success, because the probability of getting more than one success can be neglected.
For example, if Bob plays the lottery for ten days, and each day has a 1:1,000,000 chance of winning, then overall he will have a chance of about 1:100,000 of winning once.
This is also why micromorts are roughly additive: if travelling by railway has a mortality of one micromort per 10Mm, then travelling for 50Mm will set you back 5 micromorts. Only if you leave what I would call the 'Newtonian regime of probability', e.g. by somehow managing to travel 10Tm with the railway, are you required to do proper probability math, because naive addition would tell you that you will surely have a fatal accident (1 mort) in that distance, which is clearly wrong.
So, travelling 10Tm with the railway you have a 63% chance of dying according to the math in the post
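Here is a small sketch of where naive addition works and where it breaks down. It assumes every trial (day of lottery, 10Mm railway segment) is independent with the same probability; the per-segment figure of one micromort is taken from the comment above, not verified.

```python
def naive_sum(p: float, n: int) -> float:
    """Additive shortcut: just add up the per-trial probabilities."""
    return n * p

def exact_at_least_one(p: float, n: int) -> float:
    """Exact chance of at least one success in n independent trials."""
    return 1.0 - (1.0 - p) ** n

# Lottery example: ten days at one-in-a-million each.
print(naive_sum(1e-6, 10), exact_at_least_one(1e-6, 10))

# Railway example: number of 10Mm segments, one micromort each.
for segments in (5, 100_000, 1_000_000):
    print(segments, naive_sum(1e-6, segments), exact_at_least_one(1e-6, segments))
```

The two columns agree closely while n times p stays far below one, and drift apart as it approaches one.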
The excellent book Algorithms to Live By has an entire chapter dedicated to this concept, using the secretary problem as an example: https://www.amazon.com/Algorithms-Live-Computer-Science-Decisions/dp/1627790365
It's worth noting the mathematically fairly simple and obvious but intuitively annoying fact that if you've tried the 10% chance 9 times with no success, you do not have a 63% chance of succeeding on your next attempt.
However it is true that doing something with a 10% success rate 10 times will net you an average of 1 success.
For the easier to work out case of doing something with a 50% success rate 2 times:
25% chance of 0 successes
50% chance of 1 success
25% chance of 2 successes
Gives an average of 1 success.
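Of course this only matters for the sort of thing where 2 successes is better than 1 success:
10% chance of finding a monogamous partner 10 times yields 0.63 monogamous partners in expectation.
10% chance of finding a polyamorous partner 10 times yields 1.00 polyamorous partners in expectation.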
EDIT: To clarify, a 10% chance of finding a monogamous partner 10 times yields 1.00 successful dates and 0.63 monogamous partners that you end up with, in expectation.
Why would the expectation of finding a polyamorous partner be higher in the case you gave? Same chance per try and same number of tries should equal same expectation.
If you're monogamous and go to multiple speed dating events and find two potential partners, you end up with one partner. If you're polyamorous and do the same, you end up with two partners.
One way to think of it is whether you will stop trying after the first success. Though that isn't always the distinguishing feature. For example, you might start 10 job interviews at the same time, even though you'll take at most one job.
No, I think you are mixing up the probability of at least one success in ten trials (with a 10% chance per trial), which is ~0.65=65%, with the expected value, which is n=1 in both cases. You have the same chance of finding 1 partner in each case and you do the same number of trials. There is a 65% chance that you have at least 1 success in the 10 trials for each type of partner. The expected outcome in BOTH cases is 1, as in n=1, not 1 as in 100%.
Probability of at least one success: ~65%
Probability of at least two successes: ~26%
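These tail probabilities can be reproduced with a short binomial sum. A sketch assuming ten independent trials at 10% each; `p_at_least` is an illustrative helper, not something from the thread.

```python
import math

def p_at_least(k: int, n: int, p: float) -> float:
    """P(at least k successes) in n independent trials with chance p each."""
    return sum(math.comb(n, m) * p**m * (1 - p) ** (n - m) for m in range(k, n + 1))

def expected_successes(n: int, p: float) -> float:
    """Mean number of successes in n independent trials."""
    return n * p

print(p_at_least(1, 10, 0.1), p_at_least(2, 10, 0.1), expected_successes(10, 0.1))
```

The first two numbers are probabilities of events; the third is a mean count, which is the distinction being argued over here.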
My point is that in some situations, "two successes" doesn't make sense. I picked the dating example because it's cute, but for something more clear cut imagine you're playing Russian Roulette with 10 rounds each with a 10% chance of death. There's no such thing as "two successes"; you stop playing once you're dead. The "are you dead yet" random variable is a boolean, not an integer.
Yes. But I think you have mixed up expected value and expected utility. Please show your calculations.
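Sure. For simplicity, say you play two rounds of Russian Roulette, each with a 60% chance of death, and you stop playing if you die. What's the expected value of YouAreDead at the end?
60% chance you die in round one: YouAreDead = 1.
40% * 60% = 24% chance you survive round one and die in round two: YouAreDead = 1.
40% * 40% = 16% chance you survive both rounds: YouAreDead = 0.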
So the expected value of the boolean YouAreDead random variable is 0.84.
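Now say you're monogamous and go on two dates, each with a 60% chance to go well, and if they both go well then you pick one person and say "sorry" to the other. Then:
16% chance neither date goes well: 0 dates went well, 0 partners.
48% chance exactly one date goes well: 1 date went well, 1 partner.
36% chance both dates go well: 2 dates went well, 1 partner.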
So the expected value of the HowManyPartnersDoYouHave random variable is 0.84, and the expected value of the HowManyDatesWentWell random variable is 0.48+2*0.36 = 1.2.
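Now say you're polyamorous and go on two dates with the same chance of success. Then:
16% chance neither date goes well: 0 dates went well, 0 partners.
48% chance exactly one date goes well: 1 date went well, 1 partner.
36% chance both dates go well: 2 dates went well, 2 partners.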
So the expected value of both the HowManyPartnersDoYouHave random variable and the HowManyDatesWentWell random variable is 1.2.
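The three scenarios above boil down to two different random variables over the same outcomes: the number of successes, and that number capped at one. A minimal sketch that enumerates every success/failure pattern, assuming independent trials; all names are illustrative.

```python
from itertools import product

def outcome_distribution(p: float, trials: int):
    """Yield (probability, number_of_successes) for every success/failure pattern."""
    for pattern in product((0, 1), repeat=trials):
        prob = 1.0
        for hit in pattern:
            prob *= p if hit else (1 - p)
        yield prob, sum(pattern)

def expected_count(p: float, trials: int) -> float:
    """E[number of successes]: dates that went well, or polyamorous partners."""
    return sum(prob * k for prob, k in outcome_distribution(p, trials))

def expected_capped(p: float, trials: int) -> float:
    """E[min(successes, 1)]: monogamous partners, or the YouAreDead boolean."""
    return sum(prob * min(k, 1) for prob, k in outcome_distribution(p, trials))

print(expected_capped(0.6, 2))  # close to 0.84
print(expected_count(0.6, 2))   # close to 1.2
```

Stopping after the first success (as in Russian Roulette) gives the same capped value, since later trials cannot change whether at least one success occurred.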
Note that I've only ever made statements about expected value, never about utility.
I think what Justin is saying is that finding a single monogamous partner is not significantly different from finding two, three, etc. For some things you only care about succeeding once. So a 63% chance of success (any number of times) means a .63 expected value (because all successes after the first have a value of 0).
Meanwhile for other things, such as polyamorous partners, 2 partners is meaningfully better than one, so the expected value truly is 1, because you will get one partner on average. (Though this assumes 2 partners is twice as good as one, we can complicate this even more if we assume that 2 partners is better, but not twice as good)
Sure:
For a monogamous partner, finding a successful partner has a value of 1
Finding 2 successful partners also has a value of 1, because in a monogamous relationship, you only need one partner.
The same holds for 3, 4, etc partners. All those outcomes also have a value of 1.
So first, let's find the probability of getting a value of 0. Then let's calculate the probability of getting a value of 1.
The probability of getting a value of 0 (not finding a partner):
(9/10)^10 ≈ 0.349
There is one other mutually exclusive alternative: Finding at least one partner (which has a value of 1)
1 - 0.349 = 0.651
So we have a 34.9% chance of getting a value of 0 and a 65.1% chance of getting a value of 1. The expected value is:
0.349 * 0 + 0.651 * 1 = 0.651
If you did this experiment a million times and assigned a value of 1 to "getting at least one monogamous partner" and a value of 0 to "getting no monogamous partners," you would get, on average, a reward of 0.651.
For the sake of brevity, I'll skip the calculations for a polygamous partner because we both agree on what the answer should be for that.
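The "do this experiment many times" framing can be tried directly with a small simulation. This is a sketch only: it assumes ten independent 10% trials per experiment, uses a fixed seed so runs are repeatable, and uses 50,000 experiments rather than a million to keep it quick.

```python
import random

def simulate(runs: int, trials: int = 10, p: float = 0.1, seed: int = 0):
    """Return (mean of min(successes, 1), mean of successes) over `runs` experiments."""
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    capped_total = 0
    count_total = 0
    for _ in range(runs):
        successes = sum(rng.random() < p for _ in range(trials))
        capped_total += min(successes, 1)  # reward 1 for "at least one partner"
        count_total += successes           # reward 1 per successful trial
    return capped_total / runs, count_total / runs

capped_mean, count_mean = simulate(runs=50_000)
print(capped_mean, count_mean)
```

The first average should hover near 0.651 and the second near 1, matching the two readings of "expected value" in this thread.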
I know I am a parrot here, but they are playing two different games. One wants to find one partner and then stop. The other wants to find as many partners as possible. You cannot compare utility across different goals. Yes, the poly person will have higher expected utility, but it is NOT comparable to the utility that the mono person derives.
The wording should have been:
10% chance of finding a monogamous partner 10 times yields 1 monogamous partners in expectation and 0.63 in expected utility.
Not:
10% chance of finding a monogamous partner 10 times yields 0.63 monogamous partners in expectation.
and:
10% chance of finding a polyamorous partner 10 times yields 1 polyamorous partner in expectation and 1 in expected utility.
instead of:
10% chance of finding a polyamorous partner 10 times yields 1.00 polyamorous partners in expectation.
So there was a mix-up between expected number of successes and expected utility.
My guesses at what the spoiler was going to be:
Ten non-independent trials, a 10% chance each (in the prior state of knowledge, not conditional on previous results), and only one trial can succeed. You satisfy these conditions with something like "I hid a ball in one of ten boxes", and the chance really is 100% that one is a "success".
Regardless of whether the trials are independent, the maximum probability that at least one is a success is the sum of the probabilities per trial. In this case that doesn't yield a useful bound because we already know probabilities are below 100%, but in general it's useful.
Yeah, it's cool that "I did n trials, with a 1/n chance each, so the probability of at least one success is... " does have a general answer, even if it's not 100%. Just noting that it's not the only small modification of the title yielding a useful and interesting correct statement.
The ones that came to my mind still involved the sum of the per-trial probabilities. If it was clear that we were looking for something preserving the "n trials with 1/n chance", rather than the summation, I think it would have been more obvious where you're going with this.
Furthermore, the tries must be independent of each other, otherwise the reasoning breaks down completely. If I draw cards from a deck, each one has (a priori) 1/52 chance of being the ace of spades, yet if I draw all 52 I will draw the ace of spades 100% of the time. This is because successive failures increase the posterior probability of drawing a success.
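The card example can be put side by side with the independent case. A short sketch, assuming a standard 52-card deck with one target card; the function names are illustrative.

```python
def p_ace_without_replacement(draws: int, deck: int = 52) -> float:
    """Drawing distinct cards: the target is equally likely to sit in any position."""
    return draws / deck

def p_ace_with_replacement(draws: int, deck: int = 52) -> float:
    """Independent draws (card returned and deck reshuffled each time)."""
    return 1.0 - (1.0 - 1.0 / deck) ** draws

for draws in (1, 13, 26, 52):
    print(draws, p_ace_without_replacement(draws), round(p_ace_with_replacement(draws), 4))
```

Without replacement the probability equals the sum of the per-draw probabilities, which is also the upper bound that holds whether or not the trials are independent; with replacement it falls below that sum and ends near 63%.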
Nitpick: “odds of 63%” sounds to me like it means “odds of 63:100” i.e. “probability of around 39%”. Took me a while to realise this wasn’t what you meant.
Ah, shoot. You're right. Probably not good to use "odds" and "probability" interchangeably for percentages like I did. Should be fixed now.
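For reference, converting between the two conventions is a one-liner each way. A tiny sketch with illustrative function names:

```python
def odds_to_probability(in_favour: float, against: float) -> float:
    """Odds of in_favour:against expressed as a probability."""
    return in_favour / (in_favour + against)

def probability_to_odds(p: float) -> float:
    """Odds in favour as a single ratio, p : (1 - p)."""
    return p / (1.0 - p)

print(odds_to_probability(63, 100))  # the "odds of 63:100" reading
print(probability_to_odds(0.63))     # a 63% probability written as odds
```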
Years ago when I was hanging out with day traders there was a heuristic they all seemed to hold. If their trading model was producing winning trades two out of three times they thought the model was good and could be used. No one ever suggested why that particular rate was the shared meme/norm -- why not 4 out of 5 or 3 out of 5. I wonder if empirically (or just intuitively over time) they simply approximated the results in this post.
Or maybe just a coincidence, but generally when money is at stake I think the common practices will tend to reflect some fundamental fact of the environment.
I thought to myself... "That looks familiar..."
It is also very similar to the formula for calculating the compound interest rate.
Just swap the minus with a plus and the function tends to e: after all, compounding interest rates was how the constant got known in the first place.
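The plus/minus symmetry is easy to see numerically. A minimal sketch; `compound` is an illustrative name.

```python
import math

def compound(n: int, sign: int = +1) -> float:
    """(1 + sign/n) ** n: sign=+1 is compound growth, sign=-1 is the post's formula."""
    return (1.0 + sign / n) ** n

for n in (1, 10, 100, 10_000):
    print(n, round(compound(n, +1), 5), round(compound(n, -1), 5))
print("e =", round(math.e, 5), " 1/e =", round(1 / math.e, 5))
```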
This post reminds me of another estimate. Using the same mistake as a starting point it can be phrased like: "It's a p chance which I did n times, so it should be np if np<<1." This is because (1-p)^n = 1 - np + n(n-1)/2 p^2 - ..., and since np<<1 this is approximately 1-np.
However, I find this linearity more useful when combining small changes: a 1% increase followed by a 2% increase is approximately a 3% increase, since (1+p)(1+q)=1+p+q+pq and pq can be ignored in an approximation.
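To get a feel for how much the dropped terms matter, here is a sketch comparing the exact value with the first- and second-order truncations of the expansion above, plus the two-step percentage example. The numbers p = 0.001 and n = 20 are arbitrary choices with np much smaller than one.

```python
def exact_failure(p: float, n: int) -> float:
    """(1 - p)^n, the exact chance of n failures in a row."""
    return (1.0 - p) ** n

def first_order(p: float, n: int) -> float:
    """Keep only the linear term of the expansion."""
    return 1.0 - n * p

def second_order(p: float, n: int) -> float:
    """Keep the linear and quadratic terms of the expansion."""
    return 1.0 - n * p + n * (n - 1) / 2 * p**2

def combined_increase(p: float, q: float) -> float:
    """Exact combined growth from two successive fractional increases."""
    return (1 + p) * (1 + q) - 1

print(exact_failure(0.001, 20), first_order(0.001, 20), second_order(0.001, 20))
print(combined_increase(0.01, 0.02))  # close to 0.0302, versus 0.03 from adding
```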
I guess wisdom is about understanding, by observation, how "the wheels" roll in the machine. You're probably right but you always need to test different solutions. Cybernetics allows you to find out-of-the-box responses on a "plug n play" logic.
Many of you readers may instinctively know that this is wrong. If you flip a coin (50% chance) twice, you are not guaranteed to get heads. The probability of getting at least one heads is 75%. However, you may be surprised to learn that there is some truth to this statement; modifying the statement just slightly will yield not just a true statement, but a useful and interesting one.
It's a spoiler, though. If you want to figure this out as you read this article yourself, you should skip this and then come back. Ok, ready? Here it is:
It's a 1/n chance and I did it n times, so the probability should be... 63%.
Almost always.
The math:
Suppose you're flipping a coin and you want to find the probability of NOT flipping a single heads in a dozen flips. The math for this is fairly simple: the probability of not flipping a single heads is the same as the probability of flipping 12 tails, which is
$$\left(\frac{1}{2}\right)^{12} \approx 0.000244$$
The same can be done with this problem: you have something with a 1/10 chance and you want to do it 10 times. The probability of not getting it to happen even once is the same as the probability of it not happening 10 times in a row. So
$$\left(\frac{9}{10}\right)^{10} \approx 0.35$$
If you learned some fairly basic probability, I doubt this is that interesting to you. The interesting part comes when you look at the general formula: the probability of not getting what you want (I'll call this $1-p$, because $p$ would be the probability of the outcome you want) is
$$1 - p = \left(1 - \frac{1}{n}\right)^{n}$$
where $n$ in our case is 10, but in general is whatever number you hear when you hear the (incorrect) phrase "It's a one-in-n chance, and I did it n times, so it should be 100%".
Hold on a sec, that formula looks familiar...
"$\left(1 - \frac{1}{n}\right)^{n}$ ..." I thought to myself... "That looks familiar..." This is by no means obvious, but to people who have dealt with the number e recently, this looks quite similar to the limit that actually defines that number. This sort of pattern recognition led me to google what this limit is, and it turns out my intuition was close:
$$\lim_{n \to \infty} \left(1 - \frac{1}{n}\right)^{n} = \frac{1}{e}$$
$$\frac{1}{e} \approx 0.37$$
So it turns out: for any n that's large enough, if you do something with a 1/n chance of success n times, your probability of failure is always going to be roughly 37%, which means your probability of success will always be roughly 63%.
If something is a 1/n chance, and I do it n times, the probability should be... 63%.
Isn't that cool? I think that's cool.
What I'm NOT saying:
There are a couple ways to easily misinterpret this, so here are some caveats:
Spoiler for 5, 10, and 20: it's 67%, 65%, and 64% respectively
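Those spoiler values, and how quickly the probability settles toward 63%, can be reproduced with a few lines. A minimal sketch assuming independent tries; `p_success` is an illustrative name.

```python
import math

def p_success(n: int) -> float:
    """Chance of at least one success in n independent tries at 1/n each."""
    return 1.0 - (1.0 - 1.0 / n) ** n

limit = 1.0 - 1.0 / math.e  # the large-n limit, roughly 0.632
for n in (2, 5, 10, 20, 100, 1000):
    print(f"n={n:5d}  P={p_success(n):.4f}  gap to limit={p_success(n) - limit:.4f}")
```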