The exponential is because updates happen on a logarithmic scale. Do you have a simple variant of the problem in mind where we don't get exponentials? When I try to construct one, I have to start from "we don't get exponentials" and calculate how the probabilities of different hypotheses would have to converge over time.
If you're interested in making a follow-up post, I'd enjoy an analysis of the possibilities when the coin is not fair but is also not double sided. For example, if a coin has a 75% chance of turning up heads, how does the probability look? If a coin turns up heads 50 times in a row, it's probably neither fair nor a 75/25 coin, but if it turns up heads 10 times in a row I might guess it to be 75/25.
If you’re interested in making a follow-up post, I’d enjoy an analysis of the possibilities when the coin is not fair but is also not double sided. For example, if a coin has a 75% chance of turning up heads, how does the probability look?
I wrote this! The graphs of P(bias|flips) are fun. See this post starting at "computing a credible interval":
https://justinpombrio.net/2021/02/19/confidence-intervals.html
Sorry if you're viewing on mobile, I need to fix my styling.
A string of all-heads makes "the coin always flips heads" more likely than any other option, given equal priors, no matter how long the string is. So, what is your prior distribution of bias for "a coin someone tells you to flip"? I'd say 1000 : 10 : 1 : 0.001 for fair : biased a tiny but detectable amount : always heads : any other bias amount.
I've read that it's not possible to bias a coin - you can bias a coin toss if you know which way up it starts, but the coin itself will always be fair. But I confess that I don't know what assumptions they were making, so for all I know you could make something that would be recognizably a coin but that analysis wouldn't apply.
If one side is heavier, it will land that side down more often. You can see this with a household experiment of gluing a quarter to a circle of cardboard the same thickness, and then flipping it.
So I was thinking of this paper (pdf), which I misremembered somewhat - you can't make a coin biased for "toss and catch", but you can make it biased for "toss and let it bounce". (And for "spin on a table".) Given that, "can't bias a coin" is probably too strong, though it's in the title of the paper.
Props for suggesting an actual experiment! I didn't feel like doing it though :p
Suppose you flip a coin n times and get n heads in a row. What is the probability the next flip will land heads?
Suppose the coin is either a fair coin with one heads or a trick coin with two heads. Let X denote our training data of n heads. We want to find P(fair|X) and P(trick|X). Let P(trick)=ϵ. It follows that P(fair)=1−ϵ.
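We use Bayes' theorem:

$$P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}$$

$$
\begin{aligned}
P(\text{fair} \mid X) &= \frac{P(X \mid \text{fair})\,P(\text{fair})}{P(X)} \\
&= \frac{P(X \mid \text{fair})\,P(\text{fair})}{P(\text{fair})\,P(X \mid \text{fair}) + P(\text{trick})\,P(X \mid \text{trick})} \\
&= \frac{2^{-n}(1-\epsilon)}{(1-\epsilon)\,2^{-n} + \epsilon \cdot 1} \\
&= \frac{1-\epsilon}{(1-\epsilon) + 2^{n}\epsilon}
\end{aligned}
$$

$$
\begin{aligned}
P(\text{trick} \mid X) &= \frac{P(X \mid \text{trick})\,P(\text{trick})}{P(X)} \\
&= \frac{P(X \mid \text{trick})\,P(\text{trick})}{P(\text{trick})\,P(X \mid \text{trick}) + P(\text{fair})\,P(X \mid \text{fair})} \\
&= \frac{1 \cdot \epsilon}{\epsilon \cdot 1 + (1-\epsilon)\,2^{-n}} \\
&= \frac{\epsilon}{\epsilon + (1-\epsilon)\,2^{-n}} \\
&= \frac{2^{n}\epsilon}{(1-\epsilon) + 2^{n}\epsilon}
\end{aligned}
$$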
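As a rough sketch, the two posteriors can be computed directly in Python. The function names `posterior_trick`, `posterior_fair`, and `prob_next_heads` are illustrative choices, not part of the original post, and the code assumes only the two hypotheses above (fair coin or double-headed coin). The last function answers the opening question by weighting each coin's chance of heads by its posterior.

```python
def posterior_trick(n: int, eps: float) -> float:
    """P(trick | n heads in a row), given the prior P(trick) = eps.

    Uses the form eps / (eps + (1 - eps) * 2^-n), which underflows
    gracefully for large n instead of overflowing.
    """
    return eps / (eps + (1.0 - eps) * 2.0 ** -n)


def posterior_fair(n: int, eps: float) -> float:
    """P(fair | n heads in a row); the two hypotheses are exhaustive."""
    return 1.0 - posterior_trick(n, eps)


def prob_next_heads(n: int, eps: float) -> float:
    """P(next flip is heads | n heads in a row).

    The trick coin lands heads with probability 1, the fair coin with 1/2.
    """
    p_trick = posterior_trick(n, eps)
    return p_trick * 1.0 + (1.0 - p_trick) * 0.5
```

With no data (n = 0) the posterior is just the prior, and the predicted chance of heads is (1 + ϵ)/2.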
I have flipped perhaps a hundred coins. One was a double-headed trick coin. On the one hand, I flip trick coins with anomalously high frequency. On the other hand, double-headed trick coins are more likely to get flipped than fair coins. I estimate the flip frequency of double-headed trick coins to be one in ten thousand.
$$\epsilon = \frac{\text{trick coins}}{\text{total coins}} = \frac{1}{10{,}000} = 10^{-4}$$

What does it look like when we graph our probabilities with $\epsilon = 10^{-4}$?
For the first 5 heads you can remain confident you are flipping a regular coin. Around 10 heads the exponential takes off, and your confidence in the regular coin falls quickly. At 20 heads in a row you can be confident you are not flipping a regular coin.
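The same curve can be tabulated numerically. This is a small sketch that assumes the two-hypothesis posterior P(trick|X) = ϵ / (ϵ + (1−ϵ)2⁻ⁿ) from the derivation; the names `posterior_trick` and `EPS` are hypothetical.

```python
def posterior_trick(n: int, eps: float) -> float:
    """P(trick | n heads in a row) under the fair-vs-trick model."""
    return eps / (eps + (1.0 - eps) * 2.0 ** -n)


EPS = 1e-4  # prior from the post: one trick coin per ten thousand coins

# Print the posterior for both hypotheses at n = 0, 5, 10, 15, 20.
for n in range(0, 21, 5):
    p_trick = posterior_trick(n, EPS)
    print(f"n={n:2d}  P(trick|X)={p_trick:.4f}  P(fair|X)={1.0 - p_trick:.4f}")
```

Under this prior the trick hypothesis stays below 1% at 5 heads, sits at roughly 9% at 10 heads, and exceeds 99% at 20 heads.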
The Inflection Point
The inflection point occurs when the probabilities are equal.
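$$
\begin{aligned}
P(\text{fair} \mid X) &= P(\text{trick} \mid X) \\
\frac{1-\epsilon}{(1-\epsilon) + 2^{n}\epsilon} &= \frac{2^{n}\epsilon}{(1-\epsilon) + 2^{n}\epsilon} \\
1-\epsilon &= 2^{n}\epsilon \\
\frac{1}{\epsilon} - 1 &= 2^{n} \\
\frac{1}{\epsilon} &= 2^{n} + 1 \\
\epsilon &= \frac{1}{2^{n}+1}
\end{aligned}
$$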
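Solving the same condition for n instead of ϵ gives n = log₂(1/ϵ − 1), the (generally non-integer) number of heads at which the two hypotheses are equally likely. A minimal sketch, with `crossover_flips` as a hypothetical helper name:

```python
import math


def crossover_flips(eps: float) -> float:
    """Number of consecutive heads at which P(fair|X) == P(trick|X).

    Obtained by solving 1 - eps = 2^n * eps for n. Requires 0 < eps < 1.
    """
    return math.log2(1.0 / eps - 1.0)


# For the post's prior of 1e-4 the crossover falls between 13 and 14 heads.
print(crossover_flips(1e-4))
```

For ϵ = 10⁻⁴ the crossover lies between 13 and 14, so the 14th consecutive head is the first at which the trick coin is the more likely hypothesis.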
A linear increase in your data has predictive power equal to an exponential increase in the strength of your prior.
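One way to see this is in log-odds: under the two-hypothesis model, each additional head adds exactly one bit to log₂ of the odds trick : fair, while the prior only sets the starting point. The sketch below uses hypothetical helper names and compares a few priors chosen purely for illustration.

```python
import math


def log2_odds_trick(n: int, eps: float) -> float:
    """log2 of the posterior odds trick:fair after n heads in a row.

    Posterior odds = prior odds * 2^n, so the log-odds are linear in n.
    """
    return math.log2(eps / (1.0 - eps)) + n


def heads_to_even_odds(eps: float) -> float:
    """Heads needed to bring the odds to 1:1 (log-odds of zero)."""
    return -log2_odds_trick(0, eps)


# Illustrative priors: each is many times stronger than the one before.
for eps in (1e-2, 1e-4, 1e-8):
    print(f"eps={eps:g}  heads to even odds={heads_to_even_odds(eps):.2f}")
```

Making the prior against the trick coin ten thousand times stronger (10⁻⁴ to 10⁻⁸) costs only about 13 more heads.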