gjm comments on AlphaGo versus Lee Sedol - Less Wrong
And judging by how much sooner he used up his time, he was more cautious today. He still lost and probably also took a psychological hit, so my estimate of the chances of Lee Sedol winning the whole match has now gone down to ~5%.
Ignoring psychology and just looking at the results:
1. Delta-function prior at p=1/2 -- i.e., completely ignore the first two games and assume they're equally matched. Lee Sedol wins 12.5% of the time.
2. Laplace's law of succession gives a point estimate of 1/4 for Lee Sedol's win probability now. That means Lee Sedol wins about 1.6% of the time. [EDITED to add:] Er, no, actually if you're using the rule of succession you should apply it afresh after each game, and then the result is the same as with a uniform prior on [0,1] as in #3 below. Thanks to Unnamed for catching my error.
3. Uniform-on-[0,1] prior for Lee Sedol's win probability means posterior density is f(p)=3(1-p)^2, which means he wins the match exactly 5% of the time.
4. I think most people expected it to be pretty close. Take a prior density proportional to p(1-p), i.e. f(p)=6p(1-p) once normalized, which favours middling probabilities but not too outrageously; then he wins the match about 7.1% of the time.
So ~5% seems reasonable without bringing psychological factors into it.
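For anyone who wants to check the arithmetic, here is a minimal sketch in Python. It assumes Lee Sedol is down 0-2 in a best-of-five, so he must win the next three games, and it treats the uniform and "middling" priors as Beta(1,1) and Beta(2,2). The function name `match_win_prob_beta` and its parameters are mine, purely for illustration.

```python
from fractions import Fraction

def match_win_prob_beta(a, b, losses=2, wins_needed=3):
    """P(winning the next `wins_needed` games in a row), given a Beta(a, b)
    prior on the per-game win probability and `losses` observed losses.

    The posterior is Beta(a, b + losses); the answer is its third moment
    E[p^3] when wins_needed=3, computed as a product of exact fractions.
    """
    alpha, beta = a, b + losses
    prob = Fraction(1)
    for k in range(wins_needed):
        prob *= Fraction(alpha + k, alpha + beta + k)
    return prob

# Fixed win probabilities (no further updating): just p cubed.
fixed_half = Fraction(1, 2) ** 3      # delta-function prior at p = 1/2
fixed_quarter = Fraction(1, 4) ** 3   # point estimate 1/4 used without updating

# Beta priors updated on the two losses.
uniform = match_win_prob_beta(1, 1)   # uniform prior on [0, 1]
middling = match_win_prob_beta(2, 2)  # prior proportional to p(1 - p)

for name, value in [("p = 1/2", fixed_half), ("p = 1/4", fixed_quarter),
                    ("uniform", uniform), ("middling", middling)]:
    print(f"{name}: {value} = {float(value):.4f}")
```

Exact fractions are used so the results can be compared directly with 1/8, 1/64, 1/20 and 1/14 without rounding getting in the way.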
Laplace's law of succession gives Lee Sedol a 5% chance of winning the match (and AlphaGo a 50% chance of a 5-0 sweep). It gives him a 1/4 chance of winning game 3, a 2/5 chance of winning game 4 conditional on winning game 3, and a 1/2 chance of winning game 5 conditional on winning games 3&4. It's important to keep updating the probability after each game, because 1/4 is just a point estimate for a distribution of true win probabilities and the cases where he wins game 3 tend to come from the part of the distribution where his true win probability is larger than 1/4. It is not a coincidence that Laplace's law (with updating) gives the same result as #3 - Laplace's law can be derived from assuming a uniform prior.
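The game-by-game updating described here can be sketched as follows. This is only an illustration, assuming the rule of succession in its usual form (wins + 1) / (games + 2); the helper name `streak_prob` is hypothetical.

```python
from fractions import Fraction

def streak_prob(wins, losses, streak):
    """Probability that a player with record `wins`-`losses` wins the next
    `streak` games, re-applying the rule of succession after each game."""
    prob = Fraction(1)
    for _ in range(streak):
        # Rule of succession: (wins + 1) / (games played + 2).
        prob *= Fraction(wins + 1, wins + losses + 2)
        wins += 1  # condition on having won this game before the next step
    return prob

# Lee Sedol is 0-2 and needs three straight wins: 1/4 * 2/5 * 1/2.
lee_comeback = streak_prob(0, 2, 3)

# AlphaGo is 2-0; a 5-0 sweep needs three more wins: 3/4 * 4/5 * 5/6.
alphago_sweep = streak_prob(2, 0, 3)

print(lee_comeback, alphago_sweep)
```

Multiplying the conditional probabilities this way gives the same number as integrating p^3 against the uniform-prior posterior, which is the point being made about the two approaches agreeing.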
Hmm, I explicitly considered whether using LLS we should update after each new game and decided it was a mistake, but on reflection you're right. (Of course what's really right is to have an actual prior and do Bayesian updates, which is one reason why I didn't consider it at greater length and maybe get the right answer :-).)
Sorry about that.