Unnamed comments on AlphaGo versus Lee Sedol - Less Wrong Discussion

17 Post author: gjm 09 March 2016 12:22PM

Comments (183)

Comment author: gjm 10 March 2016 12:43:26PM * 14 points

Ignoring psychology and just looking at the results:

  1. Delta-function prior at p=1/2 -- i.e., completely ignore the first two games and assume they're equally matched. Lee Sedol wins 12.5% of the time.

  2. Laplace's law of succession gives a point estimate of 1/4 for Lee Sedol's win probability now. That means Lee Sedol wins about 1.6% of the time. [EDITED to add:] Er, no, actually if you're using the rule of succession you should apply it afresh after each game, and then the result is the same as with a uniform prior on [0,1] as in #3 below. Thanks to Unnamed for catching my error.

  3. Uniform-on-[0,1] prior for Lee Sedol's win probability means posterior density is f(p)=3(1-p)^2, which means he wins the match exactly 5% of the time.

  4. I think most people expected it to be pretty close. Take a prior density f(p)=6p(1-p), i.e. Beta(2,2), which favours middling probabilities but not too outrageously; then he wins the match about 7.1% of the time.

So ~5% seems reasonable without bringing psychological factors into it.
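The four figures above can be checked with exact arithmetic. What follows is a minimal sketch, not anything from the original comment: it assumes the match stands at 2-0 to AlphaGo in a best-of-five, so Lee Sedol must win all three remaining games, and it treats the priors in #3 and #4 as Beta(1,1) and Beta(2,2) respectively. The function names are made up for illustration.

```python
from fractions import Fraction


def fixed_p_match_win(p, games_needed=3):
    """Chance of winning every remaining game when p is treated as known."""
    return Fraction(p) ** games_needed


def beta_prior_match_win(a, b, losses=2, games_needed=3):
    """E[p^games_needed] under a Beta(a, b) prior updated on `losses` losses.

    The posterior is Beta(a, b + losses); its k-th raw moment is the
    product over i = 0..k-1 of (a + i) / (a + b + losses + i).
    """
    b_post = b + losses
    result = Fraction(1)
    for i in range(games_needed):
        result *= Fraction(a + i, a + b_post + i)
    return result


results = {
    "1. delta at 1/2": fixed_p_match_win(Fraction(1, 2)),
    "2. fixed point estimate 1/4": fixed_p_match_win(Fraction(1, 4)),
    "3. uniform prior": beta_prior_match_win(1, 1),
    "4. Beta(2, 2) prior": beta_prior_match_win(2, 2),
}

for label, prob in results.items():
    print(f"{label}: {prob} = {float(prob):.4f}")
```

Using `Fraction` keeps the results exact (1/8, 1/64, 1/20 and 1/14), so the "exactly 5%" in #3 can be confirmed without floating-point rounding.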

Comment author: Unnamed 10 March 2016 11:41:04PM 7 points

Laplace's law of succession gives Lee Sedol a 5% chance of winning the match (and AlphaGo a 50% chance of a 5-0 sweep). It gives him a 1/4 chance of winning game 3, a 2/5 chance of winning game 4 conditional on winning game 3, and a 1/2 chance of winning game 5 conditional on winning games 3 and 4; the product is 1/20. It's important to keep updating the probability after each game, because 1/4 is just a point estimate for a distribution of true win probabilities, and the cases where he wins game 3 tend to come from the part of the distribution where his true win probability is larger than 1/4. It is not a coincidence that Laplace's law (with updating) gives the same result as #3: Laplace's law can be derived from assuming a uniform prior.
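The game-by-game updating described here can be sketched as follows. This is an illustrative sketch under the same assumption of a 2-0 score after two games; `succession_estimate` and `run_probability` are hypothetical helper names, not anything from the thread.

```python
from fractions import Fraction


def succession_estimate(wins, games):
    """Laplace's rule of succession: (wins + 1) / (games + 2)."""
    return Fraction(wins + 1, games + 2)


def run_probability(wins, games, run_length):
    """Chance of winning the next `run_length` games in a row,
    re-applying the rule of succession after every game."""
    prob = Fraction(1)
    for _ in range(run_length):
        prob *= succession_estimate(wins, games)
        wins += 1   # condition on this game having been won
        games += 1
    return prob


# Lee Sedol has 0 wins from 2 games; AlphaGo has 2 wins from 2 games.
lee_wins_match = run_probability(wins=0, games=2, run_length=3)
alphago_sweep = run_probability(wins=2, games=2, run_length=3)

# For contrast: freezing the estimate at 1/4 instead of updating it.
lee_without_updating = succession_estimate(0, 2) ** 3

print(lee_wins_match, alphago_sweep, lee_without_updating)
```

With updating, the products are 1/4 x 2/5 x 1/2 = 1/20 for Lee Sedol and 3/4 x 4/5 x 5/6 = 1/2 for an AlphaGo sweep; without updating, the frozen estimate gives 1/64, which is the 1.6% figure.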

Comment author: gjm 10 March 2016 11:59:31PM 5 points

Hmm, I explicitly considered whether, using LLS, we should update after each new game, and decided that doing so would be a mistake; but on reflection you're right. (Of course what's really right is to have an actual prior and do Bayesian updates, which is one reason why I didn't think about it at greater length, which might have got me to the right answer :-).)

Sorry about that.