skeptical_lurker comments on AlphaGo versus Lee Sedol - Less Wrong Discussion
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (183)
Naively, pruning seems like it would cause a mistake at move 77 (allowing the brilliant follow-up at 78), not at 79 (by which point you can't accidentally prune 78, because it's already on the board). But people have been saying that it made a mistake at 79.
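To make the pruning intuition concrete, here's a minimal sketch (with hypothetical move names and numbers, not AlphaGo's actual code) of how a policy prior can prune a low-probability reply out of the search: moves whose prior falls below a cutoff get little or no exploration, so a brilliant but unlikely move like 78 may never be examined before it's played.

```python
# Toy sketch of prior-based pruning in tree search.
# Move names and probabilities are hypothetical.

def expand(priors, cutoff=0.01):
    """Return only the moves the search will actually explore."""
    return {move: p for move, p in priors.items() if p >= cutoff}

# Hypothetical policy-net priors over replies at some position;
# "wedge_78" stands in for a brilliant but low-probability move.
priors = {"A": 0.55, "B": 0.30, "C": 0.14, "wedge_78": 0.007}
explored = expand(priors)
# "wedge_78" falls below the cutoff, so the search never reads it out.
```

Once 78 is on the board, though, it can't be pruned away, which is why a blunder at 79 is the more puzzling case.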
I don't recall much detail about AlphaGo, but I thought the self-play training was there to improve the policy net? If the policy net was only trained on amateur games, what was it learning from self-play?
Of course, but I can't remember which was the other very low-probability move, so perhaps it was one of the later moves in that sequence?
I thought the self-play only trained the value net (because they want the policy net to predict human moves, not its own moves), but I might be remembering incorrectly. Pity that the paper is behind a paywall.
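Either way, the value-net part of the story is that self-play games supply (position, outcome) pairs, and the value net is regressed against the outcomes. A toy tabular stand-in (hypothetical positions and data; the real thing is a deep conv net) shows the objective:

```python
# Toy stand-in for value-net training: regress positions against
# self-play game outcomes (+1 win / -1 loss) by minimizing squared error.

def train_value(samples, steps=200, lr=0.1):
    """Fit a per-position value estimate to observed outcomes."""
    values = {}
    for _ in range(steps):
        for pos, outcome in samples:
            v = values.get(pos, 0.0)
            # gradient step on (outcome - v)^2 / 2
            values[pos] = v + lr * (outcome - v)
    return values

# Hypothetical self-play data: position "a" is mostly won, "b" is lost.
samples = [("a", +1), ("a", +1), ("a", -1), ("b", -1)]
values = train_value(samples)
# values["a"] ends up positive, values["b"] negative.
```

If that's right, the policy net's role in self-play would just be to generate the games, with the regression target being the final result rather than the moves themselves.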