SquirrelInHell comments on AlphaGo versus Lee Sedol - Less Wrong Discussion

17 Post author: gjm 09 March 2016 12:22PM

Comments (183)

Comment author: Vaniver 09 March 2016 02:35:27PM 10 points

Several things I thought were interesting:

  1. The commentator (on the DeepMind channel) calling out several of AlphaGo's moves as conservative. Essentially, it would play an additional stone to settle or augment some group that he wouldn't necessarily have played around. What I'm curious about is how much this reflects an attempt by AlphaGo to conserve computational resources. "I think move A is a 12 point swing, and move B is a 10 point swing, but move B narrows the search tree for future moves in a way that I think will net me at least 2 more points." (It wouldn't be verbalized like that, since it's not thinking verbally, but you can get this effect naturally from the tree search and position evaluator.)

  2. Both players took a long time to play "obvious" moves. (Typically, by this I mean something like a response to a forced move.) Lee sometimes didn't--there were a handful of moves he played immediately after AlphaGo's move--but I was still surprised by the amount of thought that went into some of the moves. This may be typical for tournament play--I haven't watched any live before this.

  3. AlphaGo's willingness to play aggressively and get involved in big fights with Lee, and then not lose. I'm not sure that all the fights developed to AlphaGo's advantage, but evidently enough of them did, and by a large enough margin.

  4. I somewhat regret Lee not playing the game out to the end; it would have been nice to know the actual score. (I'm sure estimates will be available soon, if not already.)
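
A toy illustration of the resource argument in point 1. This is not something from the broadcast, and it is not how AlphaGo actually searches (it uses Monte Carlo tree search guided by neural networks rather than exhaustive enumeration). The idea is only that if a searcher can afford a fixed number of positions, the branching factor decides how many plies it can read out in full. The function name `max_full_depth`, the branching factors, and the node budget are all made up for this sketch.

```python
def max_full_depth(branching: int, node_budget: int) -> int:
    """Return the deepest ply d such that a complete tree with the given
    branching factor and depth d fits within node_budget nodes.

    The root counts as one node. This models exhaustive enumeration only;
    it is a hypothetical simplification, not AlphaGo's search.
    """
    if branching < 1 or node_budget < 1:
        raise ValueError("branching and node_budget must be positive")

    depth = 0
    level_size = 1  # number of nodes at the current depth
    total = 1       # nodes counted so far (just the root)
    while True:
        level_size *= branching
        if total + level_size > node_budget:
            return depth
        total += level_size
        depth += 1


if __name__ == "__main__":
    budget = 1_000_000           # hypothetical number of positions we can afford
    for b in (10, 4):            # hypothetical "wide" and "narrowed" positions
        print(f"branching={b}: full depth {max_full_depth(b, budget)}")
```

With these invented numbers, cutting the branching factor from 10 to 4 lets the same budget read 9 plies in full instead of 5, which is the sense in which a locally smaller move could pay for itself later.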

Comment author: SquirrelInHell 10 March 2016 01:40:49AM 3 points

> The commentator (on the DeepMind channel) calling out several of AlphaGo's moves as conservative. Essentially, it would play an additional stone to settle or augment some group that he wouldn't necessarily have played around. What I'm curious about is how much this reflects an attempt by AlphaGo to conserve computational resources. "I think move A is a 12 point swing, and move B is a 10 point swing, but move B narrows the search tree for future moves in a way that I think will net me at least 2 more points."

If the search tree is narrowed, it is narrowed for both players, so why would it be a gain?

Comment author: Vaniver 10 March 2016 01:45:15AM 6 points

> If the search tree is narrowed, it is narrowed for both players, so why would it be a gain?

There may be an asymmetry between successful modes of attack and successful modes of defense--if there's a narrow thread that white can win through, and a thick thread that black can threaten through, then white wins computationally by closing off that tree.

But thanks for asking: I was confused somewhat because I was thinking about AI vs. human games, but the AI is trained mostly on human vs. human and AI vs. AI games, neither of which will have the AI vs. human feature. Well, except for bots playing on KGS.

Comment author: Vaniver 21 March 2016 06:22:56PM 0 points

> But thanks for asking: I was confused somewhat because I was thinking about AI vs. human games, but the AI is trained mostly on human vs. human and AI vs. AI games, neither of which will have the AI vs. human feature. Well, except for bots playing on KGS.

As it turns out, Fan Hui started working with DeepMind on AlphaGo after their match, and played a bunch of games against it as it improved. So it did have a number of AI vs. human training games.