V_V comments on AlphaGo versus Lee Sedol - Less Wrong Discussion

17 Post author: gjm 09 March 2016 12:22PM

Comments (183)

Comment author: Vaniver 09 March 2016 02:35:27PM 10 points

Several things I thought were interesting:

  1. The commentator (on the DeepMind channel) calling out several of AlphaGo's moves as conservative. Essentially, it would play an additional stone to settle or augment some group that he wouldn't necessarily have played around. What I'm curious about is how much this reflects an attempt by AlphaGo to conserve computational resources. "I think move A is a 12 point swing, and move B is a 10 point swing, but move B narrows the search tree for future moves in a way that I think will net me at least 2 more points." (It wouldn't be verbalized like that, since it's not thinking verbally, but you can get this effect naturally from the tree search and position evaluator.)

  2. Both players took a long time to play "obvious" moves. (Typically, by this I mean something like a response to a forced move.) 이 sometimes didn't--there were a handful of moves he played immediately after AlphaGo's move--but I was still surprised by the amount of thought that went into some of the moves. This may be typical for tournament play--I haven't watched any live before this.

  3. AlphaGo's willingness to play aggressively and get involved in big fights with 이, and then not lose. I'm not sure that all the fights developed to AlphaGo's advantage, but evidently enough of them did, and by a large enough margin.

  4. I somewhat regret 이 not playing the game out to the end; it would have been nice to know the actual score. (I'm sure estimates will be available soon, if not already.)

Comment author: V_V 09 March 2016 04:29:21PM 7 points

What I'm curious about is how much this reflects an attempt by AlphaGo to conserve computational resources.

If I understand correctly, at least according to the Nature paper, it doesn't explicitly optimize for this. Game-playing software is often perceived as playing "conservatively"; this is a general property of minimax search, and in the limit the Nash equilibrium consists of maximally conservative strategies.
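To make the minimax point concrete, here is a minimal sketch on a toy game tree. It is not AlphaGo's actual search (the Nature paper describes Monte Carlo tree search guided by policy and value networks); the tree, the scores, and the names `minimax`, `best_move`, and `toy_position` are all made up for illustration. It only shows why a minimax-style player prefers the move with the best guaranteed result over the move with the biggest possible upside.

```python
from typing import Union

# A leaf is an int (final score from the maximizing player's point of view);
# an inner node is a list of child positions. Hypothetical toy representation.
Tree = Union[int, list]


def minimax(node: Tree, maximizing: bool) -> int:
    """Value of `node` assuming both sides play optimally from here on."""
    if isinstance(node, int):
        return node
    values = [minimax(child, not maximizing) for child in node]
    return max(values) if maximizing else min(values)


def best_move(root: list) -> int:
    """Index of the root move whose worst-case (opponent-chosen) value is highest."""
    return max(range(len(root)), key=lambda i: minimax(root[i], maximizing=False))


# Assumed position: move 0 wins by 12 if the opponent errs but loses by 5 otherwise;
# move 1 wins by 2 or 3 whatever the opponent replies.
toy_position = [[12, -5], [3, 2]]
```

In this toy position, `best_move(toy_position)` picks move 1: the opponent holds move 0 to -5, while move 1 guarantees at least +2. To a human commentator that choice looks "conservative", even though it is simply the move with the better worst case.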

but I was still surprised by the amount of thought that went into some of the moves.

Maybe these obvious moves weren't so obvious at that level.

Comment author: Error 09 March 2016 06:16:03PM 3 points

I don't know about that level, but I can think of at least one circumstance where I think far longer than would be expected over a forced move. If I've worked out the forced sequence in my head and determined that the opponent doesn't gain anything by it, but they play it anyway, I start thinking "Danger, Danger, they've seen something I haven't and I'd better re-evaluate."

Most of the time it's nothing and they just decided to play out the position earlier than I would have. But every so often I discover a flaw in the "forced" defense and have to start scrabbling for an alternative.

Comment author: WalterL 09 March 2016 06:34:51PM 4 points

This is very true in Go. If you are both playing down a sequence of moves without hesitation, anticipating a payoff, one of you is wrong (kind of. It's hard to put in words.) It is always worth making double sure that it isn't you.

Comment author: Vaniver 09 March 2016 07:20:34PM 2 points

Maybe these obvious moves weren't so obvious at that level.

Sure. And I'm pretty low as amateurs go--what I found surprising was that there were ~6 moves where I thought "obviously play X," and 이 immediately played X in half of them and spent 2 minutes to play X in the other half of them. It wasn't clear to me if 이 was precomputing something he would need later, or was worried about something I wasn't, or something else.

Most of the time I was thinking something like "well, I would play Y, but I'm pretty unconfident that's the right move" and then 이 or AlphaGo would play something that was, in retrospect, superior to Y, or I was thinking something like "I have only the vaguest sense of what to do in this situation." So I guess I'm pretty well-calibrated, even if my skill isn't that great.