jacob_cannell comments on [Link] AlphaGo: Mastering the ancient game of Go with Machine Learning

Post author: ESRogs 27 January 2016 09:04PM


Comment author: bogus 28 January 2016 06:17:42PM, 12 points

How big a deal is this? What, if anything, does it signal about when we get smarter than human AI?

It shows that Monte Carlo tree search meshes remarkably well with neural-network-driven evaluation ("value networks") and decision pruning/policy selection ("policy networks"). This means that if you have a planning task to which MCTS can usefully be applied, enough data to train networks for state evaluation and policy selection, and substantial computational power (a distributed cluster, in AlphaGo's case), then you can significantly improve performance on your task (from "strong amateur" to "human champion" level). It's not an AGI-complete result, however, any more than Deep Blue or TD-Gammon were AGI-complete.
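
For the curious, here is a minimal Python sketch of that loop: selection via a PUCT-style rule, expansion with policy-network priors, a value-network evaluation at the leaf in place of a full rollout, and negamax backup. Everything here (Node, mcts_move, the CountingGame toy, the untrained uniform_policy/zero_value stand-ins) is my own illustrative naming, not AlphaGo's actual code; the real system also mixes fast rollouts into the leaf evaluation and runs distributed.

    import math

    class Node:
        """One search-tree node; statistics live on the node an action leads to."""
        def __init__(self, prior):
            self.prior = prior        # move probability from the policy network
            self.visits = 0
            self.value_sum = 0.0
            self.children = {}        # action -> Node

        def value(self):
            return self.value_sum / self.visits if self.visits else 0.0

    def select_child(node, c_puct=1.0):
        # PUCT rule: prefer moves with high average value, plus an
        # exploration bonus for high-prior, rarely visited moves.
        total = sum(c.visits for c in node.children.values())
        return max(
            node.children.items(),
            key=lambda kv: kv[1].value()
            + c_puct * kv[1].prior * math.sqrt(total + 1) / (1 + kv[1].visits),
        )

    def mcts_move(root_state, policy_net, value_net, n_simulations=500):
        root = Node(prior=1.0)
        for _ in range(n_simulations):
            node, state, path = root, root_state.copy(), [root]
            # 1. Selection: walk down the tree with the PUCT rule.
            while node.children:
                action, node = select_child(node)
                state.apply(action)
                path.append(node)
            if state.is_terminal():
                # The player to move at a terminal state has just lost.
                leaf_value = -1.0
            else:
                # 2. Expansion: the policy network sets the child priors.
                for action, p in policy_net(state).items():
                    node.children[action] = Node(prior=p)
                # 3. Evaluation: the value network replaces a rollout
                # (value is from the side-to-move's perspective).
                leaf_value = value_net(state)
            # 4. Backup: flip the sign each ply, since players alternate.
            for n in reversed(path):
                n.visits += 1
                leaf_value = -leaf_value
                n.value_sum += leaf_value
        # Play the most-visited move, as AlphaGo does.
        return max(root.children.items(), key=lambda kv: kv[1].visits)[0]

    # Toy stand-ins so the sketch runs end to end (all hypothetical):
    class CountingGame:
        """Take 1 or 2 stones; whoever takes the last stone wins."""
        def __init__(self, stones=7):
            self.stones = stones
        def copy(self):
            return CountingGame(self.stones)
        def apply(self, take):
            self.stones -= take
        def is_terminal(self):
            return self.stones == 0
        def legal_moves(self):
            return [m for m in (1, 2) if m <= self.stones]

    def uniform_policy(state):   # stand-in for a trained policy network
        moves = state.legal_moves()
        return {m: 1.0 / len(moves) for m in moves}

    def zero_value(state):       # stand-in for a trained value network
        return 0.0

    print(mcts_move(CountingGame(7), uniform_policy, zero_value))
    # optimal play takes 1 stone, leaving a multiple of 3

Even with untrained (uniform/zero) stand-ins this reduces to plain MCTS on terminal rewards; the point of the trained networks is to make the same loop work when the game is far too deep and wide for terminal signals alone.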

The "training data" factor is a biggie; we lack this kind of data entirely for things like automated theorem proving, which would otherwise be quite amenable to this 'planning search + complex learned heuristics' approach. In particular, writing provably-correct computer code is a minor variation on automated theorem proving. (Neural networks can already write incorrect code, but this is not good enough if you want a provably Friendly AGI.)

Comment author: jacob_cannell 29 January 2016 05:48:26PM, 4 points

Humans need extensive training to become competent, as will AGI, and this should have been obvious to anyone with a good understanding of ML.