[Link] AlphaGo: Mastering the ancient game of Go with Machine Learning

ESRogs

DeepMind's go AI, called AlphaGo, has beaten the European champion with a score of 5-0. A match against top ranked human, Lee Se-dol, is scheduled for March.

Games are a great testing ground for developing smarter, more flexible algorithms that have the ability to tackle problems in ways similar to humans. Creating programs that are able to play games better than the best humans has a long history

[...]

But one game has thwarted A.I. research thus far: the ancient game of Go.

http://googleresearch.blogspot.com/2016/01/alphago-mastering-ancient-game-of-go.html

DeepMind's go AI, called AlphaGo, has beaten the European champion with a score of 5-0. A match against top ranked human, Lee Se-dol, is scheduled for March.

Games are a great testing ground for developing smarter, more flexible algorithms that have the ability to tackle problems in ways similar to humans. Creating programs that are able to play games better than the best humans has a long history

[...]

But one game has thwarted A.I. research thus far: the ancient game of Go.

http://googleresearch.blogspot.com/2016/01/alphago-mastering-ancient-game-of-go.html

I think what this result says is thus: "Any tasks humans can do, an AI can now learn to do better, given a sufficient source of training data."

Yes, but that would likely require an extremely large amount of training data because to prepare actions for many kind of situations you'd have an exponential blow up to cover many combinations of many possibilities, and hence the model would need to be huge as well. It also would require high-quality data sets with simple correction signals in order to work, which are expensive to produce.

I think, above all for building a real-time AI you need reuse of concepts so that abstractions can be recombined and adapted to new situations; and for concept-based predictions (reasoning) you need one-shot learning so that trains of thoughts can be memorized and built upon. In addition, the entire network needs to learn somehow to determine which parts of the network in the past were responsible for current reward signals which are delayed and noisy. If there is a simple and fast solutions to this, then AGI could be right around the corner. If not, it could take several decades of research.

In addition, the entire network needs to learn somehow to determine which parts of the network in the past were responsible for current reward signals which are delayed and noisy.

This is a well-known problem, called reinforcement learning. It is a significant component in the reported results. (What happens in practice is that a network's ability to assign "credit" or "blame" for reward signals falls off exponentially with increasing delay. This is a significant limitation, but reinforcement learning is nevertheless very helpful given tight feedback loops.)

24

[Link] AlphaGo: Mastering the ancient game of Go with Machine Learning

24

24

24

[Link] AlphaGo: Mastering the ancient game of Go with Machine Learning

24

24