Lai has created an artificial intelligence machine called Giraffe that has taught itself to play chess by evaluating positions much as humans do, in an entirely different way from conventional chess engines.
Straight out of the box, the new machine plays at the same level as the best conventional chess engines, many of which have been fine-tuned over years of development. On a human level, it is equivalent to FIDE International Master status, placing it within the top 2.2 percent of tournament chess players.
The technology behind Lai’s new machine is a neural network. [...] His network consists of four layers that together examine each position on the board in three different ways.
The first looks at the global state of the game, such as the number and type of pieces on each side, which side is to move, castling rights and so on. The second looks at piece-centric features such as the location of each piece on each side, while the third maps the squares that each piece attacks and defends.
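For concreteness, here is a minimal sketch of what such a three-part input encoding could look like, using the python-chess library. The function names and exact features are my own illustration; the paper's actual feature list is longer and uses fixed-size slots per piece.

```python
import chess

PIECE_ORDER = [chess.PAWN, chess.KNIGHT, chess.BISHOP,
               chess.ROOK, chess.QUEEN, chess.KING]

def global_features(board):
    """Group 1: global state -- side to move, castling rights, material counts."""
    feats = [1.0 if board.turn == chess.WHITE else 0.0]
    for color in (chess.WHITE, chess.BLACK):
        feats.append(1.0 if board.has_kingside_castling_rights(color) else 0.0)
        feats.append(1.0 if board.has_queenside_castling_rights(color) else 0.0)
        feats.extend(len(board.pieces(pt, color)) for pt in PIECE_ORDER)
    return feats

def piece_features(board):
    """Group 2: piece-centric -- coordinates of each piece.
    (The paper uses fixed slots with presence flags; this
    variable-length version just shows the idea.)"""
    feats = []
    for color in (chess.WHITE, chess.BLACK):
        for pt in PIECE_ORDER:
            for sq in board.pieces(pt, color):
                feats.extend([chess.square_file(sq) / 7.0,
                              chess.square_rank(sq) / 7.0])
    return feats

def attack_defend_maps(board):
    """Group 3: for every square, whether each side attacks/defends it."""
    return [1.0 if board.attackers(color, sq) else 0.0
            for color in (chess.WHITE, chess.BLACK)
            for sq in chess.SQUARES]
```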
[...]
Lai generated his dataset by randomly choosing five million positions from a database of computer chess games. He then created greater variety by adding a random legal move to each position before using it for training. In total he generated 175 million positions in this way.
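The generation procedure is simple to sketch with python-chess: sample positions from recorded games, then perturb each with one random legal move. `pgn_path` and `samples_per_game` are my own placeholder parameters, not from the paper.

```python
import random
import chess
import chess.pgn

def perturbed_positions(pgn_path, samples_per_game=5, rng=random.Random(0)):
    """Yield FENs: positions sampled from games in a PGN file, each
    perturbed by one random legal move, as the article describes."""
    with open(pgn_path) as f:
        while True:
            game = chess.pgn.read_game(f)
            if game is None:
                break
            boards = []
            board = game.board()
            for move in game.mainline_moves():
                board.push(move)
                boards.append(board.copy())
            if not boards:
                continue
            for board in rng.sample(boards, min(samples_per_game, len(boards))):
                moves = list(board.legal_moves)
                if moves:
                    board.push(rng.choice(moves))  # the one random legal move
                    yield board.fen()
```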
[...]
One disadvantage of Giraffe is that neural networks are much slower than other types of data processing. Lai says Giraffe takes about 10 times longer than a conventional chess engine to search the same number of positions.
But even with this disadvantage, it is competitive. “Giraffe is able to play at the level of an FIDE International Master on a modern mainstream PC,” says Lai. By comparison, the top engines play at super-Grandmaster level.
[...]
Ref: arxiv.org/abs/1509.01549: Giraffe: Using Deep Reinforcement Learning to Play Chess
http://www.technologyreview.com/view/541276/deep-learning-machine-teaches-itself-chess-in-72-hours-plays-at-international-master/
H/T http://lesswrong.com/user/Qiaochu_Yuan
This is pitched as unsupervised learning from scratch, but on page 18 the paper describes training the network to mimic another engine's board evaluation function.
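That bootstrapping step amounts to supervised regression: fit the network's output to the other engine's static evaluation before any reinforcement learning. A toy version, with a linear model standing in for the network (my simplification, not the paper's setup):

```python
import numpy as np

def pretrain_to_mimic(features, teacher_scores, lr=1e-3, epochs=10):
    """Supervised bootstrap: fit an evaluation f(x) = w.x + b to match
    an existing engine's scores by mean-squared error."""
    X = np.asarray(features, dtype=np.float64)        # (N, d) position features
    y = np.asarray(teacher_scores, dtype=np.float64)  # (N,) teacher evaluations
    w = np.zeros(X.shape[1])
    b = 0.0
    for _ in range(epochs):
        err = (X @ w + b) - y              # gradient of 0.5 * mean(err^2)
        w -= lr * (X.T @ err) / len(y)
        b -= lr * err.mean()
    return w, b
```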
I don't follow how the various networks and training regimes relate to each other. If the supervised training only produced the features listed on pages 18-19, that doesn't seem like a big deal, because those are very basic chess concepts that exhibit very little expertise. (It did jump out at me that the last item, the lowest-valued piece among the attackers of a square, is much more complicated than any of the other features.) And the point is to make a much better evaluation function and trade that off against brute-force search. But I'm not sure what's going on.
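That last feature is qualitatively different because it composes a full attack computation with a minimum over piece values. A sketch of what computing it involves, using python-chess and nominal piece values (my choice of values, for illustration only):

```python
import chess

# Nominal values purely for ordering attackers; not learned weights.
VALUE = {chess.PAWN: 1, chess.KNIGHT: 3, chess.BISHOP: 3,
         chess.ROOK: 5, chess.QUEEN: 9, chess.KING: 99}

def lowest_valued_attacker(board, color, square):
    """Return the piece type of color's cheapest attacker of `square`,
    or None. Unlike simple count/location features, this requires
    computing all attackers and minimizing over their values."""
    attackers = board.attackers(color, square)
    if not attackers:
        return None
    return min((board.piece_at(sq).piece_type for sq in attackers),
               key=VALUE.get)
```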
The main training isn't entirely from scratch, either, but makes use of random positions from expert games, implicitly labeled as balanced. It would be interesting to know how important that was, and what would happen if the system bootstrapped purely, generating typical positions by playing against itself.
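For comparison, a pure bootstrap would generate positions along these lines, with `pick_move` standing in for a search guided by the current network (hypothetical; the paper samples from expert games instead):

```python
import random
import chess

def selfplay_positions(pick_move, n_games=100, max_plies=80, rng=random.Random(0)):
    """Generate 'typical' positions by having the current engine play
    itself, rather than sampling from a database of expert games.
    `pick_move(board)` is a stand-in for network-guided move selection."""
    for _ in range(n_games):
        board = chess.Board()
        for _ in range(rng.randrange(10, max_plies)):
            if board.is_game_over():
                break
            board.push(pick_move(board))
        yield board.fen()
```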
It looks to me as if they did the following: