As someone who used convolutional neural networks in a Master's thesis, this doesn't surprise me. CNNs are especially well suited to problems with two-dimensional input where spatial information matters. I especially like that they were willing to go to 12 layers, as that fits well with some of my own research, which seemed to show that deeper networks were the more efficient route to good performance.
It's also quite interesting that they didn't use any pooling layers, a break from the traditional way CNNs are constructed, which usually alternates convolutional and pooling layers. I've actually been curious for some time about whether pooling layers are really necessary, or whether you could get away with convolutional layers alone: the convolutional layers seem to be the ones doing the real work, while the pooling layers always looked like just a convenient way to make the input for the next layer smaller and easier to handle.
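To make that concrete, here's a minimal sketch of the two designs (PyTorch, with toy channel counts of my own choosing, not the paper's actual architecture): a conv-only stack that keeps the full 19x19 board resolution at every layer, versus the traditional alternating conv/pool pattern that progressively shrinks it.

```python
import torch
import torch.nn as nn

# Conv-only: padding keeps every layer at 19x19, so exact stone
# positions survive all the way to the output.
conv_only = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.Conv2d(64, 1, kernel_size=3, padding=1),
)

# Traditional: alternating conv and pooling halves the resolution
# each time, which is fine for photos but discards board coordinates.
conv_pool = nn.Sequential(
    nn.Conv2d(1, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),  # 19x19 -> 9x9
    nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(),
    nn.MaxPool2d(2),  # 9x9 -> 4x4
)

board = torch.zeros(1, 1, 19, 19)
print(conv_only(board).shape)  # torch.Size([1, 1, 19, 19])
print(conv_pool(board).shape)  # torch.Size([1, 64, 4, 4])
```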
Regardless, I'm definitely happy to see CNNs making progress on a problem that seemed intractable not so long ago. Score one more for the neural nets!
Intuitively, it makes sense to me that pooling layers would be useful in image/visual applications, since downsampling an image gives a smaller image that still resembles the original. Downsampling a Go board, OTOH, gives nothing useful. (It's not devoid of information, but it gives up far more than a proportional amount of information compared with downsampling a photograph.)
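As a toy illustration of why (plain NumPy, my own example, not from the paper): a 2x2 max pool maps distinct board patches to the same output, so the exact stone coordinates are simply gone, whereas a downsampled photo still carries most of what the original did.

```python
import numpy as np

# Two 2x2 board patches: a stone in the corner vs. a stone on the
# diagonal. In Go these are very different local shapes.
corner   = np.array([[1, 0],
                     [0, 0]])
diagonal = np.array([[0, 0],
                     [0, 1]])

# A 2x2 max pool reduces each patch to a single number -- and both
# patches pool to the same value, so the distinction is lost.
print(corner.max(), diagonal.max())  # 1 1
```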
Interestingly, one of the authors, Amos Storkey, proposes to marry machine learning with information markets.
Ilya is awesome. He keeps breaking benchmarks in a way that leads me to predict he'll keep breaking benchmarks in the future...
Winning image-recognition competitions is pretty similar to this Go work (same basic neural net architecture), but he's also been cooking up things in natural language understanding and translation with "deep" LSTMs.
From the arXiv:
This approach looks like it could be combined with MCTS (a rough sketch of one such combination is at the end of this post). Here's their conclusion:
H/T: Ken Regan
Edit -- see also: Teaching Deep Convolutional Neural Networks to Play Go (also published to the arXiv in December 2014), and Why Neural Networks Look Set to Thrash the Best Human Go Players for the First Time (MIT Technology Review article)
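As for combining the move-prediction network with MCTS, here's a rough sketch of one obvious way to do it (my own illustrative Python, not anything from the paper): use the network's move probabilities as priors when selecting which child to explore, so the search spends its simulations on moves the network already considers plausible.

```python
import math

class Node:
    """A minimal MCTS node; `prior` is the CNN's probability for the move."""
    def __init__(self, prior):
        self.prior = prior       # CNN move probability
        self.visits = 0          # simulation count through this node
        self.value_sum = 0.0     # accumulated rollout/evaluation results
        self.children = []

def select_child(node, c=1.0):
    """UCB-style selection biased by the network prior: each child's
    exploration bonus is scaled by its prior, so high-probability moves
    get searched first but low-probability ones are never fully pruned."""
    total = sum(ch.visits for ch in node.children)
    def score(ch):
        q = ch.value_sum / ch.visits if ch.visits else 0.0
        u = c * ch.prior * math.sqrt(total + 1) / (1 + ch.visits)
        return q + u
    return max(node.children, key=score)
```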