'Yeah, we could maybe have AlphaGo learn everything totally from scratch and reach a superhuman level of knowledge just by playing itself, not using any human games for training material. Of course, reinventing everything that humanity has figured out while playing Go for the last 2,500 years, that's going to take quite a bit of time. Like a few months or so.'
Actually, the AlphaGo algorithm, this is something we’re going to try in the next few months — we think we could get rid of the supervised learning starting point and just do it completely from self-play, literally starting from nothing. It’d take longer, because the trial and error when you’re playing randomly would take longer to train, maybe a few months. But we think it’s possible to ground it all the way to pure learning.
http://www.theverge.com/2016/3/10/11192774/demis-hassabis-interview-alphago-google-deepmind-ai
There have been a couple of brief discussions of this in the Open Thread, but it seems likely to generate more, so here's a place for it.
The original paper in Nature about AlphaGo.
Google Asia Pacific blog, where results will be posted. DeepMind's YouTube channel, where the games are being live-streamed.
Discussion on Hacker News after AlphaGo won the first game.