Demis Hassabis has already announced that they'll be working on a Starcraft bot in some interview.
What is your preferred backup strategy for your digital life?
I meant that for AI we will possibly require high-level credit assignment, e.g. experiences of regret like "I should be more careful in these kinds of situations", or the realization that one particular strategy out of the entire sequence of moves worked out really nicely. Instead it penalizes/enforces all moves of one game equally, which is potentially a much slower learning process. It turns out playing Go can be solved without much structure for the credit assignment processes, hence I said the problem is non-existent, i.e. there wasn't even need to consider it and further our understanding of RL techniques.
"Nonexistent problems" was meant as a hyperbole to say that they weren't solved in interesting ways and are extremely simple in this setting because the states and rewards are noise-free. I am not sure what you mean by the second question. They just apply gradient descent on the entire history of moves of the current game such that expected reward is maximized.
Yes, but as I wrote above, the problems of credit assignment, reward delay and noise are non-existent in this setting, and hence their work does not contribute at all to solving AI.
I think what this result says is thus: "Any tasks humans can do, an AI can now learn to do better, given a sufficient source of training data."
Yes, but that would likely require an extremely large amount of training data because to prepare actions for many kind of situations you'd have an exponential blow up to cover many combinations of many possibilities, and hence the model would need to be huge as well. It also would require high-quality data sets with simple correction signals in order to work, which are expensive to produce.
I think, above all for building a real-time AI you need reuse of concepts so that abstractions can be recombined and adapted to new situations; and for concept-based predictions (reasoning) you need one-shot learning so that trains of thoughts can be memorized and built upon. In addition, the entire network needs to learn somehow to determine which parts of the network in the past were responsible for current reward signals which are delayed and noisy. If there is a simple and fast solutions to this, then AGI could be right around the corner. If not, it could take several decades of research.
I agree. I don't find this result to be any more or less indicative of near-term AI than Google's success on ImageNet in 2012. The algorithm learns to map positions to moves and values using CNNs, just as CNNs can be used to learn mappings from images to 350 classes of dog breeds and more. It turns out that Go really is a game about pattern recognition and that with a lot of data you can replicate the pattern detection for good moves in very supervised ways (one could call their reinforcement learning actually supervised because the nature of the problem gives you credit assignment for free).
Then which blogs do you agree with on the matter of the refugee crisis? (My intent is just to crowd-source some well-founded opinions because I'm lacking one.)
What are your thoughts on the refugee crisis?
Deutsch briefly summarized his view on AI risks in this podcast episode: https://youtu.be/J21QuHrIqXg?t=3450 (Unfortunately there is no transcript.)
What are your thoughts on his views apart from what you've touched upon above?