Demis Hassabis has already said in some interview that they'll be working on a StarCraft bot.
This interview, dated yesterday, doesn't go quite that far: he mentions StarCraft as a possibility, but explicitly says that they won't necessarily pursue it.
...If the series continues this way with AlphaGo winning, what’s next — is there potential for another AI-vs-game showdown in the future?
I think for perfect information games, Go is the pinnacle. Certainly there are still other top Go players to play. There are other games — no-limit poker is very difficult, multiplayer has its challenges because it’s an imperfect information game. And then there are
What is your preferred backup strategy for your digital life?
I meant that for AI we will possibly require high-level credit assignment, e.g. experiences of regret like "I should be more careful in these kinds of situations", or the realization that one particular strategy out of the entire sequence of moves worked out really nicely. Instead, it penalizes/reinforces all moves of one game equally, which is potentially a much slower learning process. It turns out Go can be played well without much structure in the credit-assignment process, hence I said the problem is non-existent, i.e. there wasn't even a need to consider it and thereby further our understanding of RL techniques.
"Nonexistent problems" was meant as a hyperbole to say that they weren't solved in interesting ways and are extremely simple in this setting because the states and rewards are noise-free. I am not sure what you mean by the second question. They just apply gradient descent on the entire history of moves of the current game such that expected reward is maximized.
Yes, but as I wrote above, the problems of credit assignment, reward delay and noise are non-existent in this setting, and hence their work does not contribute at all to solving AI.
I think what this result says is this: "Any task humans can do, an AI can now learn to do better, given a sufficient source of training data."
Yes, but that would likely require an extremely large amount of training data, because to prepare actions for many kinds of situations you'd face an exponential blow-up covering the many combinations of many possibilities, and hence the model would need to be huge as well. It would also require high-quality data sets with simple correction signals in order to work, which are expensive to produce.
I think, abov...
I agree. I don't find this result to be any more or less indicative of near-term AI than Google's success on ImageNet in 2012. The algorithm learns to map positions to moves and values using CNNs, just as CNNs can be used to learn mappings from images to 350 classes of dog breeds and more. It turns out that Go really is a game about pattern recognition and that with a lot of data you can replicate the pattern detection for good moves in very supervised ways (one could call their reinforcement learning actually supervised because the nature of the problem gives you credit assignment for free).
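As an illustration of the "positions to moves" mapping described above, here is a minimal sketch of a policy-style CNN (a hypothetical toy in PyTorch, not the actual AlphaGo architecture; the plane count, channel width, and dummy data are assumptions): supervised training on expert moves is then just classification over the 361 board points.

```python
import torch
import torch.nn as nn

class ToyPolicyNet(nn.Module):
    """Maps a 19x19 board, encoded as a few feature planes, to one logit per point."""
    def __init__(self, in_planes=3, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(in_planes, channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, kernel_size=1),        # one logit per board point
        )

    def forward(self, board_planes):
        return self.body(board_planes).flatten(1)         # (batch, 19 * 19) move logits

net = ToyPolicyNet()
boards = torch.zeros(8, 3, 19, 19)                        # dummy batch of positions
expert_moves = torch.randint(0, 361, (8,))                # dummy expert move indices
loss = nn.functional.cross_entropy(net(boards), expert_moves)  # plain supervised learning
loss.backward()
move_probs = torch.softmax(net(boards), dim=1)            # distribution over moves
```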
Then which blogs do you agree with on the matter of the refugee crisis? (My intent is just to crowd-source some well-founded opinions because I'm lacking one.)
What are your thoughts on the refugee crisis?
There's a whole blogosphere out there, much of it political. Any of those blogs would be a better place to talk about it than LW.
Just speaking of weaknesses of the paperclip maximizer thought experiment. I've seen this misunderstanding in at least 4 out of 10 instances where the thought experiment was brought up.
I think many people intuitively distrust the idea that an AI could be intelligent enough to transform matter into paperclips in creative ways, but 'not intelligent enough' to understand its goals in a human and cultural context (i.e. to satisfy the needs of the business owners of the paperclip factory). This is often due to the confusion that the paperclip maximizer would get its goal function from parsing the sentence "make paperclips", rather than from a preprogrammed reward function, for example a CNN that is trained to map the number of paperclips in images to a scalar reward.
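To illustrate that last distinction, here is a deliberately silly sketch (hypothetical, in PyTorch; the architecture and image size are made up): the agent's "goal" is nothing more than the scalar output of a preprogrammed reward model, and the sentence "make paperclips" never enters the picture.

```python
import torch
import torch.nn as nn

paperclip_counter = nn.Sequential(            # CNN regressing an image to a scalar
    nn.Conv2d(3, 16, kernel_size=5, stride=2), nn.ReLU(),
    nn.Conv2d(16, 32, kernel_size=5, stride=2), nn.ReLU(),
    nn.AdaptiveAvgPool2d(1), nn.Flatten(),
    nn.Linear(32, 1),
)

def reward(camera_image):
    """The agent maximizes this number; it has no access to the designers' intent."""
    return paperclip_counter(camera_image).squeeze(-1)

frame = torch.rand(1, 3, 64, 64)              # dummy camera frame
print(reward(frame))                          # the only "goal" the agent ever sees
```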
I think the problem here is the way the utility function is chosen. Utilitarianism is essentially a formalization of reward signals in our heads. It is a heuristic way of quantifying what we expect a healthy human (one that can grow up and survive in a typical human environment and has an accurate model of reality) to want. All of this converges only roughly to a common utility, because we have evolved to have the same needs, which are necessarily pro-life and pro-social (since otherwise our species wouldn't be present today).
Utilitarianism crudely abstract...
Why does E. Yudkowsky voice such strong priors, e.g. w.r.t. the laws of physics (the many-worlds interpretation), when much weaker priors seem sufficient for most of his beliefs (e.g. weak computationalism/computational monism) and wouldn't make him so vulnerable? (By vulnerable I mean that his work often gets ripped apart as cultish pseudoscience.)
My model of him has him having an attitude of "if I think that there's a reason to be highly confident of X, then I'm not going to hide what's true just for the sake of playing social games".
You seem to assume that MWI makes the Sequences more vulnerable; i.e. that there are people who feel okay with the rest of the Sequences, but MWI makes them dismiss it as pseudoscience.
I think there are other things that rub people the wrong way (e.g. that EY in general talks about some topics more than is appropriate for his status, whether it's science, philosophy, politics, or religion), and MWI is merely the most convenient point of attack (at least among those people who don't care about religion). Without MWI, something else would be "the most cont...
Because he was building a tribe. (He's done now).
edit: This should actually worry people a lot more than it seems to.
I would love to see some hard data about the correlation between public interest in science and its degree of 'cult status' vs. 'open science'.
I mean "only a meme" in the sense, that morality is not absolute, but an individual choice. Of course, there can be arguments why some memes are better than others, that happens during the act of individuals convincing each other of their preferences.
Is it? I think the act of convincing other people of your preferred state of the world is exactly what justifying morality is. But that action policy is only a meme, as you said, which is individually chosen based on many criteria (including aesthetics, peer pressure, and consistency).
Moral philosophy is a huge topic, and its discourse is not dominated by looking at DNA.
Everyone can choose their preferred state then, at least to the extent that it is not indoctrinated or biologically determined. It is rational to invest energy into maintaining or achieving this state (because the state presumably provides you with a steady source of reward), which might involve convincing others of your preferred state or preventing them from threatening it (e.g. by putting them in jail). There is likely an absolute truth (to the extent physics is consistent...
What are the implications of that for how we decide what the right things to do are?
Because then it would argue from features that are built into us. If we can prove the existence of these features with high certainty, then it could perhaps serve as guidance for our decisions.
On the other hand, it is reasonable that evolution does not create such goals because it is an undirected process. Our actions are unrestricted in this regard, and we must only bear the consequences of the system that our species has come up with. What is good is thus decided by consensus. Still, the values we have converged to are shaped by the way we have evolved to behave (e.g. empathy and pain avoidance).
More about why doing it is desirable at all. Is it a matter of the culture that currently exists? I mean, is it 'right' to eradicate a certain ethnic group if the majority endorses it?
What is the motivation behind maximizing QALYs? Does it require certain incentives to be present in the culture (endorsement of altruism), or is it rooted elsewhere?
I mean a moral terminal goal. But I guess we would be a large step closer to a solution of the control problem if we could specify such a goal.
What I had in mind is something like this: evolution has provided us with a state that is preferred by everyone who is healthy (i.e. who can, with high probability, survive in a typical situation in which humans have evolved) and who has an accurate mental representation of reality. That state includes being surrounded by other healthy humans, so by induction everyone must reach this state (and also help others to reach it). I haven't carefully thought this through, but I just want to give an idea of what I'm looking for.
Is there a biological basis that explains why utilitarianism and the preservation of our species should motivate our actions? Or is it a purely selfish consideration: I feel well when others in my social environment feel well (and is it therefore even dependent on consensus)?
Is that actually the 'strange loop' that Hofstadter writes about?
Here they found dopamine to encode superposed error signals about actual and counterfactual reward:
http://www.pnas.org/content/early/2015/11/18/1513619112.abstract
Could that be related to priors and likelihoods?
Significance
...There is an abundance of circumstantial evidence (primarily work in nonhuman animal models) suggesting that dopamine transients serve as experience-dependent learning signals. This report establishes, to our knowledge, the first direct demonstration that subsecond fluctuations in dopamine concentration in the human striatum combin
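Here is a toy illustration of what I take "superposed error signals" to mean, my own simplification and not the model fitted in the paper (the weights and the two-option setup are invented): the measured signal would mix a classic prediction error about the obtained reward with a comparison against the foregone, counterfactual reward.

```python
def superposed_rpe(obtained, expected, foregone, w_actual=1.0, w_counterfactual=0.5):
    """Weighted mix of an actual and a counterfactual reward prediction error."""
    actual_rpe = obtained - expected            # classic "better/worse than predicted"
    counterfactual_rpe = obtained - foregone    # "better/worse than the unchosen option"
    return w_actual * actual_rpe + w_counterfactual * counterfactual_rpe

# Example: expected +2, got +1, the unchosen option would have paid +3:
print(superposed_rpe(obtained=1.0, expected=2.0, foregone=3.0))   # -2.0, a negative signal
```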
Some helpful links I've collected over the years:
Do Bayesians strongly believe that Bayes' theorem accurately describes how the brain changes its latent variables in the face of new data? It seems very unlikely to me that the brain keeps track of probability distributions and that they sum up to one. How do Bayesians believe this works at the neuronal level?
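For reference, the claim being questioned is just this kind of computation; a minimal sketch of an exact discrete update in Python (the rain example is made up), where the posterior is explicitly renormalized so that it sums to one. Whether neurons implement anything like it, even approximately (e.g. via unnormalized log-odds or sampling), is exactly the open question.

```python
def bayes_update(prior, likelihood):
    """prior: {hypothesis: P(h)}, likelihood: {hypothesis: P(data | h)} -> posterior."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    z = sum(unnormalized.values())              # normalization constant P(data)
    return {h: p / z for h, p in unnormalized.items()}

prior = {"rain": 0.2, "no_rain": 0.8}
likelihood = {"rain": 0.9, "no_rain": 0.3}      # P(wet pavement | hypothesis)
posterior = bayes_update(prior, likelihood)
print(posterior, sum(posterior.values()))       # the posterior sums to 1.0
```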
Ok, so the motivation is to learn templates with which to do correlation at each image location. But where would you get the idea to do the same thing again with the correlation map? That seems non-obvious to me. Or do you mean biological vision?
I find CNNs a lot less intuitive than RNNs. In what context was training many filters and successively applying pooling and further filters to smaller versions of the output an intuitive idea?
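For what it's worth, here is the "template matching" intuition in a small numpy sketch (my own toy, written for clarity rather than efficiency; the templates and image are random): the first layer cross-correlates a template with the image, pooling shrinks the resulting match map, and the second layer then correlates a template with that map, i.e. it looks for patterns of matches.

```python
import numpy as np

def correlate2d(image, template):
    """Valid cross-correlation: the template's response at each image location."""
    th, tw = template.shape
    h, w = image.shape
    out = np.empty((h - th + 1, w - tw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + th, j:j + tw] * template)
    return out

def max_pool(x, k=2):
    h, w = (x.shape[0] // k) * k, (x.shape[1] // k) * k
    return x[:h, :w].reshape(h // k, k, w // k, k).max(axis=(1, 3))

image = np.random.rand(16, 16)
edge_template = np.array([[1.0, -1.0], [1.0, -1.0]])          # a first-layer "feature"
match_map = np.maximum(correlate2d(image, edge_template), 0)  # correlation + ReLU
pooled = max_pool(match_map)                                  # smaller correlation map
second_template = np.random.rand(3, 3)                        # a second-layer "feature"
second_map = correlate2d(pooled, second_template)             # correlate the correlations
```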
Could one say that the human brain works best if it is slightly optimistically biased, just enough to get the benefits of the neuromodulation that accompanies positive thinking, but not so much that false expectations have a significant potential to severely disappoint you? Are there any recommended sequences/articles/papers on this matter?
Perhaps the conditions that cause the Fermi paradox are actually crucial for life. If spaceflight were easy, all resources would have been exhausted by exponential growth pretty quickly. This would invalidate the 'big distances' point as evidence for a non-streamlined universe, though.
If we are in a simulation, why isn't the simulation more streamlined? I have a couple of examples of what I mean:
It seems that our simulation hosts would need to have ac...
A quite good Omega Tau interview on the failure modes of megaprojects: http://omegataupodcast.net/2015/09/181-why-megaprojects-fail-and-what-to-do-about-it/
Happy Longevity Day!
I would say be flexible, as some topics are much more complex than others. I've found that most summaries on this list have a good length.
Perhaps you can revive one of these study groups: https://www.reddit.com/subreddits/search?q=spivak
Cross-posting to all of them might reach some people who are interested.
This Baby Rudin group is currently active: https://www.reddit.com/r/babyrudin/
Deutsch briefly summarized his view on AI risks in this podcast episode: https://youtu.be/J21QuHrIqXg?t=3450 (Unfortunately there is no transcript.)
What are your thoughts on his views apart from what you've touched upon above?