
Comment author: Vaniver 23 March 2016 02:39:48PM *  9 points [-]

I wish I had read the ending first; Hall relies heavily on Deutsch to make his case. Deutsch has come up on LW before, most relevantly here. An earlier comment of mine still seems true: Deutsch is pointing in the right direction and diagnosing the correct problems, but he underestimates the degree to which other people have diagnosed the same problems and are already working on solutions.

Hall's critique is in multiple parts, so I'm writing my response part by part. Horizontal lines distinguish the breaks, like so:

---

It starts off with reasoning by analogy, which is generally somewhat suspect. In this particular analogy, you have two camps:

  1. Builders, who build ever-higher towers, hoping that they will one day achieve flight (though they don't know how that will work theoretically).

  2. Theorists, who think that they're missing something, maybe to do with air, and that the worries the builders have about spontaneous liftoff don't make sense, because height doesn't have anything to do with flight.

But note that when it comes to AI, the dividing lines are different. Bostrom gets flak for not knowing the details of modern optimization and machine learning techniques (and I think that flak is well-targeted), but he is fundamentally concerned with theoretical issues. It's the builders--the Ngs of the world who focus on adding another layer to their tower--who think that things will just work out okay, instead of putting effort into ensuring that they do.

That is, the x-risk argument is the combination of a few pieces of theory: the Orthogonality Thesis, the universality of intelligence (that intelligence can be implemented in silicon), and the claim that there are no hard limits on intelligence anywhere near the human level.


---

One paragraph, two paragraphs, three paragraphs... when are we going to get to the substance?

Okay, five paragraphs in, we get the claim that "Bayesian reasoning" is an error. Why? Supposedly he'll tell us later.

The last paragraph is good, as a statement of the universality of computation.


---

And the first paragraph is one of the core disagreements. Hall correctly diagnoses that we don't understand human thought at the level needed to program it. (Once we do, we have AGI, and we don't have AGI yet.) But Hall then seems to claim that, basically, unless we're already there we won't know when we'll get there. Which is true but immaterial; right now we can estimate when we'll get there, and that estimate determines how we should approach the problem.

And then the latter half of this section is just bad. There's some breakdown in communication between Bostrom and Hall; Bostrom's argument, as I understand it, is not that you get enough hardware and then the intelligence problem solves itself. (This is the "the network has become self-aware!" sci-fi model of AGI creation.) The argument is that there's some algorithmic breakthrough necessary to get to AGI, but that the more hardware you have, the smaller that breakthrough is.

(That is, suppose the root of intelligence was calculating matrix determinants. There are slow ways and fast ways to do that--if you have huge amounts of hardware, coming across Laplace Expansion is enough, but if you have small amounts of hardware, you can squeak by only if you have fast matrix multiplication.)
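
To make that concrete, here's a minimal sketch (my own illustration, not anything from Bostrom or Hall) contrasting a brute-force determinant with the cheap algorithmic improvement:

```python
import numpy as np

def det_laplace(m):
    """Laplace (cofactor) expansion: O(n!). With enough hardware,
    stumbling on even this crude method would be 'enough'."""
    n = len(m)
    if n == 1:
        return m[0][0]
    return sum((-1) ** j * m[0][j] *
               det_laplace([row[:j] + row[j + 1:] for row in m[1:]])
               for j in range(n))

def det_elimination(m):
    """Gaussian elimination: O(n^3). The 'breakthrough' that lets
    modest hardware solve the same problem."""
    a = np.array(m, dtype=float)
    n, det = len(a), 1.0
    for k in range(n):
        pivot = k + int(np.argmax(np.abs(a[k:, k])))  # partial pivoting
        if a[pivot, k] == 0:
            return 0.0
        if pivot != k:
            a[[k, pivot]] = a[[pivot, k]]  # row swap flips the sign
            det = -det
        det *= a[k, k]
        a[k + 1:, k:] -= np.outer(a[k + 1:, k] / a[k, k], a[k, k:])
    return det
```

The bigger the hardware budget, the cruder the algorithm you can get away with; the smaller the budget, the more algorithmic breakthroughs you need before the problem yields.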

One major point of contention between AI experts is, basically, how many software breakthroughs we have left until AGI. It could be the case that it's two; it could be the case that it's twenty. If it's two, then we expect it to happen fairly quickly; if it's twenty, then we expect it to happen fairly slowly. This uncertainty means we cannot rule out it happening quickly.

The claim that programs do not engage in creativity and criticism is simply wrong. This is the heart and soul of numerical optimization, and of metaheuristic programs in particular. Programs are creative and critical beyond the abilities of humans in the narrow domains that we've been able to communicate to those programs, but the fundamental math of creativity and criticism exists (in terms of sampling from a solution space, especially in ways that make use of solutions we've already considered, and of objective functions that evaluate those solutions; see the sketch below). The question is how easily we will be able to scale from well-defined problems (like routing trucks or playing Go) to poorly-defined problems (like planning marketing campaigns or international diplomacy).
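
Here's a toy version of that sample-and-evaluate loop (a generic simulated-annealing sketch of my own, not code from anyone discussed here): proposing variants of known solutions is the "creativity", and scoring them with an objective function is the "criticism".

```python
import math
import random

def anneal(objective, neighbor, x0, iters=10_000, temp0=1.0):
    """Minimize `objective` by proposing variants of known solutions
    (creativity) and scoring them (criticism)."""
    x, fx = x0, objective(x0)
    best, fbest = x, fx
    for i in range(iters):
        temp = temp0 * (1 - i / iters)  # cooling schedule
        cand = neighbor(x)              # propose a variant of a known solution
        fc = objective(cand)            # evaluate ("criticize") it
        # Always accept improvements; sometimes accept regressions,
        # so the search can escape local optima.
        if fc < fx or random.random() < math.exp((fx - fc) / max(temp, 1e-9)):
            x, fx = cand, fc
            if fc < fbest:
                best, fbest = cand, fc
    return best, fbest

# Toy usage: minimize a bumpy one-dimensional function.
f = lambda x: x * x + 3 * math.sin(5 * x)
print(anneal(f, lambda x: x + random.gauss(0, 0.5), x0=4.0))
```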


---

Part 4 is little beyond "I disagree with the Orthogonality Thesis." That is, it treats value disagreements as irrationality. Bonus points for declaring Bayesian reasoning false for little reason that I can see besides "Deutsch disagrees with it" (which, I suspect, is due to Deutsch's low familiarity with the math of causal models, which I think are the very solution he correctly senses is missing from EDT-ish Bayesian reasoning).


---

Not seeing anything worth commenting on in part 5.


---

Part 6 includes a misunderstanding of Arrow's Theorem. (Arrow's Theorem is a no-go theorem, but it doesn't rule out the thing Hall thinks it rules out. It only covers deterministic aggregation of ordinal preferences; if the AI is allowed to, say, flip a coin when it's indifferent, Arrow's Theorem no longer applies.)

Comment author: RaelwayScot 25 March 2016 01:13:24AM 0 points [-]

Deutsch briefly summarized his view on AI risks in this podcast episode: https://youtu.be/J21QuHrIqXg?t=3450 (Unfortunately there is no transcript.)

What are your thoughts on his views apart from what you've touched upon above?

Comment author: RaelwayScot 10 March 2016 10:46:47PM *  8 points [-]

Demis Hassabis has already announced in an interview that they'll be working on a StarCraft bot.

Comment author: RaelwayScot 23 February 2016 12:59:22PM *  1 point [-]

What is your preferred backup strategy for your digital life?

Comment author: V_V 28 January 2016 08:32:31PM *  0 points [-]

Reward delay is not very significant in this task: the task is episodic and fully observable, and there is no time preference, so you can just play a game to completion without updating and then assign the final reward to all the positions.

In more general reinforcement learning settings, where you want to update your policy during the execution, you have to use some kind of temporal difference learning method, which is further complicated if the world states are not fully observable.
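
As a rough illustration (my sketch, not V_V's code), the episodic shortcut versus the online temporal-difference update look like this:

```python
def mc_update(V, visited_states, final_reward, lr=0.1):
    """Episodic Monte Carlo: play to completion, then push every
    visited state's value toward the single final reward."""
    for s in visited_states:
        V[s] += lr * (final_reward - V[s])

def td0_update(V, s, r, s_next, lr=0.1, gamma=0.99):
    """TD(0): bootstrapped one-step update, usable mid-episode
    when you want to improve the policy during execution."""
    V[s] += lr * (r + gamma * V[s_next] - V[s])
```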

Credit assignment is taken care of by backpropagation, as usual in neural networks. I don't know why RaelwayScot brought it up, unless they meant something else.

Comment author: RaelwayScot 28 January 2016 11:20:24PM 3 points [-]

I meant that for AI we will possibly require high-level credit assignment, e.g. experiences of regret like "I should be more careful in these kinds of situations", or the realization that one particular strategy out of the entire sequence of moves worked out really nicely. Instead it penalizes/reinforces all moves of one game equally, which is potentially a much slower learning process. It turns out playing Go can be solved without much structure in the credit-assignment process, hence I said the problem is non-existent, i.e. there wasn't even a need to consider it and thereby further our understanding of RL techniques.

Comment author: Vaniver 28 January 2016 07:31:11PM 1 point [-]

Credit assignment and reward delay are nonexistent? What do you think happens when one diffs the board strength of two potential boards?

Comment author: RaelwayScot 28 January 2016 08:39:37PM *  0 points [-]

"Nonexistent problems" was meant as a hyperbole to say that they weren't solved in interesting ways and are extremely simple in this setting because the states and rewards are noise-free. I am not sure what you mean by the second question. They just apply gradient descent on the entire history of moves of the current game such that expected reward is maximized.

Comment author: bogus 28 January 2016 06:47:57PM *  1 point [-]

> In addition, the entire network needs to learn somehow to determine which parts of the network in the past were responsible for current reward signals which are delayed and noisy.

This is a well-known problem: the temporal credit-assignment problem that reinforcement learning deals with. It is a significant component in the reported results. (What happens in practice is that a network's ability to assign "credit" or "blame" for reward signals falls off exponentially with increasing delay; see the sketch below. This is a significant limitation, but reinforcement learning is nevertheless very helpful given tight feedback loops.)
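
That exponential falloff is usually implemented with eligibility traces; here's a minimal sketch of my own (tabular values for simplicity, rather than a neural network):

```python
import numpy as np

def td_lambda_episode(V, episode, lr=0.1, gamma=0.99, lam=0.9):
    """TD(lambda) with accumulating traces: a state's share of the
    credit for a reward decays by gamma*lam per step of delay."""
    e = np.zeros_like(V)
    for s, r, s_next in episode:   # (state, reward, next_state) triples
        delta = r + gamma * V[s_next] - V[s]
        e[s] += 1.0                # mark the just-visited state
        V += lr * delta * e        # credit every recently visited state
        e *= gamma * lam           # exponential decay with delay
    return V
```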

Comment author: RaelwayScot 28 January 2016 07:16:56PM 0 points [-]

Yes, but as I wrote above, the problems of credit assignment, reward delay and noise are non-existent in this setting, and hence their work does not contribute at all to solving AI.

Comment author: moridinamael 28 January 2016 03:30:48PM 2 points [-]

I think what this result says is this: "Any tasks humans can do, an AI can now learn to do better, given a sufficient source of training data."

Games lend themselves to auto-generation of training data, in the sense that the AI can at the very least play against itself. No matter how complex the game, a deep neural net will find the structure in it, and find a deeper structure than human players can find.

We have now answered the question "Are deep neural nets going to be sufficient to match or exceed task-specific human performance at any well-specified task?" with "Yes, they can, and they can do it better and faster than we suspected." The next hurdle--which all the major companies are working on--is to create architectures that can find structure in smaller datasets, less well-tailored training data, and less well-specified tasks.

Comment author: RaelwayScot 28 January 2016 06:15:43PM *  1 point [-]

> I think what this result says is this: "Any tasks humans can do, an AI can now learn to do better, given a sufficient source of training data."

Yes, but that would likely require an extremely large amount of training data: to prepare actions for many kinds of situations, you'd face an exponential blow-up in the combinations of possibilities to cover, and hence the model would need to be huge as well. It would also require high-quality data sets with simple correction signals in order to work, and those are expensive to produce.

I think, above all, for building a real-time AI you need reuse of concepts, so that abstractions can be recombined and adapted to new situations; and for concept-based prediction (reasoning) you need one-shot learning, so that trains of thought can be memorized and built upon. In addition, the entire network needs to learn somehow to determine which parts of the network in the past were responsible for current reward signals which are delayed and noisy. If there are simple and fast solutions to these problems, then AGI could be right around the corner. If not, it could take several decades of research.

Comment author: V_V 27 January 2016 11:56:25PM *  19 points [-]

His argument proves too much.

You could easily transpose it for the time when Checkers or Chess programs beat professional players: back then the "keystone, foundational aspect" of intelligence was thought to be the ability to do combinatorial search in large solution spaces, and scaling up to AGI was "just" a matter of engineering better heuristics. Sure, it didn't work on Go yet, but Go players were not using a different cortical algorithm than Chess players, were they?

Or you could transpose it for the time when MCTS Go programs reached "dan" (advanced amateur) level. They still couldn't beat professional players, but professional players were not using a different cortical algorithm than advanced amateur players, were they?

AlphaGo succeeded at the current achievement by using artificial neural networks in a regime where they are known to do well. But this regime, and games like Go, Chess, Checkers, Othello, etc., represent a small part of the range of human cognitive tasks. In fact, we probably find this kind of board game fascinating precisely because it is very different from the usual cognitive stimuli we deal with in everyday life.

It's tempting to assume that the "keystone, foundational aspect" of intelligence is learning essentially the same way that artificial neural networks learn. But humans can do things like "one-shot" learning, learning from weak supervision, learning in non-stationary environments, etc., which no current neural network can do, and not just as a matter of scale or architectural "details". Researchers generally don't know how to make neural networks, or really any other kind of machine learning algorithm, do these things, except with massive task-specific engineering. Thus I think it's fair to say that we still don't know what the foundational aspects of intelligence are.

Comment author: RaelwayScot 28 January 2016 02:36:35PM 1 point [-]

I agree. I don't find this result to be any more or less indicative of near-term AI than Google's success on ImageNet in 2012. The algorithm learns to map positions to moves and values using CNNs, just as CNNs can be used to learn mappings from images to 350 classes of dog breeds and more. It turns out that Go really is a game about pattern recognition and that with a lot of data you can replicate the pattern detection for good moves in very supervised ways (one could call their reinforcement learning actually supervised because the nature of the problem gives you credit assignment for free).
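
For concreteness, the position-to-move mapping is the sort of thing a small convolutional policy network expresses; this is a toy sketch of my own (PyTorch, nothing like the full AlphaGo architecture):

```python
import torch.nn as nn

class TinyPolicyNet(nn.Module):
    """Map a 19x19 board (a few feature planes) to a distribution
    over the 361 possible moves."""
    def __init__(self, planes=3, channels=32):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv2d(planes, channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, channels, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv2d(channels, 1, kernel_size=1),  # one logit per board point
        )

    def forward(self, board):                  # board: (batch, planes, 19, 19)
        logits = self.body(board).flatten(1)   # (batch, 361)
        return logits.log_softmax(dim=1)       # log-probabilities over moves
```

Trained with cross-entropy on expert moves, this is exactly the "pattern recognition" framing: classification from board images to move classes.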

Comment author: fubarobfusco 26 January 2016 08:52:25PM 6 points [-]

There's a whole -osphere full of blogs out there, many of them political. Any of those would be better places to talk about it than LW.

Comment author: RaelwayScot 26 January 2016 09:02:49PM 3 points [-]

Then which blogs do you agree with on the matter of the refugee crisis? (My intent is just to crowd-source some well-founded opinions because I'm lacking one.)

Comment author: RaelwayScot 26 January 2016 08:12:33PM 1 point [-]

What are your thoughts on the refugee crisis?
