SilentCal comments on [link] New essay summarizing some of my latest thoughts on AI safety - Less Wrong Discussion

14 Post author: Kaj_Sotala 01 November 2015 08:07AM

Comment author: jsteinhardt 03 November 2015 05:07:33PM 3 points [-]

Thanks for writing this; a couple quick thoughts:

For example, it turns out that a learning algorithm tasked with some relatively simple tasks, such as determining whether or not English sentences are valid, will automatically build up an internal representation of the world which captures many of the regularities of the world – as a pure side effect of carrying out its task.

I think I've yet to see a paper that convincingly supports the claim that neural nets are learning natural representations of the world. For some papers that refute this claim, see e.g.

http://arxiv.org/abs/1312.6199
http://arxiv.org/abs/1412.6572
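The second paper above (arXiv:1412.6572) introduces the fast gradient sign method for constructing adversarial examples. A minimal sketch of the idea, using a made-up linear classifier on random data rather than a real image model:

```python
import numpy as np

# Toy fast-gradient-sign perturbation on a hand-set logistic classifier.
# The weights and input are invented for illustration only.

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def fgsm_perturb(x, w, b, y, eps):
    """Move x by eps in the sign of the loss gradient w.r.t. the input.

    For logistic loss with p = sigmoid(w.x + b), the input gradient is
    dL/dx = (p - y) * w, so we step by eps * sign((p - y) * w).
    """
    p = sigmoid(w @ x + b)
    grad_x = (p - y) * w
    return x + eps * np.sign(grad_x)

rng = np.random.default_rng(0)
w = rng.normal(size=100)           # classifier weights
b = 0.0
x = rng.normal(size=100) * 0.1     # an input the classifier currently labels
y = 1.0 if sigmoid(w @ x + b) >= 0.5 else 0.0  # treat its own label as truth

x_adv = fgsm_perturb(x, w, b, y, eps=0.25)
print(sigmoid(w @ x + b), sigmoid(w @ x_adv + b))
```

Each coordinate moves by only 0.25, but because every move lines up with the sign of the corresponding weight, the logit shifts by roughly eps times the sum of the weight magnitudes, which is enough to flip the label. That is the papers' point: a perturbation that is tiny per pixel can be large in the direction the model cares about.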

I think the Degrees of Freedom thesis is a good statement of one of the potential problems. Since it's essentially making a claim about whether a certain very complex statistical problem is identifiable, I think it's very hard to know whether it's true or not without either some serious technical analysis or some serious empirical research --- which is a reason to do that research, because if the thesis is true then that has some worrisome implications about AI safety.

Comment author: SilentCal 09 November 2015 07:19:42PM 2 points [-]

http://rocknrollnerd.github.io/ml/2015/05/27/leopard-sofa.html is also relevant--tl;dr Google Photos classifies a leopard-print sofa as a leopard. I think this lends credence to the 'treacherous turn' insofar as it's an example of a classifier seeming to perform well and breaking down in edge cases.

Comment author: jacob_cannell 09 November 2015 07:45:02PM *  4 points [-]

The classifier isn't breaking down - it was trained to do well across the entire training set using a small amount of computation for each inference and a reasonable (larger) amount of computation for training.

Humans' fastest recognition capability still takes 100 ms or so, and operating in that mode (rapid visual presentation), human inference accuracy is considerably less capable than modern ANNs - which classify using less time and also around 1000x fewer neurons/synapses.

I would bet that humans often make similar mistakes in fast recognition. And even if humans don't make this specific mistake, it doesn't matter because they make more total mistakes in other categories.

The fact that humans can do better given considerably more time and enormously more neural resources is hardly surprising (it involves more complex multi-step inference).

Also, the ImageNet training criterion is not really a good match for human visual intuitions. It assigns the same penalty for mistaking a dog for a cat as it does for mistaking two closely related species of dogs. Humans have a more sensible hierarchical error allocation. This may be relatively easy to improve, low-hanging fruit for ANNs, not sure - but someone is probably working on that if it hasn't already been done.
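The hierarchical-error idea above can be made concrete with a tiny sketch. The taxonomy and class names here are invented for illustration, not taken from ImageNet: a flat 0/1 penalty scores "beagle vs. labrador" and "beagle vs. tabby" identically, while a tree-distance penalty reflects that the first mistake is milder.

```python
# Hypothetical class taxonomy; parent of each class in a tiny tree.
PARENT = {
    "beagle": "dog", "labrador": "dog",
    "tabby": "cat",
    "dog": "animal", "cat": "animal", "animal": None,
}

def tree_distance(a, b):
    """Edges between two classes in the taxonomy (a hierarchy-aware penalty)."""
    def ancestors(c):
        chain = []
        while c is not None:
            chain.append(c)
            c = PARENT[c]
        return chain
    pa, pb = ancestors(a), ancestors(b)
    common = set(pa) & set(pb)
    # steps from each class up to the lowest common ancestor
    da = next(i for i, c in enumerate(pa) if c in common)
    db = next(i for i, c in enumerate(pb) if c in common)
    return da + db

# Flat 0/1 loss treats both errors the same:
print(int("beagle" != "labrador"), int("beagle" != "tabby"))   # prints: 1 1
# Tree distance distinguishes them:
print(tree_distance("beagle", "labrador"), tree_distance("beagle", "tabby"))  # prints: 2 4
```

A hierarchy-aware training objective would weight misclassifications by something like this distance instead of penalizing all of them equally.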

Comment author: jsteinhardt 10 November 2015 03:53:59AM 1 point [-]

Humans' fastest recognition capability still takes 100 ms or so, and operating in that mode (rapid visual presentation), human inference accuracy is considerably less capable than modern ANNs.

This doesn't seem right, assuming that "considerably less capable" means "considerably worse accuracy at classifying objects not drawn from ImageNet". Do you have a study in mind that shows this? In either case, I don't think this is strong enough to support the claim that the classifier isn't breaking down --- it's pretty clearly making mistakes where humans would find the answer obvious. I don't think that saying that the ANN answers more quickly is a very strong defense.

Comment author: jacob_cannell 10 November 2015 04:47:56AM *  3 points [-]

Do you have a study in mind that shows this?

Comparing different recognition systems is complex, and it's important to compare apples to apples. CNNs are comparable only to rapid feedforward recognition in the visual system which can be measured with rapid serial presentation experiments. In an untimed test the human brain can use other modules, memory fetches, multi-step logical inferences, etc (all of which are now making their way into ANN systems, but still).

The RSP setup ensures that the brain can only use a single feedforward pass from V1 to PFC, without using more complex feedback and recurrent loops. It forces the brain to use a network configuration similar to what current CNNs use - CNNs descend from models of that pathway, after all.

In those tests, CNNs from 2013 rivaled primate IT cortex representations [1], and 2015 CNNs are even better.

That paper uses a special categorization task with monkeys, but the results generalize to humans as well. There are certainly some mistakes that a CNN will make which a human would not make even with the 150ms time constraint, but the CNNs make fewer mistakes on the more complex tasks with lots of categories, whereas humans presumably still have lower error on basic recognition tasks (though to some extent that is because researchers haven't focused much on getting to > 99.9% accuracy on simpler recognition tasks).

Comment author: jsteinhardt 10 November 2015 08:37:52AM 2 points [-]

Cool, thanks for the paper, interesting read!

Comment author: Lumifer 09 November 2015 07:35:26PM 1 point [-]

and breaking down in edge cases

Except that from a human point of view a leopard-print sofa isn't an edge case at all.