http://rocknrollnerd.github.io/ml/2015/05/27/leopard-sofa.html is also relevant. tl;dr: Google Photos classifies a leopard-print sofa as a leopard. I think this lends credence to the 'treacherous turn' insofar as it's an example of a classifier seeming to perform well and then breaking down in edge cases.
The classifier isn't breaking down - it was trained to do well across the entire training set using a small amount of computation for each inference and a reasonable (larger) amount of computation for training.
Humans' fastest recognition capability still takes 100 ms or so, and when operating in that mode (rapid visual presentation), human recognition is considerably less accurate than that of modern ANNs, which classify in less time and with around 1000x fewer neurons/synapses.
I would bet that humans often make similar mistakes in fast recognition. And even ...
New essay summarizing some of my latest thoughts on AI safety, ~3500 words. I explain why I think that some of the thought experiments that have previously been used to illustrate the dangers of AI are flawed and should be used very cautiously, why I'm less worried about the dangers of AI than I used to be, and what some of the remaining reasons are for why I continue to be somewhat worried.
Backcover celebrity endorsement: "Thanks, Kaj, for a very nice write-up. It feels good to be discussing actually meaningful issues regarding AI safety. This is a big contrast to discussions I've had in the past with MIRI folks on AI safety, wherein they have generally tried to direct the conversation toward bizarre, pointless irrelevancies like "the values that would be held by a randomly selected mind", or "AIs with superhuman intelligence making retarded judgments" (like tiling the universe with paperclips to make humans happy), and so forth.... Now OTOH, we are actually discussing things of some potential practical meaning ;p ..." -- Ben Goertzel