But all of these things have an evaluation system in place that still comes back with a success/failure evaluation that serves as a reward/punishment system. They're different ways to use evaluative processes, but they all pursue some kind of positive feedback from evaluating a strategy or outcome as successful. His "reinforcement learning" should really be called "reinforcement teaching," because in that one humans are explicitly and directly in charge of the reward process, whereas in the others the reward process happens more or less internally, according to something that should be modifiable once the AI is sufficiently advanced.
But all of these things have an evaluation system in place that still comes back with a success/failure evaluation that serves as a reward/punishment system.
The space between the normal text and the bold text is where your mistake begins. Although it's counterintuitive, there's no reason to make that leap. Minds-in-general can discover and understand that things are correct or incorrect without correctness being 'good' and incorrectness being 'bad.'
Part 1 was previously posted and it seemed that people liked it, so I figured I should post part 2: http://waitbutwhy.com/2015/01/artificial-intelligence-revolution-2.html