Only the cruise control link is an actual comparison of automation+overseer versus humans alone. The other links are examples of automation+overseer failing, but there are of course examples of humans alone failing just as badly. Is there any further evidence of this phenomenon? In particular, is there evidence that the total success rate decreases as the success rate of the automation increases?
Well, if you're willing to extend automation to cover automatic pricing from a specific set of equations, then we have the recent financial crisis...
Consider a mixed system, in which an automated system is paired with a human overseer. The automated system handles most of the routine tasks, while the overseer is tasked with looking out for errors and taking over in extreme or unpredictable circumstances. Examples of this could be autopilots, cruise control, GPS direction finding, high-frequency trading – in fact nearly every automated system has this feature, because they nearly all rely on humans "keeping an eye on things".
But often the human component doesn't perform as well as it should do – doesn't perform as well as it did before part of the system was automated. Cruise control can impair driver performance, leading to more accidents. GPS errors can take people far more off course than following maps did. When the autopilot fails, pilots can crash their planes in rather conventional conditions. Traders don't understand why their algorithms misbehave, or how to stop this.
There seem to be three factors at work here:
So, when the automation fails, the overseer is generally dumped into an emergency situation, whose nature they are going to have to deduce, and, using skills that have atrophied, they are going to have to take on the task of the automated system that has never failed before and that they have never had to truly understand.
And they'll typically get blamed for getting it wrong.
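To make the shape of this worry concrete, here is a toy model. It is a sketch under invented assumptions, not evidence: the function names, the `base_skill` and `decay` parameters, and the decay formula are all hypothetical. It assumes the automation handles a task correctly with probability `p_auto`, and that the overseer's chance of rescuing a failure shrinks as failures become rarer.

```python
def overseer_recovery_rate(p_auto: float, base_skill: float = 0.9, decay: float = 1.0) -> float:
    """Hypothetical chance that the overseer rescues a failed task.

    Assumption: the overseer's skill erodes as the automation gets more
    reliable, because they see fewer failures to practise on.
    decay = 0 means no erosion at all.
    """
    return base_skill * (1.0 - p_auto) ** decay


def total_success_rate(p_auto: float, base_skill: float = 0.9, decay: float = 1.0) -> float:
    """Success of the mixed system: the automation works, or the overseer rescues it."""
    return p_auto + (1.0 - p_auto) * overseer_recovery_rate(p_auto, base_skill, decay)


if __name__ == "__main__":
    # Compare a skill-eroding overseer (decay=1) with a hypothetical one
    # whose skill never erodes (decay=0).
    print("p_auto  eroding  constant")
    for p in (0.0, 0.25, 0.5, 0.75, 0.9, 0.99, 1.0):
        eroding = total_success_rate(p, decay=1.0)
        constant = total_success_rate(p, decay=0.0)
        print(f"{p:6.2f}  {eroding:7.3f}  {constant:8.3f}")
```

Under these made-up numbers, the mixed system with an eroding overseer does worse at intermediate reliability than with no automation at all, and only recovers once the automation is very reliable. With `decay=0.0` the total success rate simply rises with `p_auto`. Whether real overseers behave like either curve is an empirical question the model cannot answer.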
Similarly, if we design AI control mechanisms that rely on the presence of a human in the loop (such as tool AIs, Oracle AIs, and, to a lesser extent, reduced impact AIs), we'll need to take the autopilot problem into account: design the role of the overseer so as not to deskill them, and don't count on them being free of error.