Firstly, if the automation performs flawlessly, the overseers will become complacent, blindly trusting the instruments and failing to perform basic sanity checks. They will have far less procedural understanding of what's actually going on, since they have no opportunity to exercise their knowledge.
There's a related problem in manufacturing whose name I've forgotten, but basically, the less frequent defective parts are, the less likely it is human quality control people will notice defective parts, because their job is more boring and so they're less likely to be paying attention when a defective part does happen. (Conditioned on the part being defective, of course.)
Right, one of the original solutions, though rarely implemented, is to add a steady stream of defective parts to guarantee optimal human attention. These artificially defective parts are marked in a way that lets them to be automatically separated and recycled later, should any slip by the human QA.
Consider a mixed system, in which an automated system is paired with a human overseer. The automated system handles most of the routine tasks, while the overseer is tasked with looking out for errors and taking over in extreme or unpredictable circumstances. Examples of this could be autopilots, cruise control, GPS direction finding, high-frequency trading – in fact nearly every automated system has this feature, because they nearly all rely on humans "keeping an eye on things".
But often the human component doesn't perform as well as it should do – doesn't perform as well as it did before part of the system was automated. Cruise control can impair driver performance, leading to more accidents. GPS errors can take people far more off course than following maps did. When the autopilot fails, pilots can crash their planes in rather conventional conditions. Traders don't understand why their algorithms misbehave, or how to stop this.
There seems to be three factors at work here:
So, when the automation fails, the overseer is generally dumped into an emergency situation, whose nature they are going to have to deduce, and, using skills that have atrophied, they are going to have to take on the task of the automated system that has never failed before and that they have never had to truly understand.
And they'll typically get blamed for getting it wrong.
Similarly, if we design AI control mechanisms that rely on the presence of a human in the loop (such as tools AIs, Oracle AIs, and, to a lesser extent, reduced impact AIs), we'll need to take the autopilot problem into account, and design the role of the overseer so as not to deskill them, and not count on them being free of error.