Problem: an overseer won’t see the AI which kills us all thinking about how to kill humans, not because the AI conceals that thought, but because the AI doesn’t think about how to kill humans in the first place. The AI just kills humans as a side effect of whatever else it’s doing.
Analogy: the Hawaii Chaff Flower didn’t go extinct because humans strategized to kill it. It went extinct because humans were building stuff nearby, and weren’t thinking about how to keep the flower alive. They probably weren’t thinking about the flower much at all.

More generally: how and why do humans drive species to extinction? In some cases the species is hunted to extinction, either because it's a threat or because it's economically profitable to hunt. But I would guess that in 99+% of cases, the humans drive a species to extinction because the humans are doing something that changes the species' environment a lot, without specifically trying to keep the species alive. DDT, deforestation, introduction of new predators/competitors/parasites, construction… that’s the sort of thing which I expect drives most extinction.
Assuming this metaphor carries over to AI (similar to the second species argument), what kind of extinction risk will AI pose?
Well, the extinction risk will not come from AI actively trying to kill the humans. The AI will just be doing some big thing which happens to involve changing the environment a lot (like making replicators, or dumping waste heat from computronium, or deciding that an oxygen-rich environment is just really inconvenient what with all the rusting and tarnishing and fires, or even just designing a fusion power generator), and then humans die as a side-effect. Collateral damage happens by default when something changes the environment in big ways.
What does this mean for oversight? Well, it means that there wouldn't necessarily be any point at which the AI is actually thinking about killing humans or whatever. It just doesn't think much about the humans at all, and then the humans get wrecked by side effects. In order for an overseer to raise an alarm, the overseer would have to figure out itself that the AI's plans will kill the humans, i.e. the overseer would have to itself predict the consequences of a presumably-very-complicated plan.
The problem is not that no part of their brain tracks it. It's just that it's not the central reason they give when describing why they do what they do, and it's not the story they tell themselves.
I don't think the problematic actions by AGIs are likely to be of a nature that can be described in that fashion. They are more likely to be 4D chess moves whose effects are hard to understand directly.
It might be something like: "In our experiments where doctors are supposed to use the AGI to help them make treatment decisions, those doctors regularly overrate their own competence and don't follow the AGI's recommendations, and as a result patients die unnecessarily. Here's an online course your doctors could take that would help them understand why it's good to follow AGI recommendations."
Actions like that seem totally reasonable, but they increase AGI power relative to human power. Economic pressure incentivizes that power transfer.
I wouldn't expect us to go directly from AGI with human supervision to AGI that kills all humans via a deadly pathogen. We are more likely to go from AGI with human supervision to AGI that effectively operates without human supervision. Then, in a further step, AGIs that operate without human supervision concentrate societal power in themselves, and after a few years there are no resources left for humans.