I've talked earlier about integral and differential ethics, in the context of population ethics. The idea is that the argument for the repugnant conclusion (and its associate, the very repugnant conclusion) depends on a series of trillions of steps, each of which is intuitively acceptable (adding happy people, making happiness more equal), but which together reach a conclusion that is intuitively bad - namely, that we can improve the world by creating trillions of people in torturous and unremitting agony, as long as we balance them out by creating enough happy people as well.
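To make the structure concrete, here's a toy total-welfare version of that chain of steps; the population sizes and welfare numbers are made up purely for illustration, not taken from any actual argument in the literature.

```python
# Toy chain (all numbers made up): each step looks locally acceptable under a
# total-welfare criterion, yet iterating the steps collapses average welfare
# while total welfare keeps rising.

def mere_addition(pop, extra_people, extra_welfare):
    """Add extra people whose lives are barely worth living: no one existing
    is harmed, and total welfare goes up slightly."""
    return pop + [extra_welfare] * extra_people

def equalise(pop, bonus=0.1):
    """Redistribute welfare equally, with a small efficiency bonus so that
    total welfare strictly increases."""
    avg = sum(pop) / len(pop)
    return [avg + bonus] * len(pop)

pop = [100.0] * 1_000                        # start: 1,000 very happy people

for _ in range(10):                          # repeat the two "fine" steps
    pop = mere_addition(pop, len(pop), extra_welfare=1.0)
    pop = equalise(pop)

print(f"population:      {len(pop):,}")              # 1,024,000
print(f"average welfare: {sum(pop) / len(pop):.2f}")  # ~1.3 (was 100)
print(f"total welfare:   {sum(pop):,.0f}")            # higher than at the start
```

Each individual step is the kind of move most people wave through, but the end state is a vast population of lives barely worth living.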
Differential reasoning accepts each step, and concludes that the repugnant conclusions are actually acceptable, because each step is sound. Integral reasoning accepts that the repugnant conclusion is repugnant, and concludes that some step along the way must therefore be rejected.
Notice that key word, "therefore". Some intermediate step is rejected, not for intrinsic reasons, but purely because of the consequence. There is nothing special about the step that is rejected; it's just a relatively arbitrary barrier to stop the process (compare with the paradox of the heap).
Indeed, things can go awry when people attempt to fix the repugnant conclusion (a conclusion they rejected through integral reasoning) using differential methods. Views like the "person-affecting view" have their own absurdities and paradoxes (it's ok to bring a baby into the world even if it will have a miserable life; we don't need to care about future generations if we randomise conceptions, etc.), and I would posit that this is because they are trying to fix global/integral issues using local/differential tools.
The relevance of this? It seems that integral tools might be better suited to dealing with the problem of AI converging on bad outcomes. We could set up plausibly intuitive differential criteria (such as self-consistency), but institute integral criteria that can override these if they go too far. I think there may be some interesting ideas in that area. The cost is that integral ideas are generally seen as less elegant, or harder to justify.
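As a very rough sketch of the structure I have in mind (all names, thresholds and criteria below are hypothetical stand-ins, not a concrete proposal for actual AI criteria): each candidate change must pass a differential check, but an integral check on the whole trajectory can veto a step that would carry the system too far from where it started.

```python
# Hypothetical sketch: a differential (per-step) criterion, with an integral
# (whole-trajectory) criterion that can override it. State is a single number
# here purely to keep the example runnable.

EPSILON = 1.0     # largest single step we'd call "plausibly fine"
MAX_DRIFT = 5.0   # integral bound: how far the system may move from the start

def local_check(current: float, proposed: float) -> bool:
    """Differential criterion: each step is small and non-worsening."""
    return proposed >= current and (proposed - current) <= EPSILON

def global_check(start: float, proposed: float) -> bool:
    """Integral criterion: reject a step if, together with all previous ones,
    it has carried us too far from where we started."""
    return abs(proposed - start) <= MAX_DRIFT

def run(start: float, proposals: list[float]) -> float:
    state = start
    for proposed in proposals:
        if local_check(state, proposed) and global_check(start, proposed):
            state = proposed   # each accepted step looked fine locally...
        # ...but the integral bound stops a chain of fine-looking steps from
        # compounding into an end state we'd reject as a whole.
    return state

print(run(0.0, [0.9 * k for k in range(1, 20)]))  # halts at 4.5, not 17.1
```

The point is only the shape of the mechanism: the local criterion judges steps in isolation, while the global criterion judges the accumulated result and plays the role of the somewhat arbitrary barrier mentioned above.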
I suspect much of the problem is that humans aren't very good at consistency or calculation. Scope insensitivity (and other errors) causes us to accept steps that lead to incorrect results once aggregated. If you can actually define your units and measurements, I strongly expect that the sum of the steps will equal the conclusion, and that you will be able to identify the steps which are unacceptable (or accept the conclusion).
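To spell that out with the made-up totals from the toy chain above: once every step's effect is written in explicit units, the per-step changes must sum to the overall change, so a mismatch between "each step seems fine" and "the aggregate seems terrible" has to come from misjudging at least one of them.

```python
# Toy check (made-up totals): with explicit units, the sum of per-step welfare
# changes necessarily equals the start-to-end change, so a disagreement between
# step-wise and aggregate evaluations is a calculation error somewhere.

totals = [100_000, 101_000, 101_200, 103_200, 103_600]  # total welfare after each step
step_deltas = [b - a for a, b in zip(totals, totals[1:])]

print("per-step changes:", step_deltas)             # [1000, 200, 2000, 400]
print("sum of steps:    ", sum(step_deltas))        # 3600
print("aggregate change:", totals[-1] - totals[0])  # 3600 - they cannot differ
```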
I'd advise against the motivated reasoning of "if you don't like the conclusion, you have to find a step to reject", in favour of "I notice I'm confused that I have different evaluations of the steps and the aggregate, so I've probably miscalculated somewhere."
And if this is the case (if the mismatch is caused by compounded rounding errors rather than a fundamental disconnect), then it seems unlikely to be a useful solution to AI problems unless we fix the problems in our calculation, rather than just reusing a method we've shown doesn't work.
But then you have to choose either to correct the steps to bring them in line with the aggregate (integral reasoning), or to correct the aggregate to bring it in line with the steps (differential reasoning).