Another way to avoid the paradox is to care about other people's satisfaction (more complicated than that, but that's not the point) from their point of view, which encompasses their frame of reference.
Another way, perhaps, is to restate "implement improvements as soon as possible" as "maximize total goodness in (the future of) the universe." In particular, if an improvement could only be implemented once, and it would be twice as effective tomorrow as today, do it tomorrow.
So you're differentiating between properties where the probability of [0 1 2 3] is 1-ɛ while >3 is ɛ, and probability distributions where the probability of 0 is 0.01, of 1 is 0.003, etc.? Got it. The only algorithms I can think of that require the latter are those that require uniformly random input. I don't think those violate modularity, though, as any part of the program that interfaces with that module must provide independently random input (which is the straightforward way to meet that requirement starting from an arbitrary distribution).
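A minimal sketch of what I mean, assuming the module is something like a plain quicksort whose average-case cost depends on the input ordering being roughly uniform (the function names here are just for illustration):

```python
import random

def quicksort(xs):
    """Plain quicksort; its O(n log n) average case assumes the input
    ordering is (close to) uniformly random."""
    if len(xs) <= 1:
        return xs
    pivot = xs[0]
    less = [x for x in xs[1:] if x < pivot]
    more = [x for x in xs[1:] if x >= pivot]
    return quicksort(less) + [pivot] + quicksort(more)

def sort_any(xs):
    """The interfacing code shuffles first, turning an arbitrary (possibly
    adversarial) input distribution into the uniform one the module expects."""
    xs = list(xs)
    random.shuffle(xs)
    return quicksort(xs)
```

The caller, not the module, does the randomizing, so the module's distributional requirement doesn't leak into the rest of the program.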
There's a difference between requiring an input distribution and merely being optimized for one, though, and there are lots of algorithms that are optimized for particular inputs. Sorting algorithms are an excellent example: if most of your lists are already almost sorted, there are algorithms that are cheaper on average but might take a long time on a few rare orderings.
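Insertion sort is the standard case of this; a quick sketch just to illustrate:

```python
def insertion_sort(xs):
    """Cheap on nearly-sorted input (close to O(n) comparisons); degrades to
    O(n^2) on rare orderings such as a fully reversed list."""
    xs = list(xs)
    for i in range(1, len(xs)):
        key = xs[i]
        j = i - 1
        # If xs is almost sorted, this inner loop runs only a few steps.
        while j >= 0 and xs[j] > key:
            xs[j + 1] = xs[j]
            j -= 1
        xs[j + 1] = key
    return xs
```

Nothing breaks if you hand it a pathological ordering; it just stops being cheap.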
Requiring that the inputs to a piece of software follow some probability distribution is the opposite of being modular.
What? There is very little software that doesn't require inputs to follow some probability distribution. When provided with input that doesn't match that (often very narrow) distribution, programs will throw it away, give up, or have problems.
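To make that concrete, here's a hypothetical sketch (the function and field are made up for illustration) of a routine that implicitly expects a very narrow input distribution and does the first two of those things with anything outside it:

```python
def parse_age(field: str) -> int:
    """Hypothetical validator: expects 'field' to be a small non-negative
    integer; anything outside that narrow distribution is rejected."""
    value = int(field)  # gives up (raises ValueError) on non-numeric input
    if not 0 <= value <= 150:
        raise ValueError(f"implausible age: {value}")  # throws it away
    return value
```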
You seem to have put a lot more thought into your other points; could you expand on this one a little more?
I guess I didn't make myself at all clear on that point: I subscribe to both of the above!