Right. And the thing is, if one were to argue that humans are thereby irrational, I would disagree. (That is, I would not assent to defining rationality as consisting of, or necessarily including, adherence to the VNM axioms.)
I tentatively agree. When I model an idealised version of myself, the decision system I tend to give it contains an extra level of abstraction. That level generalises the VNM axioms, and the decision theory of utility maximisation, to something that does allow the kind of system you are advocating (and which I don't consider intrinsically irrational).
Simply put, if instead of having preferences over world-histories you have preferences over probability distributions of world-histories, then doing the same math and reasoning gives you an entirely different but still clearly defined and abstractly consequentialist way of interacting with lotteries. It means the agent is doing something other than maximising the mean of utility... it could, in effect, be maximising the mean subject to a satisficing constraint: that the probability of utility falling below some value stays under a chosen maximum.
This is how an inherently and coherently risk-averse agent (and similar non-mean optimisers) would work.
Such agents are coherent. It doesn't matter much whether we call them irrational or not. If that is what they want to do then so be it.
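To make the rule above concrete, here is a minimal sketch in Python of an agent that maximises mean utility subject to a cap on the probability of utility falling below a threshold. Everything in it is an assumption made for illustration: the names `choose`, `threshold` and `max_risk`, the representation of a lottery as (probability, utility) pairs, the example numbers, and the choice to return `None` when no lottery satisfies the constraint.

```python
from typing import Dict, List, Optional, Tuple

# Hypothetical representation: a lottery is a list of (probability, utility)
# pairs whose probabilities sum to 1.
Lottery = List[Tuple[float, float]]


def mean_utility(lottery: Lottery) -> float:
    """Expected (mean) utility of a lottery."""
    return sum(p * u for p, u in lottery)


def prob_below(lottery: Lottery, threshold: float) -> float:
    """Probability that the lottery yields utility strictly below threshold."""
    return sum(p for p, u in lottery if u < threshold)


def choose_by_mean(lotteries: Dict[str, Lottery]) -> str:
    """Plain expected-utility maximiser, for comparison."""
    return max(lotteries, key=lambda name: mean_utility(lotteries[name]))


def choose(
    lotteries: Dict[str, Lottery], threshold: float, max_risk: float
) -> Optional[str]:
    """Pick the highest-mean lottery among those that satisfy the constraint
    P(utility < threshold) <= max_risk. Returns None if none qualifies
    (an assumption of this sketch, not something the thread specifies)."""
    admissible = {
        name: lot
        for name, lot in lotteries.items()
        if prob_below(lot, threshold) <= max_risk
    }
    if not admissible:
        return None
    return max(admissible, key=lambda name: mean_utility(admissible[name]))


# Made-up example: a sure thing versus a gamble with a higher mean.
lotteries = {
    "safe": [(1.0, 50.0)],
    "gamble": [(0.9, 100.0), (0.1, 0.0)],
}

print(choose_by_mean(lotteries))                         # gamble
print(choose(lotteries, threshold=10.0, max_risk=0.05))  # safe
```

The two agents see the same lotteries and the same utilities, and differ only in what they do with the distribution: the mean maximiser takes the gamble, while the constrained agent rejects it because its 10% chance of utility below 10 exceeds the 5% cap.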
Incidentally, I suspect the axiom I would end up rejecting is continuity (axiom 3), but don't quote me on that.
That does seem to be the axiom most likely to be rejected. At least, that has been my intuition when I've considered how plausible agents that are not 'expected' utility maximisers seem to think.
If you believe that science is about describing things mathematically, you can fall into a strange sort of trap where you come up with some numerical quantity, discover interesting facts about it, use it to analyze real-world situations - but never actually get around to measuring it. I call such things "theoretical quantities" or "fake numbers", as opposed to "measurable quantities" or "true numbers".
An example of a "true number" is mass. We can measure the mass of a person or a car, and we use these values in engineering all the time. An example of a "fake number" is utility. I've never seen a concrete utility value used anywhere, though I always hear about nice mathematical laws that it must obey.
The difference is not just about units of measurement. In economics you can see fake numbers happily coexisting with true numbers using the same units. Price is a true number measured in dollars, and you see concrete values and graphs everywhere. "Consumer surplus" is also measured in dollars, but good luck calculating the consumer surplus of a single cheeseburger, never mind drawing a graph of aggregate consumer surplus for the US! If you ask five economists to calculate it, you'll get five different indirect estimates, and it's not obvious that there's a true number to be measured in the first place.
Another example of a fake number is "complexity" or "maintainability" in software engineering. Sure, people have proposed different methods of measuring it. But if they were measuring a true number, I'd expect them to agree to the 3rd decimal place, which they don't :-) The existence of multiple measuring methods that give the same result is one of the differences between a true number and a fake one. Another sign is what happens when two of these methods disagree: do people say that they're both equally valid, or do they insist that one must be wrong and try to find the error?
It's certainly possible to improve something without measuring it. You can learn to play the piano pretty well without quantifying your progress. But we should probably try harder to find measurable components of "intelligence", "rationality", "productivity" and other such things, because we'd be better at improving them if we had true numbers in our hands.