In such a case, the median outcome of all agents will be improved if every agent with the option to do so takes that offer, even if they are assured that it is a once-in-a-lifetime offer (because presumably there is variance of more than 5 utils between agents).
But the median outcome is losing 5 utils?
Edit: Oh, wait! You mean the median total utility after some other stuff happens (with a variance of more than 5 utils)?
Suppose we have 200 agents, 100 of which start with 10 utils, the rest with 0. After taking this offer, we have 51 with -5, 51 with 5, 49 with 10000, and 49 with 10010. The median outcome (5 utils) would be a loss of 5 for half the agents and a gain of 5 for the other half, but only the half that would lose (those who started with 10) could actually get that outcome...
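To check the arithmetic in that example, here is a minimal sketch. It assumes the offer is "lose 5 utils with 51% probability, gain 10000 with 49%", which is inferred from the figures above rather than stated, and that the split is exactly 51/49 within each group of 100. The helper `take_offer` is a made-up name for illustration.

```python
from statistics import median

def take_offer(start, n=100, losers=51):
    # Assumed offer: `losers` agents lose 5 utils, the rest gain 10000.
    return [start - 5] * losers + [start + 10000] * (n - losers)

before = [10] * 100 + [0] * 100          # 100 agents at 10 utils, 100 at 0
after = take_offer(10) + take_offer(0)   # 51 at 5, 49 at 10010, 51 at -5, 49 at 10000

print(median(before))  # 5.0
print(median(after))   # 5.0
```

Under these assumptions the population median stays at 5 utils, and the only agents sitting at the median afterwards are the 51 who started with 10 and lost.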
And what do you mean by "the possibility of getting tortured will manifest...
The idea is to compare not the results of actions, but the results of decision algorithms. The question that the agent should ask itself is thus:
"Suppose everyone[1] who runs the same thinking procedure as me uses decision algorithm X. What utility would I get at the 50th percentile (not: what expected utility should I get), after my life is finished?"
The agent should then, of course, look for the X that maximizes this value.
Now, if you formulate a Turing-complete "decision algorithm", this heads into an infinite loop. But suppose that "decision algorithm" is defined as a huge table that maps lots of different possible situations to the appropriate outputs.
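As a rough sketch of that procedure, the code below treats each decision algorithm as a one-entry lookup table and picks the table with the highest median lifetime utility. Everything concrete here is an assumption made for illustration: the offer (51% lose 5 utils, 49% gain 10000, inferred from the example earlier in the thread), the normally distributed "rest of life" utility with a standard deviation of 20, the population size, and all of the names.

```python
import random
from statistics import median

def offer_result(rng):
    # Assumed offer: gain 10000 with probability 0.49, otherwise lose 5.
    return 10000 if rng.random() < 0.49 else -5

def median_lifetime_utility(table, n_agents=10001, seed=0):
    """Median final utility if every agent follows the same lookup table."""
    rng = random.Random(seed)
    totals = []
    for _ in range(n_agents):
        # Assumed: everything else in life, with a spread well above 5 utils.
        total = rng.gauss(0, 20)
        if table["offer"] == "accept":
            total += offer_result(rng)
        totals.append(total)
    return median(totals)

# Two hypothetical decision tables: situation -> action.
tables = {
    "always_accept": {"offer": "accept"},
    "always_decline": {"offer": "decline"},
}

# Choose the table that maximizes the 50th-percentile outcome.
best = max(tables, key=lambda name: median_lifetime_utility(tables[name]))
print(best)
```

With these particular numbers, the median agent under "always_accept" is one who lost the 5 utils but had a good life otherwise, which still beats the median agent under "always_decline". A real table would of course have far more than one situation in it.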
Let's see what results such a thing should give:
The reason why humans will intuitively decline to give money to the mugger might be similar: They imagine not the expected utility with both decisions, but the typical outcome of giving the mugger some money, versus declining to.
[1] I say this to make agents of the same type cooperate in prisoner's-dilemma-like situations.