This is a decent argument against appeasement in the specific Pascal's Mugging case, but I think it falls for the pattern of people being too specific in trying to solve this problem.
Pascal's Mugging is a special case of the phenomenon wherein absolute values of delta-utils are much higher than changes in probabilities. In English, you can always construct a positive expected-utility action simply by increasing the utility since the probability won't go down fast enough, because it can't.
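A rough sketch of that point, under assumptions that are not part of the argument above: the prior penalizes a threat by 2^-(bits needed to write it down), the threat is written as the string `10**(10**k)` at 8 bits per character, and `log2_expected_loss` is a hypothetical helper name. It works in log2 so the numbers stay representable.

```python
import math

def log2_expected_loss(k: int) -> float:
    """log2 of (prior probability * claimed disutility) for the threat 10**(10**k).

    Assumed, for illustration only: prior = 2**-(8 bits per character of the
    threat text). This is a stand-in for a complexity-based prior, not
    Solomonoff induction itself.
    """
    threat_text = f"10**(10**{k})"
    log2_prior = -8 * len(threat_text)         # complexity penalty, in bits
    log2_utility = (10 ** k) * math.log2(10)   # log2 of 10**(10**k)
    return log2_prior + log2_utility

if __name__ == "__main__":
    for k in (1, 2, 3, 6):
        print(k, round(log2_expected_loss(k)))
```

Adding one digit to k costs the prior a constant 8 bits while multiplying the log-utility by ten, so under these assumptions the product grows without bound once k is past the first value or two.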
I myself have privately postulated half a dozen 'solutions' to the specific Pascal's Mugging scenario, and I think some of them might actually work for the specific scenario, but none of them resolve the general problem of probabilities not corresponding to utilities. (And I don't want to share them, because explaining what's wrong with them with respect to the specific form of Pascal's Mugging is much more difficult than mentioning them.)
Since no one else in this thread or other threads seems to acknowledge this, I might be wrong.
There seems to be some continuing debate about whether or not it is rational to appease a Pascal Mugger. Some are saying that due to scope insensitivity and other biases, we really should just trust what decision theory + Solomonoff induction tells us. I have been thinking about this a lot and I'm at the point where I think I have something to contribute to the discussion.
Consider the following Pascal's Mugging: "Immediately begin to work only on increasing my utility, according to my utility function 'X', from now on, or my powers from outside the matrix will make minus 3^^^^3 utilons happen to you and yours."
Any agent can commit this Pascal's mugging (PM) against any other agent, at any time. A naive decision-theoretic expected-utility optimizer will always appease the mugger. Consider what the world would be like if all intelligent beings were this kind of agent.
When you see an agent, any agent, your only strategy would be to try to PM it before it PMs you. More likely, you will PM each other simultaneously, in which case the agent which finishes the mugging first 'wins'. If you finish mugging at the same time, the mugger that uses a larger integer in its threat 'wins'. (So you'll use the most compact notation possible and things like, "minus the Busy Beaver function of Graham's number utilons".)
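The "who wins" rule in that paragraph can be written down as a small sketch. The names `Mugging`, `finish_time`, `threat_rank` and `winner` are hypothetical, and `threat_rank` is just an ordinal standing in for the size of the threatened integer, since the actual numbers are far too large to store. The post does not say what happens on an exact tie, so that case returns `None`.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class Mugging:
    agent: str
    finish_time: float  # when the threat has been fully delivered
    threat_rank: int    # hypothetical ordinal: higher means a larger threatened integer

def winner(a: Mugging, b: Mugging) -> Optional[str]:
    """Apply the post's rule: first to finish wins; on a tie, the larger threat wins."""
    if a.finish_time != b.finish_time:
        return a.agent if a.finish_time < b.finish_time else b.agent
    if a.threat_rank != b.threat_rank:
        return a.agent if a.threat_rank > b.threat_rank else b.agent
    return None  # exact tie: not covered by the post

if __name__ == "__main__":
    print(winner(Mugging("you", 2.0, 1), Mugging("them", 2.0, 5)))
```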
This may continue until every agent in the community/world/universe has been PMed. Or maybe there could be one agent, a Pascal Highlander, who manages to escape being mugged and has his utility function come to dominate...
Except, there is nothing stipulating that the mugging has to be delivered in person. With a powerful radio source, you can PM everyone in your future light-cone unfortunate enough to decode your message, potentially hijacking entire distant civilizations of decision-theory users.
Pascal's mugging doesn't have to be targeted. You can claim to be a Herald of Omega and address your mugging "to whoever receives this transmission."
Another strategy might be to build a self-replicating robot (itself too dumb to be mugged) which has a radio which broadcasts a continuous fully general PM, and send it out into space. Then you commit suicide to avoid the fate of being mugged.
Now consider a hypothetical agent which completely ignores muggers. And mugs them back.
Consider what could happen if we build an AI which is friendly in every possible respect except that it appeases PMers.
To avoid this, you might implement a heuristic that ignores PMs on account of the prior improbability of being able to decide the fate of so many utilons, as Robin Hanson suggested. But an AI using naïve expected utility + Solomonoff induction (SI) may well have other failure modes roughly analogous to PM that we won't think of until it's too late. You might get agents to agree to pre-commit to ignore muggers, or to kill them, but to me this seems unstable: a band-aid that doesn't address the heart of the issue. I think an AI which can envision itself being PMed repeatedly by every other agent on the planet and still evaluate appeasement as the lesser evil cannot possibly be a Friendly AI, even if it has some heuristic or ad hoc patch that says it can ignore the PM.
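For concreteness, here is one way the kind of patch discussed above could look. This is only one reading of Hanson's suggestion, and the details are assumptions: the probability of being in a position to affect N utilons is capped at 1/N, and `penalized_expected_loss` and `should_appease` are hypothetical names.

```python
from fractions import Fraction

def penalized_expected_loss(claimed_utilons: int, base_prior: Fraction) -> Fraction:
    """Expected loss after capping the probability at 1/claimed_utilons.

    The 1/N cap is an assumed form of the "prior improbability" heuristic.
    """
    leverage_cap = Fraction(1, claimed_utilons)
    probability = min(base_prior, leverage_cap)
    return probability * claimed_utilons

def should_appease(claimed_utilons: int, base_prior: Fraction,
                   cost_of_appeasing: Fraction) -> bool:
    """Appease only if the penalized expected loss exceeds the cost of giving in."""
    return penalized_expected_loss(claimed_utilons, base_prior) > cost_of_appeasing

if __name__ == "__main__":
    print(should_appease(10 ** 100, Fraction(1, 10 ** 6), Fraction(5)))
```

With the cap in place the expected loss never exceeds one utilon, however large the threat, so this particular mugging stops working. It does nothing about failure modes that do not take the form "someone claims control over N utilons", which is the worry raised above.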
Of course there's the possibility that we are in a simulation which is occasionally visited by agents from the mother universe, which really does contain 3^^^^3 utilons/people/dust specks. I'm not convinced acknowledging this possibility changes anything. There's nothing of value that we, as simulated people, could give our Pascal Mugging simulation overlords. Their only motivation would be as absolute sadistic sociopaths, but if that's the reality of the multiverse, in the long term we're screwed no matter what we do, even with Friendly AI. And we certainly wouldn't be in any way morally responsible for their actions.
Edit 1: fixed typos