Follow-up to: Deterministic Strategies Can Be Sub-optimal
The Ultimatum Game is a simple experiment. Two people are jointly allocated $10. One person decides how to divide the money, and the other decides whether to Accept that division or Deny it, in which case both participants get $0. Suppose you are the person whose job it is to choose whether to Accept or Deny an offer. What strategy could you use to maximize your returns?
Yudkowsky offers the following solution. (NB: the original text splits $12, because sci-fi; I have changed the numbers inline, without brackets. Let me know if that offends.)
> It goes like this:
>
> When somebody offers you a 6:4 split, instead of the 5:5 split that would be fair, you should accept their offer with slightly less than 5/6 probability. Their expected value from offering you 6:4, in this case, is 6 * slightly less than 5/6, or slightly less than 5. This ensures they can't do any better by offering you an unfair split; but neither do you try to destroy all their expected value in retaliation. It could be an honest mistake, especially if the real situation is any more complicated than the original Ultimatum Game.
>
> If they offer you 7:3, accept with probability slightly-more-less than 5/7, so they do even worse in their own expectation by offering you 7:3 than 6:4.
>
> It's not about retaliating harder, the harder they hit you with an unfair price - that point gets hammered in pretty hard to the kids, a Watcher steps in to repeat it. The circumstances under which you should ever go around carrying out counterfactual threats in real life are much more fraught and complicated than this, and nobody's going to learn about them realistically for several years yet. This setup isn't about retaliation, it's about what both sides have to do, to turn the problem of dividing the gains, into a matter of fairness; to create the incentive setup whereby both sides don't expect to do any better by distorting their own estimate o
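
To make the arithmetic in the quoted rule concrete, here is a minimal Python sketch. The function names, the `epsilon` parameter, and the choice to shave off `epsilon` per dollar of unfairness are my own assumptions; the quote only says "slightly less" and "slightly-more-less" and does not specify how much.

```python
from fractions import Fraction


def acceptance_probability(proposer_share, total=10, epsilon=Fraction(1, 100)):
    """Probability of accepting a split in which the proposer keeps `proposer_share`.

    Fair or generous offers are always accepted. For an unfair offer, the
    probability is fair_share / proposer_share, minus a small penalty that
    grows with the unfairness (the penalty size is an assumption).
    """
    proposer_share = Fraction(proposer_share)
    total = Fraction(total)
    if not 0 <= proposer_share <= total:
        raise ValueError("proposer_share must lie between 0 and total")

    fair_share = total / 2
    if proposer_share <= fair_share:
        return Fraction(1)

    unfairness = proposer_share - fair_share
    # "Slightly less" for 6:4, "slightly-more-less" for 7:3, and so on.
    probability = fair_share / proposer_share - epsilon * unfairness
    return max(Fraction(0), probability)


def proposer_expected_value(proposer_share, total=10, epsilon=Fraction(1, 100)):
    """Expected payout to the proposer, given the responder's acceptance rule."""
    p = acceptance_probability(proposer_share, total, epsilon)
    return Fraction(proposer_share) * p


if __name__ == "__main__":
    for share in range(5, 11):
        p = acceptance_probability(share)
        ev = proposer_expected_value(share)
        print(f"{share}:{10 - share}  accept with p={float(p):.4f}  proposer EV={float(ev):.4f}")
```

Under these assumptions the proposer's expected value is exactly 5 at the fair split and falls a little further with each extra dollar they try to keep, so they never gain by offering an unfair split, and the responder never drives the proposer's expectation to zero.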