Follow-up to "Deterministic Strategies Can Be Sub-optimal"
The Ultimatum Game is a simple experiment. Two people have been allocated $10. One person decides how to divide the profits, and the other decides whether to Accept that allocation or to Deny it, in which case both participants get $0. Suppose you are the person whose job it is to choose whether to Accept or Deny an offer. What strategy could you use to maximize your returns?
Yudkowsky offers the following solution. (NB: the original text splits $12, because sci-fi; I have changed the numbers inline, without brackets. Let me know if that offends.)
It goes like this:
When somebody offers you a 6:4 split, instead of the 5:5 split that would be fair, you should accept their offer with slightly less than 5/6 probability. Their expected value from offering you 6:4, in this case, is 6 * slightly less than 5/6, or slightly less than 5. This ensures they can't do any better by offering you an unfair split; but neither do you try to destroy all their expected value in retaliation. It could be an honest mistake, especially if the real situation is any more complicated than the original Ultimatum Game.
If they offer you 7:3, accept with probability slightly-more-less than 5/7, so they do even worse in their own expectation by offering you 7:3 than 6:4.
It's not about retaliating harder, the harder they hit you with an unfair price - that point gets hammered in pretty hard to the kids, a Watcher steps in to repeat it. The circumstances under which you should ever go around carrying out counterfactual threats in real life are much more fraught and complicated than this, and nobody's going to learn about them realistically for several years yet. This setup isn't about retaliation, it's about what both sides have to do, to turn the problem of dividing the gains, into a matter of fairness; to create the incentive setup whereby both sides don't expect to do any better by distorting their own estimate of what is 'fair'.
To be explicit: assume that you have in some way "locked in" some curve f(x), which tells you to hit "Accept" with probability f(x) when offered to let your conspirator keep x dollars out of the $10 you are to split. You want to maximize your expected value, as does your conspirator: so, you should positively incentivize your conspirator to give you money.
Consider the following instantiation of this algorithm:
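Here is a sketch, with a form I am assuming purely for illustration (the function name and the curve itself are mine): hit "Accept" with probability equal to the fraction of the $10 being offered to you.

```python
def accept_prob_v1(x):
    """Probability of hitting Accept when my conspirator proposes to keep x
    of the $10: equal to the fraction being offered to me.  (Assumed form.)"""
    return (10 - x) / 10

for x in (5, 6, 7, 8, 9):
    p = accept_prob_v1(x)
    print(f"they keep {x}: accept {p:.2f}, their EV {x * p:.2f}, my EV {(10 - x) * p:.2f}")
# Their EV falls (2.50, 2.40, 2.10, 1.60, 0.90) as they try to keep more than half,
# so the incentive points the right way - but a lot of value is burned even at 5:5.
```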
Note that there are many possible curves f that would work here. For now, let's not examine the "greedy" half of the algorithm (where your conspirator is offering you more than they are taking themselves), and model another instantiation:
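Again a sketch with an assumed form, this time tracking the "slightly less than 5/x" behavior from the quote: accept any fair-or-better split outright, and accept an unfair split with probability just under 5/x, where x is what your conspirator keeps (EPSILON and the function name are my choices).

```python
EPSILON = 0.1   # my choice; any small positive slope keeps the incentive strict

def accept_prob_v2(x):
    """Accept a fair-or-better split outright; accept an unfair split with
    probability (5 - EPSILON * (x - 5)) / x, i.e. just under 5/x."""
    if x <= 5:
        return 1.0
    return (5 - EPSILON * (x - 5)) / x

for x in (5, 6, 7, 8, 9):
    p = accept_prob_v2(x)
    print(f"they keep {x}: accept {p:.2f}, their EV {x * p:.2f}, my EV {(10 - x) * p:.2f}")
# Their EV: 5.00, 4.90, 4.80, 4.70, 4.60 - still strictly falling the more they
# keep, but far less total value is destroyed than under the curve above.
```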
Note that this maintains a positive incentive for your conspirator to give you more money, while not destroying as much value as the prior algorithm.
I work at a company which does year-end performance reviews. I was promoted last year, and am not a particular "flight risk". However, I still want to positively incentivize my boss to accurately "rate" me - i.e., if I performed above average I would like to be given the rating (and raise) for an above-average performance, even if that means increasing the company's exposure to losing a more flight-prone but poorer-performing employee. So I published a curve to my boss demonstrating that I would stay with 100% chance if I got the highest rating I could get, would stay with 90% chance if I got an average rating, would stay with 70% chance if I got below average, and would stay with 50% chance if I got put on a performance improvement plan.
This was received well enough, because I run the Risk Analytics team at a FinTech company, so my entire stack is capable of working with uncertainty. In particular, I highlighted that even an average grade (which would put me in the top 70th percentile) would have me staying with 90% chance, which is still better retention than the industry average. I ended up getting an average grade, and rolling a 6 on my d10, so I am staying with my company.
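For concreteness, the published flight curve is just a small lookup table plus one die roll. A minimal sketch of that mechanic (the rating labels and names are mine, the probabilities are the ones above, and I am assuming the convention that a d10 roll at or under the stay threshold means staying):

```python
import random

# Stay probabilities published to my boss, keyed by year-end rating.
# (The rating labels are illustrative; the probabilities are the ones quoted above.)
FLIGHT_CURVE = {
    "top": 1.0,             # highest possible rating: stay for sure
    "average": 0.9,
    "below_average": 0.7,
    "pip": 0.5,             # performance improvement plan
}

def stays(rating, roll=None):
    """Resolve the curve with a d10: stay iff the roll is at or under the
    published stay probability (0.9 -> stay on 1-9, leave only on a 10)."""
    if roll is None:
        roll = random.randint(1, 10)   # roll the d10
    return roll <= round(FLIGHT_CURVE[rating] * 10)

print(stays("average", roll=6))   # the roll described above: True, so I stay
```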
Traditional negotiations work by hemming and hawing. Yudkowsky offers a new solution: publish your flight curve and let your conspirator follow their own incentives. Increase your legibility so that people don't have to track your subtle indications that you are happy or unhappy with an offer.
Yudkowsky's newest novel is here: https://www.glowfic.com/posts/4582
The particular curve you describe doesn't work - even if someone gave in to your threat entirely, they'd offer you 2/3 of the 10$ (this maximizes their EV at ~1.5$), but then you'd have to reject a third of the time so you'd wind up with an EV of less than 5.
You could definitely fix that particular flaw in your system. And what you'd wind up with is something that gets analyzed a lot like the original game except that you've stolen first player position and are offering something 'unfair'. So as usual for this game, your 'unfair' strategy would work perfectly against a pure-CDT agent (they'll cave to any unfair setup since the confrontational alternative is getting 0), and work against some real humans while other real humans will say screw you. The 'ideal agent', however, does not reward threats (because being the kind of agent who never rewards threats is a pretty good shield against anyone bothering to threaten you in the first place, while being the kind of agent who does reward threats is asking to be threatened). So if you use a strategy like the one you suggest against them, they will compute an offer (or a probability of an offer) such that their EV is maximized subject to your EV being strictly less than 5$: in this way you would be better off if you'd just done the fair thing to begin with.
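Here is a rough sketch of that last step - maximizing their EV subject to your EV staying strictly under $5. The "greedy" published curve it runs against is entirely an assumption for illustration (accept an offer of y with probability min(1, (y/6)^3)), as are the function names; only the constrained search is the point.

```python
def accept_prob_greedy(y):
    """Assumed greedy acceptance curve: certain acceptance only once y >= 6."""
    return min(1.0, (y / 6.0) ** 3)

def constrained_best_offer(total=10.0, fair_share=5.0):
    """The proposer keeps x and offers total - x.  A non-threat-rewarding agent
    maximizes its own EV subject to the responder's EV staying strictly below
    the fair share, so the published greedy curve never pays off."""
    best = None
    for cents in range(0, int(total * 100) + 1):   # search in 1-cent steps
        x = cents / 100.0
        p = accept_prob_greedy(total - x)
        proposer_ev, responder_ev = x * p, (total - x) * p
        if responder_ev < fair_share and (best is None or proposer_ev > best[1]):
            best = (x, proposer_ev, responder_ev)
    return best

keep, their_ev, my_ev = constrained_best_offer()
print(f"they keep {keep:.2f}: their EV {their_ev:.2f}, my EV {my_ev:.2f}")
# With this curve they keep ~4.27: their EV ~3.72, my EV ~4.99 - strictly less
# than the 5.00 I'd have gotten by just playing fair in the first place.
```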