Yeah, I already edited out some verbosity. ChatGPT is just trained to hedge too much currently. Should I take out more?
It seems to have distracted a bit from the purpose of the post: that we can define an unobjectionable way to aggregate utilities and have an LLM follow it, while still being useful for its owner.
Just to clarify, the complete equilibrium strategy alluded to here is:
"Play 99 and, if anyone deviates from any part of the strategy, play 100 to punish them until they give in"
Importantly, this includes deviations from the punishment. If you don't join the punishment, you'll get punished. That makes it rational to play 99 and punish deviators.
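To make the punishment logic concrete, here's a minimal sketch of how the prescribed move could be computed from the game history. The function name and the exact end-of-punishment rule are my assumptions (the comment leaves those details open); I'm assuming punishment lasts until a round passes with no deviations, i.e. everyone "gives in":

```python
def prescribed_moves(history):
    """Return the move the strategy prescribes for the next round.

    Sketch under these assumptions:
    - normal play is 99;
    - any deviation from the prescription (including refusing to punish)
      switches the prescription to 100 for everyone;
    - punishment ends once a round passes with no deviations ("giving in").
    history is a list of rounds; each round is a list of the moves played.
    """
    prescription = 99
    for round_moves in history:
        if any(m != prescription for m in round_moves):
            prescription = 100  # someone deviated, so punish next round
        else:
            prescription = 99   # full compliance, back to normal play
    return prescription
```

Note that `prescribed_moves([[99, 98], [100, 99]])` still returns 100: the second player's refusal to join the punishment is itself a deviation, which is what makes punishing rational.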
The point of the Folk Theorems is that the Nash equilibrium notion has limited predictive power in repeated games like this, because essentially any payoff could be implemented as a similar Nash equilibrium. That do...