In this post I'll try to show a surprising link between two research topics on LW: game-theoretic cooperation between AIs (quining, Loebian cooperation, modal combat, etc) and stable self-modification of AIs (tiling agents, Loebian obstacle, etc).
When you're trying to cooperate with another AI, you need to ensure that its action will fulfill your utility function. And when doing self-modification, you also need to ensure that the successor AI will fulfill your utility function. In both cases, naive utility maximization doesn't work, because you can't fully understand another agent that's as powerful and complex as you. That's a familiar difficulty in game theory, and in self-modification it's known as the Loebian obstacle (fully understandable successors become weaker and weaker).
In general, any AI will be faced with two kinds of situations. In "single player" situations, you're faced with a choice like eating chocolate or not, where you can figure out the outcome of each action. (Most situations covered by UDT are also "single player", involving identical copies of yourself.) In "multiplayer" situations, by contrast, your action gets combined with the actions of other agents to determine the outcome. Both cooperation and self-modification are "multiplayer" situations, and are hard for the same reason. When someone proposes a self-modification to you, you might as well evaluate it with the same code that you use for game theory contests.
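To make the "same code" point concrete, here is a toy sketch, not a real proposal. It assumes an agent is represented only by its source string, and it uses the crudest quining-style check (trust exact copies of yourself). The names `trusts`, `play_prisoners_dilemma` and `accept_self_modification` are made up for illustration. A Loebian check would replace the equality test with a proof search, but the reuse would look the same.

```python
# Toy sketch: one trust check shared by game play and self-modification.
# Assumption: an agent is fully described by its source string.

MY_SOURCE = "cooperate iff other_source == my_source"  # hypothetical source text


def trusts(my_source: str, other_source: str) -> bool:
    """Crudest quining-style check: trust only exact copies of yourself."""
    return other_source == my_source


def play_prisoners_dilemma(my_source: str, opponent_source: str) -> str:
    """Game theory contest: cooperate ("C") iff the opponent passes the check."""
    return "C" if trusts(my_source, opponent_source) else "D"


def accept_self_modification(my_source: str, successor_source: str) -> bool:
    """Self-modification: accept the successor iff it passes the same check."""
    return trusts(my_source, successor_source)


if __name__ == "__main__":
    improved = MY_SOURCE + " # plus a speedup"
    print(play_prisoners_dilemma(MY_SOURCE, MY_SOURCE))
    print(accept_self_modification(MY_SOURCE, improved))
```

With this particular check, the agent cooperates only with exact copies and rejects every successor that differs from it at all, which is the same weakness showing up in both worlds.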
If I'm right, then any good theory for cooperation between AIs will also double as a theory of stable self-modification for a single AI. That means neither problem can be much easier than the other, and in particular self-modification won't be a special case of utility maximization, as some people seem to hope. But on the plus side, we need to solve one problem instead of two, so creating FAI becomes a little bit easier.
The idea came to me while working on this mathy post on IAFF, which translates some game theory ideas into the self-modification world. For example, Loebian cooperation (from the game theory world) might lead to a solution for the Loebian obstacle (from the self-modification world) - two LW ideas with the same name that people didn't think to combine before!
Yeah, bargaining between AIs is a very hard problem and we know almost nothing about it. It will probably involve all sorts of deception tactics. But in any case, using bargaining instead of war is still in both AIs' common interest, and AIs should be able to achieve their common interest.
For example, if A has hidden information that will give it an advantage in war, then B can precommit to giving A a larger share conditional on seeing that information (e.g. by constructing a successor AI that visibly includes the precommitment under A's watch). Eventually the AIs should agree on all questions of fact and disagree only on values, at which point they agree on how the war will likely go, so they skip the war and share the bigger pie according to the war's predicted outcome.
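A quick arithmetic sketch of that last step, under assumptions that are mine and not part of any real model: war destroys a fixed fraction of the pie, the winner takes whatever remains, and both AIs agree on A's probability of winning. The function names and numbers are made up for illustration.

```python
# Toy comparison of expected war payoffs vs. a peaceful split.
# Assumptions: war destroys `destroyed_fraction` of the pie, the winner takes
# all of the remainder, and both sides agree on `p_a_wins`.

def war_payoffs(p_a_wins: float, pie: float, destroyed_fraction: float):
    """Expected payoffs (A, B) if the AIs actually fight."""
    remaining = pie * (1 - destroyed_fraction)
    return p_a_wins * remaining, (1 - p_a_wins) * remaining


def bargain_payoffs(p_a_wins: float, pie: float):
    """Payoffs (A, B) if they skip the war and split the whole pie
    in proportion to the war's predicted outcome."""
    return p_a_wins * pie, (1 - p_a_wins) * pie


if __name__ == "__main__":
    war = war_payoffs(p_a_wins=0.75, pie=100.0, destroyed_fraction=0.25)
    peace = bargain_payoffs(p_a_wins=0.75, pie=100.0)
    print(war)    # (56.25, 18.75)
    print(peace)  # (75.0, 25.0)
```

As long as war destroys anything at all, both sides do strictly better under the split, so the only thing left to argue about is the value of `p_a_wins`, which is where the hidden information comes in.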
BTW, the book "On Thermonuclear War" by Kahn is exactly an attempt to predict the course of war, negotiations and bargaining between two presumably rational agents (superpowers). As I remember, it even discusses the idea of moving all resources to a new third agent, that is, donating all nukes to the UN.
How could B see that A has hidden information?
Personally, I feel like you have a mathematically correct, but idealistic and unrealistic model of relations between two perfect agents.