The subtleties in defining "I" are pushed into the subtleties of defining events X and Y with respect to Clippy and Stapley respectively. I'm not sure if that counts as avoiding it at all.
And there are other issues with utility functions that depend on an agent's impact on utilon-contributing elements. Such as, say, replacing all other agents that provide utilon-contributing elements with subagents of the barriered agent, thus making its own impact equal to the impact of all utilon-contributing agents.
This idea needs work, in other words. Not that you ever said otherwise, I just don't think the formula provided is sufficient for preventing acausal trade without incentivizing undesirable strategies. See this comment as well for my concerns on disincentivizing utility conditional upon nonexistence.
The subtleties in defining "I" are pushed into the subtleties of defining events X and Y with respect to Clippy and Stapley respectively.
Defining events seems much easier than defining identity.
Such as, say, replacing all other agents that provide utilon-contributing elements with subagents of the barriered agent, thus making its own impact equal to the impact of all utilon-contributing agents.
I believe this setup wouldn't have this problem. That's the beauty of using X rather than "non-existence" or something similar, it's "...
A putative new idea for AI control; index here.
Many of the ideas presented here require AIs to be antagonistic towards each other - or at least hypothetically antagonistic towards hypothetical other AIs. This can fail if the AIs engage in acausal trade, so it would be useful if we could prevent such things from happening.
Now, I have to admit I'm still quite confused by acausal trade, so I'll simplify it to something I understand much better, an anthropic decision problem.
Staples and paperclips, cooperation and defection
Cilppy has a utility function p, linear in paperclips, while Stapley has a utility function s, linear in staples (and both p and s are normalised to zero with one aditional item adding 1 utility). They are not causally connected, and each must choose "Cooperate" or "Defect". If they "Cooperate", they create 10 copies of the items they do not value (so Clippy creates 10 staples, Stapley creates 10 paperclips). If they choose defect, they create one copy of the item they value (so Clippy creates 1 paperclip, Stapley creates 1 staple).
Assume both agents know these facts, both agents use anthropic decision theories, and both agents are identical apart from their separate locations and distinct utility functions.
Then the outcome is easy: both agents will consider that "cooperate-cooperate" or "defect-defect" are the only two possible options, "cooperate-cooperate" gives them the best outcome, so they will both cooperate. It's a sweet story of cooperation and trust between lovers that never agree and never meet.
Breaking cooperation
How can we demolish this lovely agreement? As I often do, I will assume that there is some event X that will turn Clippy on, with P(X) ≈ 1 (hence P(¬X) << 1). Similarly there is an event Y that turns Stapley on. Since X and Y are almost certain, they should not affect the results above. If the events don't happen, the AIs will never get turned on at all.
Now I am going to modify utility p, replacing it with
p' = p - E(p|¬X).
This p with a single element subtracted off it, the expected value of p given that Clippy has not been turned on. This term feels like a constant, but isn't exactly, as we shall see. Do the same modification to utility s, using Y:
s' = s - E(s|¬Y).
Now contrast "cooperate-cooperate" and "defect-defect". If Clippy and Stapley are both cooperators, then p=s=10. However, if the (incredibly unlikely) ¬X were to happen, then Clippy would not exist, but Stapley would still cooperate (as Stapley has no way of knowing about Clippy's non-existence), and create ten paperclips. So E(p|¬X) = E(p|X) ≈ 10, and p' ≈ 0. Similarly s' ≈ 0.
If both agents are defectors, though, then p=s=1. Since each agent creates its own valuable object, E(p|¬X) = 0 (Clippy cannot create a paperclip if Clippy does not exist) and similarly E(s|¬Y)=0.
So p'=s'=1, and both agents will choose to defect.
If this is a good analogue for acausal decision making, it seems we can break that, if needed.