I'd just like to point out that the construction of this article is an exact ascension into the meta of the construction of the problem within it (except that nobody is paying you back the utility you spend on finding the highest-high metastrategy). This is suggestive of the fact that you need a generalized solution to the strategy-picking problem that picks itself. (If it didn't pick itself, it wouldn't be fully general, or it would be making worse choices than some other strategy, in which case it's insufficiently maximal.) A defined (but not necessarily finite) set of strategies which pick among themselves would also be satisfactory, provided that any one of that set of strategies converges in finite time to the maximal among that set of strategies.
Which, if you think about it, is exactly the problem to be solved to produce self-modifying FAI: finding the set of strategy-selecting strategies which select strategies only within their own set, and for which each member of that set converges on the best strategy within that set. (Although FAI doesn't necessarily have to do it in finite time.)
In an earlier post, I talked about how we could deal with variants of the Heaven and Hell problem - situations where you have an infinite number of options, and none of them is a maximum. The solution for a (deterministic) agent was to try to implement the strategy that would reach the highest possible number, without risking falling into an infinite loop.
Wei Dai pointed out that in the cases where the options are unbounded in utility (i.e. you can get arbitrarily high utility), there are probabilistic strategies that give you infinite expected utility. I suggested you could still do better than this. This started a conversation about choosing between strategies with infinite expectation (would you prefer a strategy with infinite expectation, or the same plus an extra dollar?), which went off in some interesting directions as to what needed to be done when the strategies can't sensibly be compared with each other...
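As a minimal illustration of such a strategy, assume (this is my simplification, not something fixed above) that utility is just equal to the number you name, and that you name $2^k$ with probability $2^{-k}$ for $k = 1, 2, 3, \dots$ The strategy halts with probability 1, yet:

$$
\mathbb{E}[U] = \sum_{k=1}^{\infty} 2^{-k} \cdot 2^{k} = \sum_{k=1}^{\infty} 1 = \infty, \qquad \mathbb{E}[U + 1] = \mathbb{E}[U] + 1 = \infty .
$$

So expectation alone cannot rank this strategy against the same strategy plus an extra dollar, even though the second pays more in every single outcome.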
Interesting though that may be, it's also helpful to have simple cases where you don't need all these subtleties. So here is one:
Omega approaches you and Mrs X, asking you each to name an integer to him, privately. The person who names the higher integer gets 1 utility; the other gets nothing. In practical terms, Omega will reimburse you all utility lost during the decision process (so you can take as long as you want to decide). The first person to name a number gets 1 utility immediately; they may then lose that 1 depending on the eventual response of the other. Hence if one person responds and the other doesn't, they get the 1 utility and keep it. What should you do?
In this case, a strategy that gives you a number with infinite expectation isn't enough - you have to beat Mrs X, but you also have to eventually say something. Hence there is a duel of (likely probabilistic) strategies, implemented by bounded agents, with no maximum strategy, and each agent trying to compute the maximal strategy they can construct without falling into a loop.
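The structure of the duel can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the names `payoff`, `best_response` and `geometric_power` are hypothetical, a tie is assumed to pay 0 (the setup above does not say what happens on a tie), and `None` stands for a player who never answers.

```python
import random


def payoff(mine, theirs):
    """Utility to 'me' once both strategies have finished (or failed to).

    None means the player never named a number.
    Assumption: a tie pays 0; the problem statement leaves ties unspecified.
    """
    if mine is None:
        return 0  # never answering earns nothing
    if theirs is None:
        return 1  # the only responder keeps the 1 utility
    return 1 if mine > theirs else 0


def best_response(n):
    """Beat any fixed integer n, which is why no strategy is maximal."""
    return n + 1


def geometric_power(rng):
    """Name 2**k with probability 2**-k (k = 1, 2, ...).

    Halts with probability 1, and the named number has infinite expectation.
    """
    k = 1
    while rng.random() < 0.5:  # keep going with probability 1/2
        k += 1
    return 2 ** k


if __name__ == "__main__":
    rng = random.Random(0)
    you, mrs_x = geometric_power(rng), geometric_power(rng)
    print(you, mrs_x, payoff(you, mrs_x))
```

Whatever integer one player commits to, `best_response` shows the other could have done better, and a strategy that loops forever trying to go higher returns `None` and loses to anyone who answers at all. Infinite expectation does not help by itself either: two players both running `geometric_power` are in a symmetric position.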