Consider Newcomb's problem.
Let 'general' be the claim that Omega is always right.
Let 'instance' be the claim that Omega is right about a particular prediction.
Assume you, the player, are not told the rules of the game until after Omega has made its prediction.
Consider 2 variants of Newcomb's problem.
1. Omega is a perfect predictor. In this variant, you assign a prior of 1 to P(general). You are then obligated to believe that Omega has correctly predicted your action. In this case Eliezer's conclusion is correct, and you should one-box. It's still unclear whether you have free will, and hence have any choice in what you do next, but you can't lose by one-boxing.
But you can't assign a prior of 1 to P(general), because you're a Bayesian. You derive your prior for P(general) from the (finite) empirical data. Say you begin with a prior of 0.5 before considering any observations. Then you observe all of Omega's N predictions, and each time, Omega gets it right, and you update:
P(general | instance) = P(instance | general) P(general) / P(instance)
= P(general) / P(instance), since P(instance | general) = 1
Omega would need to make an infinite number of correct predictions before you could assign a prior of 1 to P(general). So this case is theoretically impossible, and should not be considered.
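To make that concrete, here is a toy sketch of the update (mine, not part of the original argument; the 0.5 prior comes from above, but the rival hypothesis and its per-trial accuracy p_alt are assumptions chosen purely for illustration):

```python
# Toy Bayesian update: P(general) after N consecutive correct predictions.
# Assumption (mine): the only rival hypothesis is "Omega is right with
# probability p_alt on each trial, independently".

def posterior_general(n_correct, prior=0.5, p_alt=0.9):
    """P(general | n_correct consecutive successes) under this toy model."""
    like_general = 1.0             # 'general' assigns probability 1 to each success
    like_alt = p_alt ** n_correct  # the rival assigns p_alt^N to the same run
    return (like_general * prior) / (like_general * prior + like_alt * (1 - prior))

for n in (1, 10, 100, 1000):
    print(n, posterior_general(n))
# The posterior climbs toward 1 but never reaches it for any finite N.
```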
2. Omega is a "nearly perfect" predictor. You assign P(general) a value very, very close to 1. You must, however, do the math and try to compare the expected payoffs, at least in an order-of-magnitude way, and not just use verbal reasoning as if we were medieval scholastics.
The argument for two-boxing is that your action now can't affect what Omega did in the past. That is, we are using a model which includes not just P(instance | general), but also the interaction between your action, the contents of the boxes, and the claim that Omega cannot violate causality. The probability that P($1M box is empty | you one-box) = P($1M box is empty | you two-box), i.e., that the box contents are independent of your choice, is at least P(Omega cannot violate causality), and that needs to be entered into the computation.
Numerically, two-boxers claim that the high probability they assign to our understanding of causality being basically correct more than cancels out the high probability of Omega being correct.
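To put rough numbers on that trade-off, here is a toy mixture model (my own sketch, not anything from the original posts; c is the probability that ordinary causality holds and the box contents are independent of your choice, and a is Omega's accuracy in the remaining branch):

```python
# Toy two-boxer calculation: EV(one-box) minus EV(two-box) when you are
# uncertain whether Omega can correlate the box contents with your choice.

def ev_difference(c, a, big=1_000_000, small=1_000):
    """EV(one-box) - EV(two-box) under the toy mixture model."""
    # If ordinary causality holds, the contents are already fixed,
    # so two-boxing always gains the extra $1k.
    causal_branch = -small
    # Otherwise Omega's prediction tracks your actual choice with accuracy a.
    acausal_branch = a * big - ((1 - a) * big + small)
    return c * causal_branch + (1 - c) * acausal_branch

print(ev_difference(c=1 - 1e-12, a=0.99))  # about -1000: two-boxing wins
print(ev_difference(c=0.9, a=0.99))        # strongly positive: one-boxing wins
```

On this model, the two-boxer's claim is just that c is close enough to 1 that the first term dominates.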
The argument for one-boxing is that you aren't entirely sure you understand physics, but you know Omega has a really good track record--so good that it is more likely that your understanding of physics is false than that you can falsify Omega's prediction. This is a strict reliance on empirical observations as opposed to abstract reason: count up how often Omega has been right and compute a prior.
However, if we're going to be strict empiricists, we should double down on that, and set our prior on P(cannot violate causality) strictly empirically--based on all observations regarding whether or not things in the present can affect things in the past.
In principle this includes every particle interaction in the observable universe. The effective number is lower than that, since a large number of interactions in which the future affects the past could presumably occur without our noticing. But the number of observations any one person has made in which the future failed to affect the present is certainly very large. The accumulated experience of the entire human race on the issue must provide more bits in favor of the hypothesis that causality can't be violated than the comparatively paltry number of observed Omega predictions provides in favor of Omega's infallibility, unless Omega is very busy indeed. And even if Omega has somehow racked up enough predictions, most of them are as inaccessible to you as observations of the laws of causality working on the dark side of the moon. You, personally, cannot have observed Omega make more correct predictions than the number of events you have observed in which the future failed to affect the present.
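For a crude sense of the magnitudes involved, here is a rule-of-succession style comparison (my own illustration; the observation counts are made up, and the 1/(N+2) estimate is only a toy stand-in for a real model):

```python
# Crude comparison of "bits of evidence": after N consecutive observations
# with no exception, a Laplace rule-of-succession estimate for the chance of
# an exception on the next trial is roughly 1 / (N + 2).

def p_exception_next(n_observations):
    return 1.0 / (n_observations + 2)

n_causality_obs = 10**9  # hypothetical lifetime count of "future didn't affect past"
n_omega_obs = 100        # hypothetical count of Omega predictions you personally verified

print(p_exception_next(n_causality_obs))  # ~1e-9: backward causation next time
print(p_exception_next(n_omega_obs))      # ~1e-2: Omega missing its next prediction
# On this crude count, doubting Omega is far cheaper than doubting causality.
```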
You could compute a new payoff matrix that made it rational to one-box, but the ratio between the payoffs would need to be many orders of magnitude higher. You'd have to compute it in utilons rather than dollars, because the utility of dollars doesn't scale linearly. And that means you'd run into the problem that humans have some upper bound on utility--they aren't cognitively complex enough to achieve utility levels 10^10 times greater than "won $1,000". So it still might not be rational to one-box, because the utility payoff under the one box might need to be larger than you, as a human, could experience.
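Using the same toy mixture model as above, you can back out how large the big-box payoff would have to be for one-boxing to win even granting a near-certain belief in ordinary causality (again an illustrative sketch with assumed numbers; measured in utilons rather than dollars the problem only gets worse, as noted):

```python
# How big must the big-box payoff be before one-boxing wins, given
# P(ordinary causality holds) = c and Omega accuracy a in the other branch?

def required_big_prize(c, a, small=1_000):
    """Smallest big-box payoff making EV(one-box) exceed EV(two-box)."""
    # One-boxing wins iff (1 - c) * (2a - 1) * big > small.
    return small / ((1 - c) * (2 * a - 1))

print(required_big_prize(c=1 - 1e-12, a=0.99))  # ~1e15
print(required_big_prize(c=1 - 1e-20, a=0.99))  # ~1e23: far beyond any human utility scale
```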
Pre-commitment
The case in which you get to think about what to do before Omega studies you and makes its decision is more complicated, because your probability calculation then also depends on what you think you would have done before Omega made its decision. This only affects the partition of your probability calculation in which Omega can alter the past, however, so numerically it doesn't make a big difference.
The trick here is that most statements of Newcomb's are ambiguous as to whether you are told the rules before Omega studies you, and as to which decision they're asking you about when they ask if you one-box or two-box. Are they asking about what you pre-commit to, or what you eventually do? These decisions are separate, but not isolatable.
As long as we focus on the single decision at the point of action, then the analysis above (modified as just mentioned) still follows. If we ask what the player should plan to do before Omega makes its decision, then the question is just whether you have a good enough poker face to fool Omega. Here it takes no causality violation for Omega to fill the boxes in accordance with your plans, so that factor does not enter in, and you should plan to one-box.
If you are a deterministic AI, that implies that you will one-box. If you're a GOFAI built according to the old-fashioned symbolic logic AI designs talked about on LW (which, BTW, don't work), it implies you will probably one-box even if you're not deterministic, as otherwise you would need to be inconsistent, which is not allowed with GOFAI architectures. If you're a human, you'd theoretically be better off if you could suddenly see things differently when it's time to choose boxes, but that's not psychologically plausible. In no case is there a paradox, or any real difficulty to the decision to one-box.
Iterated Games
Everything changes with iterated interactions. It's useful to develop a reputation for one-boxing, because this may convince people that you will keep your word even when it seems disadvantageous to you. It's useful to convince people that you would one-box, and it's even beneficial, in certain respects, to spread the false belief in the Bayesian community that Bayesians should one-box.
Read Eliezer's post carefully, and I think you'll agree that the reasoning Eliezer gives for one-boxing is not that it is the rational solution to a one-off game--it's that it's a winning policy to be the kind of person who one-boxes. That's not an argument that the payoff matrix of an instantaneous decision favors one-boxing; it's an argument for a LessWrongian morality. It's the same basic argument as that honoring commitments is a good long-term strategy. But the way Eliezer stated it has given many people the false impression that one-boxing is actually the rational choice in an instantaneous one-shot game (and that's the only interpretation which would make it interesting).
The one-boxing argument is so appealing because it offers a solution to difficult coordination problems. It makes it appear that rational altruism and a rational utopia are within our reach.
But this is wishful thinking, not math, and I believe that the social norm of doing the math is even more important than a social norm of one-boxing.
So, at one point in my misspent youth I played with the idea of building an experimental Omega and looked into the subject in some detail.
Martin Gardner's writeup on this back in 1973 (reprinted in The Night Is Large) explained that the core idea still works if Omega can predict with just 90% accuracy.
Your choice of ONE box pays nothing if you're (incorrectly) predicted to two box, and pays $1M if you're (correctly) predicted to one box, for a total EV of $900,000 (0.1 × $0 + 0.9 × $1,000,000).
Your choice of TWO box pays $1k if you're (correctly) predicted to two box, and pays $1,001,000 if you're (incorrectly) predicted to one box, for a total EV of $101,000 (0.9 × $1,000 + 0.1 × $1,001,000 = $900 + $100,100).
So the expected profit from one boxing in a normal game, with Omega accuracy of 90%, would be $799k.
Also, by adjusting the game's payouts we could hypothetically make any amount of genuine human predictability (even just a reliable 51% accuracy) enough to motivate one boxing.
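Here is the same arithmetic as a small script, along with the accuracy threshold at which one-boxing starts to pay (the threshold line is just my rearrangement of the same numbers):

```python
# Expected values for the 90%-accurate-Omega variant, plus the accuracy
# threshold above which one-boxing has the higher EV.

def ev_one_box(accuracy, big=1_000_000):
    return accuracy * big

def ev_two_box(accuracy, big=1_000_000, small=1_000):
    return accuracy * small + (1 - accuracy) * (big + small)

print(ev_one_box(0.9))                     # 900000
print(ev_two_box(0.9))                     # 101000
print(ev_one_box(0.9) - ev_two_box(0.9))   # 799000

# One-boxing wins iff a*big > (1 - a)*big + small, i.e. a > 0.5 + small / (2*big).
threshold = 0.5 + 1_000 / (2 * 1_000_000)
print(threshold)                           # 0.5005: even ~51% accuracy suffices
```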
The super simplistic conceptual question here is the distinction between two kinds of sincerity. One kind of sincerity is assessed at the time of the promise. The other kind of sincerity is assessed retrospectively by seeing whether the promise was upheld.
Then the standard version of the game tries to drive a wedge between these concepts by supposing that an initially sincere promise might be violated by the intervention of something like "free will", and it tries to make this seem slightly more magical (more of a far-mode question?) by imagining that the promise was never even uttered, but rather stolen from the person by the magical mind-reading "Omega" entity before the person had even imagined that such a promise was possible to make.
One thing that seems clear to me is that if one boxing is profitable but not certain, then you might wish you could have done something in the past that would make it clear that you'll one box, so that you land in the part of Omega's calculations where the prediction is easy, rather than being one of the edge cases where Omega really has to work for its Brier score.
On the other hand, the setup is also (probably purposefully) quite fishy. The promise that "you made" is originally implicit, and depending on your understanding of the game maybe extremely abstract. Omega doesn't just tell you what it predicted. If you take one box and get nothing and complain, then Omega will probably try to twist it around and blame you for its failed prediction. If it all works, then you seem to be getting free money, and why is anyone handing out free money?
The whole thing just "feels like the setup for a scam". Like you one box, get a million, then in your glow of positive trust you give some money to their charitable cause. Then it turns out the charitable cause was fake. Then it turns out the million dollars was counterfeit but your donation was real. Sucker!
And yet... you know, parents actually are pretty good at knowing when their kids are telling the truth or lying. And parents really do give their kids a free lunch. And it isn't really a scam, it is just normal life as a mortal human being.
But also in the end, for someone to look their parents in the eyes and promise to be home before 10PM and really mean it for reals at the time of the promise, and then be given the car keys, and then come home at 1AM... that also happens. And wouldn't it be great to just blame that on "free will" and "the 10% of the time that Omega's predictions fail"?
Looping this back around to the larger AGI question, it seems like what we're basically hoping for is to learn how to become a flawless Omega (or at least build some software that can do this job) at least for the restricted case of an AGI that we can give the car keys without fear that after it has the car keys it will play the "free will" card and grind us all up into fuel paste after promising not to.