I've been tinkering with the idea of making a top level post on this issue, but figured it would get excessively downvoted. So I'll risk it here.
For any decision theory, isn't there some hypothetical where Omega can say, "I've analyzed your decision theory, and I'm giving you proposition X, such that if you act the way your decision theory believes is optimal, you will lose"? The "Omega scans your brain and tortures you if you're too rational" scenario would be an obvious example of this.
Designing a decision theory around any such problem seems relatively trivial. Recognizing when such a proposition is actually legitimate, on the other hand, seems virtually if not actually impossible. In other words, the evidence one would need about Omega's predictive capacity and honesty is quite staggering. Absent that evidence, you should always two-box. The counterfactual mugging is even more problematic; the odds of running into a trickster rather than an honest entity in those circumstances are probably so lopsided that, to the human mind, they may as well be infinite.
If this sense is correct, then designing an agent to be able to accommodate Newcomb's or the counterfactual mugging would actually be a reduction in its rationality. These events are so phenomenally unlikely to occur that actually executing the behaviour specified for them would almost certainly be a misfiring. The entity would be better off losing the 1/3^^^3 times when it actually encounters Newcomb's, and winning the remaining ~100% of the time.
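To make the misfiring argument concrete, here is a rough sketch of the expected-value comparison it implies. Everything in it is an assumption for illustration: the function names, the idea that a genuine case is always recognized, and the four parameters (how often a genuine Newcomb-like case occurs, what the special rule gains there, how often the rule fires on a non-genuine case, and what a misfire costs). It is not a claim about the actual numbers.

```python
def expected_net_gain(base_rate, gain_if_genuine, false_positive_rate, loss_if_misfire):
    """Expected gain per situation from carrying a special-case rule,
    relative to an otherwise identical agent without it.

    All four inputs are hypothetical; a genuine case is assumed to be
    recognized every time.
    """
    genuine = base_rate * gain_if_genuine
    misfire = (1.0 - base_rate) * false_positive_rate * loss_if_misfire
    return genuine - misfire


def break_even_base_rate(gain_if_genuine, false_positive_rate, loss_if_misfire):
    """Base rate of genuine cases at which the rule exactly pays for itself.

    Solves base_rate * gain == (1 - base_rate) * false_positive_rate * loss.
    """
    expected_misfire_loss = false_positive_rate * loss_if_misfire
    return expected_misfire_loss / (gain_if_genuine + expected_misfire_loss)


# Made-up numbers: genuine cases are rare, misfires are occasional.
print(expected_net_gain(1e-9, 1_000_000, 0.01, 1_000))
print(break_even_base_rate(1_000_000, 0.01, 1_000))
```

On this toy model the rule is worth having only when genuine cases are more common than the break-even rate, which is the shape of the claim above: if the base rate is anywhere near 1/3^^^3, almost any nonzero misfire rate swamps the gain.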
In other words, for want of a better term, much of the discussion of decision theory seems masturbatory. You have an existing system. Someone thinks of how to create a problem for your existing system. Someone solves said problem. Someone thinks of a new problem. Repeat ad infinitum. The marginal cases of something like Newcomb's so thoroughly lack any practical consequence as to be wholly irrelevant for any actual entity that needs to make decisions.
I'm entirely open to the idea that I'm wrong and that Newcomblike problems occur, or that maybe there is some uber-decision theory that can never be broken by Omega. But if neither of those conditions is satisfied, this seems like something of a waste of mental effort. Of course, if it's fun to discuss despite being essentially useless, that's cool. It's just best not to pretend otherwise.
I think of Omega as a simplified stand-in for other people.
The part about Omega being omniscient and knowably trustworthy isn't solved. But I think the problem of Omega rewarding bizarre irrational behaviour on your part mostly goes away if you assume it's fairly human-like, perhaps following UDT or some other decision theory itself. The human motivation for it posing Newcomb's problem could be that it wants one of the boxes kept closed for some reason, and will reward you for keeping it closed. To make it fit this explanation, Omega should say it doesn't ...
A common background assumption on LW seems to be that it's rational to act in accordance with the dispositions one would wish to have. (Rationalists must WIN, and all that.)
E.g., Eliezer:
And more recently, from AdamBell:
Within academic philosophy, this is the position advocated by David Gauthier. Derek Parfit has constructed some compelling counterarguments against Gauthier, so I thought I'd share them here to see what the rest of you think.
First, let's note that there definitely are possible cases where it would be "beneficial to be irrational". For example, suppose an evil demon ('Omega') will scan your brain, assess your rational capacities, and torture you iff you surpass some minimal baseline of rationality. In that case, it would very much be in your interests to fall below the baseline! Or suppose you're rewarded every time you honestly believe the conclusion of some fallacious reasoning. We can easily multiply cases here. What's important for now is just to acknowledge this phenomenon of 'beneficial irrationality' as a genuine possibility.
This possibility poses a problem for the Eliezer-Gauthier methodology. (Quoting Eliezer again:)
The problem, obviously, is that it's possible for irrational agents to receive externally-generated rewards for their dispositions, without this necessarily making their downstream actions any more 'reasonable'. (At this point, you should notice the conflation of 'disposition' and 'choice' in the first quote from Eliezer. Rachel does not envy Irene her choice at all. What she wishes is to have the one-boxer's dispositions, so that the predictor puts a million in the first box, and then to confound all expectations by unpredictably choosing both boxes and reaping the most riches possible.)
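The gap between disposition and choice can be laid out as a small payoff table. This is only a sketch: the million in the first box comes from the text above, while the $1,000 in the second box is the figure usually assumed in statements of Newcomb's problem, and the function name is made up for illustration.

```python
OPAQUE_PRIZE = 1_000_000  # filled only if the predictor expects one-boxing
VISIBLE_PRIZE = 1_000     # assumed contents of the second box (standard figure)


def newcomb_payoff(predicted_one_box, takes_both):
    """Payoff given what the predictor expected and what the agent actually does."""
    opaque = OPAQUE_PRIZE if predicted_one_box else 0
    visible = VISIBLE_PRIZE if takes_both else 0
    return opaque + visible


# Rows: what the predictor expected (a fact about disposition).
# Columns: what the agent actually chooses.
for predicted_one_box in (True, False):
    for takes_both in (False, True):
        print(predicted_one_box, takes_both, newcomb_payoff(predicted_one_box, takes_both))
```

Holding the prediction fixed, taking both boxes always pays more; holding the choice fixed, being predicted to one-box always pays more. The best cell is the one Rachel wishes for: predicted to one-box, yet taking both.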
To illustrate, consider (a variation on) Parfit's story of the threat-fulfiller and threat-ignorer. Tom has a transparent disposition to fulfill his threats, no matter the cost to himself. So he straps on a bomb, walks up to his neighbour Joe, and threatens to blow them both up unless Joe shines his shoes. Seeing that Tom means business, Joe sensibly gets to work. Not wanting to repeat the experience, Joe later goes and pops a pill to acquire a transparent disposition to ignore threats, no matter the cost to himself. The next day, Tom sees that Joe is now a threat-ignorer, and so leaves him alone.
So far, so good. It seems this threat-ignoring disposition was a great one for Joe to acquire. Until one day... Tom slips up. Due to an unexpected mental glitch, he threatens Joe again. Joe follows his disposition and ignores the threat. BOOM.
Here Joe's final decision seems as disastrously foolish as Tom's slip-up. It was good to have the disposition to ignore threats, but that doesn't necessarily make it a good idea to act on it. We need to distinguish the desirability of a disposition to X from the rationality of choosing to do X.
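One way to see the distinction is to score the disposition and the act separately. The sketch below is a deliberately crude model of the Tom and Joe story, and all of it is assumed for illustration: the function names, the numeric costs, the glitch probability, and the simplification that a compliant Joe is threatened every day while a threat-ignoring Joe is threatened only when Tom glitches.

```python
def disposition_value(ignores_threats, glitch_prob, shoe_cost, bomb_cost):
    """Joe's expected daily payoff from *having* a disposition, before any threat is made."""
    if ignores_threats:
        # Tom only threatens by mistake; when he does, Joe ignores it and the bomb goes off.
        return -glitch_prob * bomb_cost
    # A compliant Joe is threatened every day and shines the shoes.
    return -shoe_cost


def act_value(ignore, shoe_cost, bomb_cost):
    """Joe's payoff from the *act*, once a threat has actually been made."""
    return -bomb_cost if ignore else -shoe_cost


# Hypothetical numbers: glitches are rare, the bomb is far worse than shoe-shining.
glitch_prob, shoe_cost, bomb_cost = 0.001, 1.0, 100.0

print(disposition_value(True, glitch_prob, shoe_cost, bomb_cost),
      disposition_value(False, glitch_prob, shoe_cost, bomb_cost))
print(act_value(True, shoe_cost, bomb_cost),
      act_value(False, shoe_cost, bomb_cost))
```

With these made-up numbers the threat-ignoring disposition scores better than the compliant one, yet once the threat has been made, the act of ignoring it scores far worse than complying. The two evaluations come apart whenever the glitch probability times the bomb cost is smaller than the shoe cost.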