This is a thought that occurred to me on my way to classes today; I'm sharing it for feedback.
Omega appears before you and, after presenting an arbitrary proof that it is, in fact, a completely trustworthy superintelligence of the caliber needed to play these kinds of games, offers you a choice between two boxes. These boxes do not contain money; they contain information. One box is white and contains a true fact that you do not currently know; the other is black and contains false information that you do not currently believe. Omega advises you that the true fact is not misleading in any way (i.e., not a fact that will cause you to make incorrect assumptions and lower the accuracy of your probability estimates), and is fully supported with enough evidence both to prove to you that it is true and to enable you to independently verify its truth for yourself within a month. The false information is demonstrably false, and is something that you would disbelieve if presented outright; but if you open the box to discover it, a machine inside the box will reprogram your mind so that you believe it completely, thus leading you to believe other related falsehoods as you rationalize away the discrepancies.
Omega further advises that, within those constraints, the true fact is one that has been optimized to inflict upon you the maximum amount of long-term disutility for a fact in its class, should you now become aware of it, and the false information has been optimized to provide you with the maximum amount of long-term utility for a belief in its class, should you now begin to believe it over the truth. You are required to choose one of the boxes; if you refuse to do so, Omega will kill you outright and try again on another Everett branch. Which box do you choose, and why?
(This example is obviously hypothetical, but for a simple and practical case, consider the use of amnesia-inducing drugs to selectively eliminate traumatic memories. It would be more accurate to retain those memories and take the time and effort to come to terms with the trauma... but there might be much greater utility in being without them, and thus without the trauma altogether. This is obviously related to the valley of bad rationality, but since there clearly exist maximally helpful lies and maximally harmful truths, it would be useful to know which categories of facts are generally hazardous, and whether or not there are categories of lies which are generally helpful.)
That's why the problem specified 'long-term' utility. Omega is essentially saying 'I have here a lie that will improve your life as much as any lie possibly can, and a truth that will ruin your life as badly as any truth can; which would you prefer to believe?'
Yes, believing a lie does imply that your map has gotten worse, and rationalizing your belief in the lie (which we're all prone to do with things we believe) will make it worse still. Omega has specified that this lie has optimal utility among all lies that you, personally, might believe; being Omega, it is as correct in saying this as it is possible to be.
On the other hand, the box containing the least optimal truth is a very scary box. Presume first that you are particularly strong emotionally and psychologically, so there is no fact that will directly drive you to suicide. Even so, there are probably facts out there that would, if comprehended and internalized, corrupt your utility function, leading you to work directly against everything you currently believe in. There's probably something even worse than that out there in the space of all possible facts, but the disutility is measured against your utility function as it was when Omega first encountered you, so 'you change your ethical beliefs, and proceed to spend your life working to spread disutility, as you formerly defined it' is on the list of possibilities.
Interesting idea. That would imply that there is a fact out there that, once known, would change my ethical beliefs, which I take to be a large part of my utility function, AND would do so in such a way that afterward, I would assent to acting on the new utility function.
But one of the things that Me(now) values is updating my beliefs based on information. If there is a fact that shows that my utility function is misconstrued, I want to know it. I don't expect such a fact to surface, but I don't have a problem imagining such a fact existing. I've actually ...