I think you point up the problem with your own suggestion - we have to have examples of rationality failure to discuss, and if we choose an example on which we agree less (eg something to do with AGW) then we will end up discussing the example instead of what it is intended to illustrate. We keep coming back to religion not just because practically every failure of rationality there is has a religious example, but because it's something we agree on.
I think there are other examples with just as much agreement on their wrongness, many of which have a much lower degree of investment even for their believers. Astrology for instance has many believers, but they tend to be fairly weak beliefs, and don't produce such a defensive reaction when criticized. Lots of other superstitions also exist, so sadly I don't think we'll run out of examples any time soon.
By adopting and sticking to such a strategy, I will on average come out ahead in a wide variety of Newcomblike situations.
Definitely.
I believe I am now in fact hacked to hardcode the future-irrational decision if it does occur.
Here lies my problem. I would like to adopt such a strategy (or a better one if any exists), and not alter my strategy when I actually encounter a Newcomblike situation. Now in the original Newcomb problem, I have no reason to do so: if I alter my strategy so as to two-box, then I will end up with less money (although I would have difficulties proving this in the formalism I use in the article). But in the mugging problem, altering my strategy to "keep $100 in this instance only" will, in an (Omega appears, coin is tails) state, net me more money. Therefore I believe that keeping to my strategy must have intrinsic value to me, greater than that of the $100 I would lose, in order for me to keep it.
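To make the tension concrete, here is a minimal sketch of the two viewpoints. It assumes the usual statement of the mugging (a fair coin, a $10,000 prize on heads paid only to someone who would hand over $100 on tails, and a perfect predictor); the function names and the $10,000 figure are my assumptions for illustration, not anything fixed by the argument above.

```python
def policy_value(pays_on_tails: bool, prize: int = 10_000, cost: int = 100,
                 p_heads: float = 0.5) -> float:
    """Expected winnings of a policy, judged before the coin is flipped.

    Assumes a perfect predictor: the prize is paid on heads only to an
    agent who would have paid on tails.
    """
    heads_payoff = prize if pays_on_tails else 0
    tails_payoff = -cost if pays_on_tails else 0
    return p_heads * heads_payoff + (1 - p_heads) * tails_payoff


def tails_value(pays_on_tails: bool, cost: int = 100) -> int:
    """Winnings once you already know the coin came up tails."""
    return -cost if pays_on_tails else 0


if __name__ == "__main__":
    for pays in (True, False):
        print(pays, policy_value(pays), tails_value(pays))
```

Judged in advance, the paying policy is ahead; judged from inside the tails state, refusing is ahead by exactly the $100. That gap is the amount of intrinsic value that keeping the strategy would have to carry.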
Now I can answer your question about how the MAD brain-hack and the mugging brain-hack are related. In the MAD situation, the institution's actions are "hardcoded" to occur. In the case of the mugging brain-hack, this would count as, say, wiring a device to one's brain that takes over in Omega situations. This may well be possible in some situations, but I wanted to deal with the harder problem of how to fashion the brain that, on learning it is in a "tails" state, does not then want to remove such a hack.
Now if I expect to be faced with many Omega mugging problems in the future, then a glimmer of hope appears; although "keep $100 in this instance only" may then seem to be an improved strategy, I know that this conclusion must in fact be incorrect, as whatever process I use to arrive at it is, if allowed to operate, highly likely to lose money for me in the future. In other words, this makes the problem more similar to Newcomb's problem: in the states of the world in which I make the modification, I lose money <-> in the states of the world in which I two-box, I make less money. But the problem as posed involves an Omega turning up and convincing you that this problem is the last Newcomblike problem you will ever face.
ETA: In case it wasn't clear, if I assign intrinsic value > keeping $100 to keeping my strategy, then I will surely keep my strategy. My question is: in the case of Omega appearing and my becoming convinced that I am facing my last ever Newcomblike problem, will keeping my strategy still have intrinsic value to me?
It all depends on how the hack is administered. If future-me does think rationally, he will indeed come to the conclusion that he should not pay. Any brain-hack that will actually be successful must then be tied to a superseding rational decision or to something other than rationality. If not tied to rationality, it needs to be a hardcoded response, immediately implemented, rather than one that is thought about.
There are obvious ways to set up a superseding condition: put $101 in escrow, or hire an assassin to kill you if you renege. But the cost of doing this now is far higher than is justified by the probability of the situation, so we need something completely free. One option is to tie it to something internally valued: e.g., you value your given word or self-honesty sufficiently that living with yourself after compromising it is worse than a negative $100 utility. (This only scales to the point where you value integrity, however: you may be able to live with yourself better after finding you're self-deluding than after murdering 15 people to prove a point.)
Had we access to our own source code, and the capacity for self-modification, we could put in a hardcoded path for when this decision arises. Currently we have to work with the hardware we have, but I believe our brains do have mechanisms for tying future decisions to then-irrational decisions. Making credible threats requires us to back up what we say, even to someone whom we will never encounter again afterwards, so similar situations (without the absolute predictive ability) are quite common in life. I know in the past I have acted perversely against my own self-interest to satisfy a past decision / issued threat. In most cases this should be considered irrationality to be removed from myself, but I think I can reuse the same mechanism to achieve an improvement here.
Obviously I can only guess whether this will in fact work in practice. I believe it will for the $100 case, but suspect that with some of the raised stakes examples given (committing murder etc), my future self may wiggle out of the emotional trap I've set for him. This is a flaw with my brain-hacking methods however - hardcoding would still be the right thing to do if possible, if the payoff were one that I would willingly trade the cost for.
Isn't that exactly the same as the desired effect of your brain-hack in the mugging situation? Instead of removing the ability to not retaliate, we want to remove the ability to not pay... the mugging situation finds you directly in the loss case ... potentially before you'd even heard of Omega.
OK, so to clarify, the problem you're considering is the one where, with no preparation on your part, Omega appears and announces tails?
EDIT: Oops. Clearly you don't mean that. Do you want me to imagine a general hack we can make that increases our expected utility conditional on Omega appearing, but that we can profitably make even without having proof or prior evidence of Omega's existence?
EDIT 2: I do want to answer your question "Isn't that exactly the same as the desired effect of your brain-hack in the mugging situation?", but I'd rather wait on your reply to mine before I formulate it.
Yes, exactly. I think this post by MBlume gives the best description of the most general such hack needed:
If there is an action to which my past self would have precommited, given perfect knowledge, and my current preferences, I will take that action.
By adopting and sticking to such a strategy, I will on average come out ahead in a wide variety of Newcomblike situations. Obviously the actual benefit of such a hack is marginal, given the unlikeliness of an Omega-like being appearing, and me believing it. Since I've already invested the effort through considering the optimal route for the thought experiment though, I believe I am now in fact hacked to hardcode the future-irrational decision if it does occur.
Okay, I agree that this level of precommitting is not necessary. But if the deal is really a one-time offer, then, when presented with the case of the coin already having come up tails, you can no longer ever benefit from being the sort of person who would precommit. Since you will never again be presented with a Newcomb-like scenario, you will have no benefit from being the precommitting type. Therefore you shouldn't give the $100.
If, on the other hand, you still expect that you can encounter some other Omega-like thing which will present you with such a scenario, doesn't this make the deal repeatable, which is not how the question was formulated?
If you think that through and decide that way, then your precommitting method didn't work. The idea is that you must somehow now prevent your future self from behaving rationally in that situation - if they do, they will perform exactly the thought process you describe. The method of doing so, whether making a public promise (and valuing your spoken word more than $100), hiring a hitman to kill you if you renege or just having the capability of reliably convincing yourself to do so (effectively valuing keeping faith with your self-promise more than $100) doesn't matter so long as it is effective. If merely deciding now is effective, then that is all that's needed.
If you do then decide to take the rational course in the losing coinflip case, it just means you were wrong by definition about your commitment being effective. Luckily in this one case, you found it out in the loss case rather than the win case. Had you won the coin flip, you would have found yourself with nothing though.
If I can take this back to the "agents maximising their utility" interpretation: this is then a genuine example of a brain hack, the brain in this case being the institutional decision structure of a Cold War government (let's say the Soviets). Having decided that only by massively retaliating in the possible world where America has attacked is there a win, and having realised that as currently constituted the institution would not retaliate under those circumstances, the institution modified itself so that it would retaliate under those circumstances. I find it interesting that it would have to use irrational agents (the retaliatory officers) as part of its decision structure in order to achieve this.
This points to another difference between Omega mugging and MAD: whereas in the former, it's assumed you have the chance to modify yourself in between Omega appearing and your making the decision, in the MAD case, it is deliberately arranged that retaliation is immediate and automatic (corresponding to removing the ability not to retaliate from the Soviet command structure).
Yes - it is effectively the organisational level of such a brain hack (though it would be advantageous if the officers were performing such a hack on their own brains, rather than being irrational in general - rationality in other situations is a valuable property in those with their fingers on the button.)
In the MAD case, it is deliberately arranged that retaliation is immediate and automatic
Isn't that exactly the same as the desired effect of your brain-hack in the mugging situation? Instead of removing the ability to not retaliate, we want to remove the ability to not pay. The methods differ (selecting pre-hacked / appropriately damaged brains to make the decisions, versus hacking our own), but the outcome seems directly analogous. Nor is there any further warning: the mugging situation finds you directly in the loss case (as you'd presumably be directly in the win case if the coin flip went differently) potentially before you'd even heard of Omega. Any brain-hacking must occur before the situation comes up unless you're already someone who would pay.
Many countries involved in protracted disputes do commit atrocities against third parties every now and again; perhaps not for this reason, though.
The problem is that "generally rational, reasonable people" will generally remain so even if attacked. It's much easier to convince an enemy that you are irrational, to some extent. If you can hide your level of rationality, then in a game like MAD you increase your expected score and reduce your opponent's by reducing the information available to them.
One difference between MAD and the Omega mugging is that Omega is defined so as to make any such concealment useless.
ETA: This (short and very good) paper by Yamin Htun discusses the kind of irrationality I mean. Quote:
the rational players disguise themselves as irrational; they make others believe they are altruistic, thus forcing others to play cooperatively.
Substitute "anti-altruistic" for "altruistic" and this is what I was aiming at.
But that fooling can only go so far. The better your opponent is at testing your irrational mask, the higher the risk of them spotting a bluff, and thus the closer the gap between acting irrational and being irrational. Only by being irrational can you be sure they won't spot the lie.
Beyond a certain payoff ratio, the risk from being caught out lying is bigger than the chance of having to carry through. For that reason, you end up appointing officers who will actually carry through, even to the point of blind-testing them with simulated tests and removing those who don't fire in such positions (even if it was the right choice), and letting your opponent know and verify this as much as possible.
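As a rough illustration of that trade-off, here is a toy model. It is entirely my own simplification: the parameter names, the idea that a detected bluff always invites an attack, and the idea that a genuinely committed side is still attacked with some small probability are all assumptions made for the sketch.

```python
def expected_loss_bluff(p_caught: float, attack_loss: float) -> float:
    """Bluffer: attacked only if the bluff is detected, and never retaliates."""
    return p_caught * attack_loss


def expected_loss_commit(p_attack_anyway: float, attack_loss: float,
                         carry_through_cost: float) -> float:
    """Genuinely committed: if attacked regardless, also bears the cost of retaliating."""
    return p_attack_anyway * (attack_loss + carry_through_cost)


def commitment_is_better(p_caught: float, p_attack_anyway: float,
                         attack_loss: float, carry_through_cost: float) -> bool:
    """True when real commitment has the lower expected loss in this toy model."""
    committed = expected_loss_commit(p_attack_anyway, attack_loss, carry_through_cost)
    bluffing = expected_loss_bluff(p_caught, attack_loss)
    return committed < bluffing


if __name__ == "__main__":
    # Hypothetical numbers: a bluff that is likely to be spotted versus a
    # commitment that is rarely tested.
    print(commitment_is_better(p_caught=0.2, p_attack_anyway=0.01,
                               attack_loss=100.0, carry_through_cost=50.0))
```

In this sketch, the better the opponent is at spotting the bluff (the higher `p_caught`), the more attractive real commitment becomes, even though carrying through is itself costly.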
It follows that you should convince an enemy you actually find killing innocent civilians pleasurable, and are looking for an excuse to do so.
That would seem to be a very easy thing for them to test. Unless we keep committing atrocities every now and again to fool them, they're going to work out that it's false. Even if they do believe us (or it's true), that would itself be a good argument why our leaders would want to start the war - leading to the conclusion that they should do so to get the first strike advantage, maximising their chances.
It would seem better to convince them in some way that doesn't require us to pay such a cost if possible: and to convince the enemy that we're generally rational, reasonable people except in such circumstances where they attack us.
I am admittedly amazed that so much intellectual energy is devoted to a question that is not only an extremely improbable hypothetical but one that has absolutely no implications for our daily lives or the rest of our endeavors.
A question I wish to ask you all: why are you thinking about this subject?
I don't think that's true. I mentioned one real-world case that is very close to the hypothesised game in the other post: the Mutually Assured Destruction policy, or ultimatums in general.
First note that Omega's perfection as a predictor is not necessary. With an appropriate payoff matrix, even a 50.1% accurate Omega doesn't change the optimal strategy. (One proviso is that the method of prediction must be non-spoofable. For example, I could perhaps play Omega with a 90% success rate, but knowing that I don't have access to brain-scanning abilities, you could probably conclude that I'm using more mundane means (like reading the responses people give on blog posts about Newcomb's paradox) and so would be able to fool me. This might not hurt my percentage much if I predict that people smart enough to do this will two-box, but it does change the optimal strategy, because you now know you've already lost no matter what.)
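The 50.1% figure can be checked with a short calculation. This sketch assumes the conventional Newcomb payoffs ($1,000,000 in the opaque box, $1,000 in the transparent one) and treats the predictor's accuracy as the probability that the box contents match your choice; both are assumptions on my part, since the comment above only says "an appropriate payoff matrix".

```python
from fractions import Fraction


def one_box_ev(accuracy: Fraction, big: int) -> Fraction:
    """Expected payoff of one-boxing: the big prize is there if predicted correctly."""
    return accuracy * big


def two_box_ev(accuracy: Fraction, big: int, small: int) -> Fraction:
    """Expected payoff of two-boxing: the big prize is there only if mispredicted."""
    return (1 - accuracy) * big + small


def break_even_accuracy(big: int, small: int) -> Fraction:
    """Accuracy at which both choices have equal expected payoff.

    Solves p * big = (1 - p) * big + small for p.
    """
    return Fraction(1, 2) + Fraction(small, 2 * big)


if __name__ == "__main__":
    big, small = 1_000_000, 1_000
    p = Fraction(501, 1000)  # a 50.1% accurate predictor
    print(break_even_accuracy(big, small))
    print(one_box_ev(p, big), two_box_ev(p, big, small))
```

With these payoffs the break-even accuracy is 50.05%, so a 50.1% accurate predictor is already enough for one-boxing to come out ahead on this calculation. A less lopsided payoff matrix would push the threshold higher.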
With MAD, the situation is similar:
In the event that the enemy launches a nuclear attack, it is irrational (in a life-valuing sense) to destroy millions of innocent civilians when it won't help you. This corresponds to the "pay $100 when the coin comes up tails" case.
Prior to war, it is advantageous for you if the enemy predicts that you would destroy the world. If they believe that, then a first attack is a net loss for them, so they don't destroy your half of the world (the "win $10000" case).
The object then is to convince the enemy that you would make the irrational decision in the loss case. But you must assume an intelligent enemy, with access to whatever technology or situations that might possibly be developed / occur in the future (truth drugs? Kidnapping and fooling a decision maker under a controlled environment and seeing what they do? something we haven't even thought of?) The only way to be sure that no test will reveal you're bluffing is not to bluff.
Why then would it violate free will if Omega could predict your accuracy in this different situation (one where he's also able to predict the effects of him telling you) to a similar precision?
Because that's pretty much our intuitive definition of free will; that it is not possible for someone to predict your actions, announce it publicly, and still be correct. If you disagree, we are disagreeing about the intuitive definition of "free will" that most people carry around in their heads. At least admit that most people would be unsurprised if a person predicted that they would (e.g.) brush their teeth in the morning (without telling them in advance that it had predicted that), versus predicting that they would knock a vase over, and then as a result of that prediction, the vase actually getting knocked over.
Then take my bet situation. I announce your attendance, and cut you in with a $25 stake in attendance. I don't think it would be unusual to find someone who would indeed appear 99.99% of the time - does that mean that person has no free will?
People are highly, though not perfectly, predictable under a large number of situations. Revealing knowledge about the prediction complicates things by adding feedback to the system, but there are lots of cases where it still doesn't change matters much (or even increases predictability). There are obviously some situations where this doesn't happen, but for Newcomb's paradox, all that is needed is a predictor for the particular situation described, not any general situation. (In fact Newcomb's paradox is equally broken by a similar revelation of knowledge. If Omega were to reveal its prediction before the boxes are chosen, a person determined to do the opposite of that prediction opens it up to a simple Epimenides paradox.)
Rationality can be life and death, but that applies to collective and institutional decisions just as much as to our individual ones. Arguably more so: the decisions made by governments, cultures and large institutions have far larger effects than any decision I'll ever make. Investment in improving my individual rationality is more valuable purely due to self-interest - we may invest more in providing a 1% improvement to our own lives than we do in reducing collective decision-making mistakes that cost thousands of lives a year. But survival isn't the only goal we have! Even if it were, there are good reasons to put more emphasis on collective rational decision-making - the decisions of others can also affect us.