Fun fact: A fellow Rationalist and I were doing Rejection Therapy. My friend chose to do Pascal's Mugging (the positive version - if you give me $5 now, a package of $50,000 will appear at your doorstep tomorrow morning).
The subject came extremely close to actually giving him the $5, even though the subject only had five dollars and needed it to get home. (My friend added that if he waited at a particular intersection, a cab would arrive in five minutes and take him home for free.) He only stopped when I burst out laughing. (It took maybe a 5-10 minute conversation to build up to that point.)
We talked to him about it afterwards to ask about his motivations. He said the logic made sense to him and my friend did a good job of maintaining the persona.
I don't know if anyone knows exactly what Bob is doing, but at a stab, he's seeing how many unpleasant feelings get generated by imagining the crime, then proposing a jail sentence that activates about an equal amount of unpleasant feelings. If the thought of a homeless man makes images of crime more readily available and so increases the unpleasant feelings, things won't go well for the homeless man.
To defend poor Bob for a moment, it's worth noting that we don't respond to numbers well in a vacuum. A theft involving a hedge fund manager invokes a frame in which a million dollars isn't that much. A theft involving a homeless person invokes a frame in which a thousand dollars is a lot. I suspect that this magnitude distortion explains more of Bob's behavior than general negative affect towards homeless people.
ETA: Both to mitigate that annoying LW effect where the top-voted comment on an excellent article is always a correction or quibble, and just because it's plain true, I should add that I'm thoroughly enjoying this sequence, and that my rate-of-checking-LW has risen sharply over the last couple weeks since I've been looking for another installment.
I have not yet accepted that consistency is always the best course in every situation. For example, in Pascal's Mugging, a random person threatens to take away a zillion units of utility if you don't pay them $5. The probability they can make good on their threat is minuscule, but by multiplying out by the size of the threat, it still ought to motivate you to give the money. Some belief has to give - the belief that multiplication works, the belief that I shouldn't pay the money, or the belief that I should be consistent all the time - and right now, consistency seems like the weakest link in the chain.
No, no, no!
There are an infinite number of possible Pascal's muggings, but people only look at them one at a time. Why don't you keep the $5 in case you need it for the next Pascal's mugger who offers you 2^zillion units of utility? That is a much better bet if you only look at those two possible muggings.
The real problem is that utility functions, as we calculate them now, do not converge. This is a reason to be confused, not a reason to bite such ridiculous bullets.
My tendency is to assume that the homeless man would steal the $1000 via violent means, whereas the hedge fund manager would steal the $1 million using nonviolent deception. In addition to a belief that violent crime is actually worse, there is also the bias that it is easier to visualize. A homeless man stealing $1000 looks like a man pointing a gun at a cashier. A hedge fund manager stealing $1 million looks like a guy at a computer with a spreadsheet open.
Of course, I work at a hedge fund right now, so I have additional biases.
I have not yet accepted that consistency is always the best course in every situation. For example, in Pascal's Mugging, a random person threatens to take away a zillion units of utility if you don't pay them $5. The probability they can make good on their threat is minuscule, but by multiplying out by the size of the threat, it still ought to motivate you to give the money. Some belief has to give - the belief that multiplication works, the belief that I shouldn't pay the money, or the belief that I should be consistent all the time - and right now, consistency seems like the weakest link in the chain.
Not upvoted, for this paragraph. You can't become right by removing beliefs at random until the remaining belief pool is consistent, but if you're right then you must be consistent.
Why does some belief have to give, if you reject consistency? If you're going to be inconsistent, why not inconsistently be consistent as well?
Also, you are attempting to be humorous by including beliefs like "multiplication works", but not beliefs like "at the 3^^^3rd murder, I'm still horrified" or "Solomonoff induction works", right?
We are but humble bounded rationalists, who have to use heuristic soup, so we might have to be inconsistent at times. But to say that even after careful recomputation on perfectly formalized toy problems, we don't have to be consistent? Oh, come on!
There are several good reasons why I should not assign a probability of 66% to heads and 66% to tails, but one of the clearest is this: you can make me a bet that I will give you $2 if it lands on tails and you give me $1 if it lands on heads, and then a second bet where I give you $2 if it lands on heads and you give me $1 if it lands on tails.
Got it.
Whichever way the coin lands, I owe you $1 and you owe me $2 - I have gained a free dollar.
Huh? You swapped "you" for "I" here (compared to above).
I'm still confused about what point Yvain might be making by substituting "tendency" for "intuition" in this formulation of reflective equilibrium. I can think of two possibilities, but neither of them seems like something he might endorse.
I don't think that Pascal's Mugging puts pressure on Bayesianism, I think it puts pressure on Solomonoff-type priors - Robin's anthropic answer is the one I currently find most appealing. The Lifespan Dilemma puts a lot more pressure on EU, in my book.
Valuing consistency is silly. If someone suggests putting one thief in jail and letting another go free, you won't object because it's inconsistent. You'll either object because you don't think thieves should go to jail, or because you don't think they should go free. Inconsistency just makes it easier to give a reason why it's wrong. You don't need to know whether or not a given person thinks thieves should be jailed to convince them that that isn't the best thing to do.
If you don't accept Pascal's mugging, you have to have some reason for it. The same goe...
by multiplying out by the size of the threat, it still ought to motivate you to give the money. Some belief has to give - the belief that multiplication works, the belief that I shouldn't pay the money, or the belief that I should be consistent all the time - and right now, consistency seems like the weakest link in the chain.
What gives is the belief that by multiplying out by the size of the threat, it still ought to motivate me to give the money. Multiplication works, I shouldn't pay the money, and I should be consistent.
I don't know if anyone knows exactly what Bob is doing, but at a stab, he's seeing how many unpleasant feelings get generated by imagining the crime, then proposing a jail sentence that activates about an equal amount of unpleasant feelings.
See the outrage heuristic, Kahneman & Frederick (2002) (pdf).
Any tendency that has reached reflective equilibrium in your current state is about as close to a preference as you're going to get.
But if you know your destination, you're already there. In principle, there is no need to wait for a tendency to manifest, or even to require that the conditions making the tendency manifest ever hold, if you know the way it'd go (not that you should just step back and watch). There are also one-off decisions that require knowing what to do this one time, where the intuition about reflective equilibrium applies less, and it...
...except that the Dutch book itself assumes consistency. If I believe that there is a 66% chance of it landing on heads, but refuse to take a bet at 2:1 odds - or even at 1.5:1 odds even though I should think it's easy money! - then I can't be Dutch booked. I am literally too stupid to be tricked effectively. You would think this wouldn't happen too often, since people would need to construct an accurate mental model to know when they should refuse such a bet, and such an accurate model would tell them they should revise their probabilities - but time af
For example, in Pascal's Mugging, a random person threatens to take away a zillion units of utility if you don't pay them $5. The probability they can make good on their threat is minuscule, but by multiplying out by the size of the threat, it still ought to motivate you to give the money.
Why? Hasn't this been gone over before? Tiny number * big number = not determined by the words "tiny" and "big".
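The parent's point can be made concrete with a toy calculation (numbers entirely my own, chosen purely for illustration):

```python
# Toy illustration (invented numbers): the product of a "tiny" probability
# and a "big" payoff is not determined by the adjectives alone.
ev_negligible = 1e-30 * 1e9   # tiny p, big payoff  -> expected value 1e-21, ignorable
ev_enormous   = 1e-9 * 1e30   # tiny p, huge payoff -> expected value 1e21, dominant
print(ev_negligible, ev_enormous)
```

Everything hinges on how fast the prior probability shrinks relative to how fast the promised payoff grows - which is exactly the part the words "tiny" and "big" leave unspecified.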
Consider a case, not too different from what has been shown to happen in reality, where we ask Bob what sounds like a fair punishment for a homeless man who steals $1,000, and he answers ten years. Suppose we wait until Bob has forgotten that we ever asked the first question, and then ask him what sounds like a fair punishment for a hedge fund manager who steals $1,000,000, and he says five years. Maybe we even wait until he forgets the whole affair, and then ask him the same questions again with the same answers, confirming that these are stable preferences.
If we now confront Bob with both numbers together, informing him that he supported a ten year sentence for stealing $1,000 and a five year sentence for stealing $1,000,000, a couple of things might happen. He could say "Yeah, I genuinely believe poor people deserve greater penalties than rich people." But more likely he says "Oh, I guess I was prejudiced." Then if we ask him the same question again, he comes up with two numbers that follow the expected mathematical relationship and punish the greater theft with more jail time.
Bob isn't working off of some predefined algorithm for determining punishment, like "jail time = (10 * amount stolen)/net worth". I don't know if anyone knows exactly what Bob is doing, but at a stab, he's seeing how many unpleasant feelings get generated by imagining the crime, then proposing a jail sentence that activates about an equal amount of unpleasant feelings. If the thought of a homeless man makes images of crime more readily available and so increases the unpleasant feelings, things won't go well for the homeless man. If you're really hungry, that probably won't help either.
So just as nothing automatically synchronizes the intention to study a foreign language with the behavior of studying it, nothing automatically synchronizes thoughts about punishing the theft of $1000 with thoughts about punishing the theft of $1000000.
Of course, there is something that non-automatically does it. After all, in order to elicit this strange behavior from Bob, we had to wait until he forgot about the first answer. Otherwise, he would have noticed and quickly adjusted his answers to make sense.
We probably could represent Bob's tendencies as an equation and call it a preference. Maybe it would be a long equation with terms for the net worth of the criminal, the amount stolen, how much food Bob's eaten in the past six hours, and whether his local sports team won the pennant recently, with appropriate coefficients and powers for each. But if Bob saw this equation, he certainly wouldn't endorse it. He'd probably be horrified. It's also unstable: given the choice, he would undergo brain surgery to remove this equation, thus preventing it from being satisfied. This is why I am reluctant to call such a formalization a "preference".
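As a toy sketch of what such an equation might look like - every coefficient here is invented for illustration, starting from the "jail time = (10 * amount stolen)/net worth" example above:

```python
# Hypothetical model of Bob's sentencing tendencies. All coefficients are
# made up for illustration; nothing here is fitted to real data.
def bob_sentence_years(amount_stolen, net_worth, hours_since_meal,
                       home_team_won_pennant):
    base = 10 * amount_stolen / max(net_worth, 1)  # the toy core term above
    hunger_bonus = 0.2 * hours_since_meal          # harsher when Bob is hungry
    mood_discount = 1.0 if home_team_won_pennant else 0.0
    return max(base + hunger_bonus - mood_discount, 0.0)

# A homeless man (net worth ~$1,000) stealing $1,000 draws ten years;
# a rich manager (net worth ~$100,000,000) stealing $1,000,000 draws almost none.
print(bob_sentence_years(1_000, 1_000, 0, False))            # 10.0
print(bob_sentence_years(1_000_000, 100_000_000, 0, False))  # 0.1
```

The point is not that this particular function is right, but that any function reproducing Bob's answers would need exactly these sorts of embarrassing extra terms.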
Instead of saying that Bob has one preference determining his jail time assignments, it would be better to model him as having several tendencies - a tendency to give a certain answer in the $1000 case, a tendency to give a different answer in the $1000000 case, and several tendencies towards things like consistency, fairness, compassion, et cetera.
People strongly consciously endorse these latter tendencies, probably because they're socially useful[1]. If the Chief of Police says "I know I just put this guy in jail for theft, but I'm going to let this other thief off because he's my friend, and I don't really value consistency that much," then they're not going to stay Chief of Police for very long.
Bayesians and rationalists, in particular, make a big deal out of consistency. One common parable on the importance of consistency is the Dutch Book - a way to get free money from anyone behaving inconsistently. Suppose you have a weighted coin which can land on either heads or tails. There are several good reasons why I should not assign a probability of 66% to heads and 66% to tails, but one of the clearest is this: you can make me a bet that I will give you $2 if it lands on tails and you give me $1 if it lands on heads, and then a second bet where I give you $2 if it lands on heads and you give me $1 if it lands on tails. Whichever way the coin lands, I owe you $2 and you owe me $1 - you have gained a free dollar at my expense. So consistency is good if you don't want to be handing dollars out to random people...
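The arithmetic of the Dutch Book can be checked directly. Here is a minimal sketch (my own illustration, not part of the original argument), tallying both bets from the inconsistent believer's side:

```python
# Dutch book sketch: the believer assigns 66% to heads AND 66% to tails.
# A 66% believer in heads finds a bet paying $1 on heads against $2 on tails
# roughly fair (break-even is p = 2/3), and symmetrically for tails - so the
# believer accepts both bets, yet the pair guarantees a loss.
def believer_net(outcome):
    bet1 = 1 if outcome == "heads" else -2  # receive $1 on heads, pay $2 on tails
    bet2 = 1 if outcome == "tails" else -2  # receive $1 on tails, pay $2 on heads
    return bet1 + bet2

for outcome in ("heads", "tails"):
    print(outcome, believer_net(outcome))  # -1 either way: a guaranteed loss
```

Whatever the coin does, the believer wins one bet for $1 and loses the other for $2, handing over a dollar for free.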
...except that the Dutch book itself assumes consistency. If I believe that there is a 66% chance of it landing on heads, but refuse to take a bet at 2:1 odds - or even at 1.5:1 odds even though I should think it's easy money! - then I can't be Dutch booked. I am literally too stupid to be tricked effectively. You would think this wouldn't happen too often, since people would need to construct an accurate mental model to know when they should refuse such a bet, and such an accurate model would tell them they should revise their probabilities - but time after time people have demonstrated the ability to do exactly that.
I have not yet accepted that consistency is always the best course in every situation. For example, in Pascal's Mugging, a random person threatens to take away a zillion units of utility if you don't pay them $5. The probability they can make good on their threat is minuscule, but by multiplying out by the size of the threat, it still ought to motivate you to give the money. Some belief has to give - the belief that multiplication works, the belief that I shouldn't pay the money, or the belief that I should be consistent all the time - and right now, consistency seems like the weakest link in the chain.
The best we can do is seek reflective equilibrium among our tendencies. If you endorse the belief that rich people should not get lighter sentences than poor people more strongly than you endorse the tendency to give the homeless man ten years in jail and the fund manager five, then you can edit the latter tendency and come up with a "fair" sentence. This is Eliezer's defense of reason and philosophy, a powerful justification for morality (see part one here) and it's probably the best we can do in justifying our motivations as well.
Any tendency that has reached reflective equilibrium in your current state is about as close to a preference as you're going to get. It still won't automatically motivate you, of course. But you can motivate yourself toward it obliquely, and come up with the course of action that you most thoroughly endorse.
FOOTNOTES:
1: A tendency toward consistency can cause trouble if someone gains advantage from both of two mutually inconsistent ideas. Trivers' hypothesis predicts that people will consciously deny the inconsistency so they can continue holding both ideas, yet still remain consistent and so socially acceptable. Rationalists are so annoying because we go around telling people they can't do that.