Less Wrong is a community blog devoted to refining the art of human rationality.

Is risk aversion really irrational ?

41 Post author: kilobug 31 January 2012 08:34PM
Disclaimer: this started as a comment to Risk aversion vs. concave utility function but it grew way too big so I turned it into a full-blown article. I posted it to main since I believe it to be useful enough, and since it replies to an article of main.

Abstract

When you have to choose between two options, one with a certain (or almost certain) outcome, and another which involves more risk, there is always a cost in the gamble, even if in terms of utilons (paperclips, money, ...) the gamble has a higher expectancy : between the time when you make your decision and the time you learn whether your gamble failed or succeeded (between the time you bought your lottery ticket and the time the winning number is called), you have less precise information about the world than if you had taken the "safe" option. That uncertainty may force you to make suboptimal choices during that period of doubt, meaning that "risk aversion" is not totally irrational.

Even shorter : knowledge has value since it allows you to optimize, taking a risk temporarily lowers your knowledge, and this is a cost.

Where does risk aversion come from ?

In his (or her?) article, dvasya gave one possible reason for it : risk aversion comes from a concave utility function. Take food for example. When you're really hungry and haven't eaten for days, a bit of food has a very high value. But when you have just eaten, and have some stocks of food at home, food has low value. Many other things follow, more or less strongly, a non-linear utility function.

But if you adjust the bets for the utility, then, if you're a perfect utility maximizer, you should choose the highest expectancy, regardless of the risk involved. Between being sure of getting 10 utilons and having a 0.1 chance of getting 101 utilons (and 0.9 chance to get nothing), you should choose to take the bet. Or you're not rational, says dvasya.
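As a minimal sketch of the expected-utility rule just described (the helper name `expected_utility` and the list-of-pairs representation of a lottery are illustrative assumptions, not something taken from dvasya's article), the comparison can be written as:

```python
def expected_utility(lottery):
    """Expected utility of a lottery given as (probability, utilons) pairs."""
    return sum(p * u for p, u in lottery)

# The sure option: 10 utilons with certainty.
safe = [(1.0, 10)]
# The gamble: 0.1 chance of 101 utilons, 0.9 chance of nothing.
gamble = [(0.1, 101), (0.9, 0)]

print(round(expected_utility(safe), 2))    # 10.0
print(round(expected_utility(gamble), 2))  # 10.1
```

The payoffs here are already expressed in utilons, so no concave transformation is applied : the gamble wins by 0.1 utilon, which is all a pure expected-utility maximizer looks at.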

My first objection to it is that we aren't perfect utility maximizers. We run on limited (and flawed) hardware. We have limited computing power. The first problem of taking a risk is that it'll make all further computations much harder. You buy a lottery ticket, and until you know if you won or not, every time you decide what to do, you'll have to ponder things like "if I win the lottery, then I'll buy a new house, so is it really worth it to fix that broken door now ?" Asking yourself all those questions means you're less Free to Optimize, and will use your limited hardware to ponder those issues, leading to stress, fatigue and less-efficient decision making.

For us humans with limited and buggy hardware, those problems are significant, and are the main reason for which I am personally (slightly) risk-averse. I don't like uncertainty, it makes planning harder, it makes me waste precious computing power in pondering what to do. But that doesn't seem to apply to a perfect utility maximizer with infinite computing power. So, it seems to be a consequence of biases, if not a bias in itself. Is it really ?

The double-bet of Clippy

So, let's take Clippy. Clippy is a pet paper-clip optimizer, using the utility function proposed by dvasya : u = sqrt(p), where p is the number of paperclips in the room he lives in. In addition to being cute and loving paperclips, our Clippy has lots of computing power, so much that he has no issue with tracking probabilities. Now, we'll offer our Clippy some bets, and see what he should do.

Timeless double-bet

At the beginning, we put 9 paperclips in the room. Clippy has a utility of 3. He purrs a bit to show us he's happy with those 9 paperclips, looks at us with his lovely eyes, and hopes we'll give him more.

But we offer him a bet : either we give him 7 paperclips, or we flip a coin. If the coin comes up heads, we give him 18 paperclips. If it comes up tails, we give him nothing.

If Clippy doesn't take the bet, he gets 16 paperclips in total, so u=4. If Clippy takes the bet, he has 9 paperclips (u=3) with p=0.5 or 9+18=27 paperclips (u=5.20) with p=0.5. His utility expectancy is u=4.10, so he should take the bet.
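The arithmetic above can be checked with a short sketch. The names `u` and `START` are only illustrative ; the numbers are the ones from the text.

```python
from math import sqrt

def u(paperclips):
    """Clippy's utility function from the text: u = sqrt(p)."""
    return sqrt(paperclips)

START = 9  # paperclips already in the room

# Refusing the bet: 7 paperclips for sure.
refuse_b1 = u(START + 7)
# Taking the bet: fair coin, 18 paperclips on heads, nothing on tails.
take_b1 = 0.5 * u(START + 18) + 0.5 * u(START)

print(f"refuse B1: u = {refuse_b1:.2f}")  # refuse B1: u = 4.00
print(f"take B1:   u = {take_b1:.2f}")    # take B1:   u = 4.10
```

Note that the comparison is made on utilities, not on paperclips : in raw paperclips the bet is worth 18 on average against 16 for the safe option, and the square root shrinks that advantage without cancelling it.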

Now, regardless of whether he took the first bet (called B1 from now on), we offer him a second bet (B2) : this time, he has to pay us 9 paperclips to enter. Then, we roll a 10-sided die. If it gives 1 or 2, we give him a jackpot of 100 paperclips, else nothing. Clippy can be in three states when offered the second deal :

  1. He didn't take B1. Then, he has 16 clips. If he doesn't take B2, he'll stay with 16 clips, and u=4. If he takes B2, he'll have 7 clips with p=0.8 and 107 clips with p=0.2, for an expected utility of u=4.19.
  2. He did take B1, and lost it. He has 9 clips. If he doesn't take B2, he'll stay with 9 clips, and u=3. If he takes B2, he'll have 0 clips with p=0.8 and 100 clips with p=0.2, for an expected utility of u=2.
  3. He did take B1, and won it. He has 27 clips. If he doesn't take B2, he'll stay with 27 clips, and u=5.20. If he takes B2, he'll have 18 clips with p=0.8 and 118 clips with p=0.2, for an expected utility of u=5.57.

So, if Clippy didn't take the first bet or if he won it, he should take the second bet. If he did take the first bet and lost it, he can't afford to take the second bet, since he's risking a very bad outcome : no more paperclips, not even a single tiny one !
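The three cases can be reproduced with a small sketch. The function name `b2_options` and the constants are assumptions made for the illustration ; the values are those given in the text.

```python
from math import sqrt

ENTRY_FEE = 9    # paperclips paid to enter B2
JACKPOT = 100    # paperclips won on a 1 or 2
P_WIN = 0.2      # two faces out of ten

def b2_options(clips):
    """Return (utility if B2 is skipped, expected utility if B2 is taken)."""
    skip = sqrt(clips)
    after_fee = clips - ENTRY_FEE
    take = P_WIN * sqrt(after_fee + JACKPOT) + (1 - P_WIN) * sqrt(after_fee)
    return skip, take

# Clippy's holdings in each of the three states described above.
states = {"refused B1": 16, "lost B1": 9, "won B1": 27}
for name, clips in states.items():
    skip, take = b2_options(clips)
    verdict = "take B2" if take > skip else "skip B2"
    print(f"{name:10s} skip={skip:.2f} take={take:.2f} -> {verdict}")
```

The only state in which B2 is a bad deal is the one where Clippy holds exactly the entry fee, because losing then sends him to zero paperclips, where the square root is steepest.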

And the devil "time" comes in...

Now, let's make things a bit more complicated, and realistic. Before we were running things fully sequentially : first we resolved B1, and then we offered and resolved B2. But let's change a tiny bit B1. We don't flip the coin and give the clips to Clippy now. Clippy tells us if he takes B1 or not, but we'll wait one day before giving him the clips if he didn't take the bet, or before flipping the coin and then giving him the clips if he did take the bet.

The utility function of Clippy doesn't involve time, and we'll consider it doesn't change if he gets the clips tomorrow instead of today. So for him, the new B1 is exactly like the old B1.

But now, we offer him B2 after Clippy made his choice in B1 (taking the bet or not) but before flipping the coin for B1, if he did take the bet.

Now, for Clippy, we only have two situations : he took B1 or he didn't. If he didn't take B1, we are in the same situation as before, with an expected utility of u=4.19.

If he did take B1, we have to consider 4 possibilities :

  1. He loses the two bets. Then he ends up with no paperclip (9+0-9), and is very unhappy. He has u=0 utilons. That'll arise with p=0.4.
  2. He wins B1 and loses B2. Then he ends up with 9+18-9 = 18 paperclips, so u=4.24 with p=0.4.
  3. He loses B1 and wins B2. Then he ends up with 9-9+100 = 100 paperclips, so u=10 with p = 0.1.
  4. He wins both bets. Then he gets 9+18-9+100 = 118 paperclips, so u=10.86 with p=0.1.

At the end, if he takes B2, he ends up with an expectancy of u=3.78, against u=4.10 if he holds B1 alone and refuses B2.

So, if Clippy takes B1, he then shouldn't take B2. Since he doesn't know if he won or lost B1, he can't afford the risk to take B2.
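Here is a sketch of the four-outcome computation, enumerating the joint results of the coin and the die. It assumes the two are independent, which the text implies but does not state ; the names `B1`, `B2` and `expected_utility_of_both` are illustrative.

```python
from itertools import product
from math import sqrt

START = 9
B1 = [(0.5, 18), (0.5, 0)]          # (probability, paperclips gained)
B2 = [(0.2, 100 - 9), (0.8, -9)]    # gains net of the 9-paperclip entry fee

def expected_utility_of_both():
    """Expected sqrt-utility when B1 and B2 are both unresolved together."""
    total = 0.0
    for (p1, gain1), (p2, gain2) in product(B1, B2):
        # Independent bets: the joint probability is the product p1 * p2.
        total += p1 * p2 * sqrt(START + gain1 + gain2)
    return total

b1_only = sum(p * sqrt(START + gain) for p, gain in B1)
print(f"B1 only:   u = {b1_only:.2f}")                     # 4.10
print(f"B1 and B2: u = {expected_utility_of_both():.2f}")  # 3.78
```

Because Clippy has to commit to B2 before the coin is flipped, he cannot decline B2 only in the branch where he lost B1, and that branch drags the whole expectancy below what B1 alone gives.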

But should he take B1 in the first place ? If, when offered B1, he knows he'll be offered B2 later on, then he should refuse B1 and take B2, for a utility of 4.19. If, when offered B1, he doesn't know about B2, then taking B1 seems the more rational choice. But once he has taken B1, until he knows if he won or not, he cannot afford to take B2.

The Python code

For people interested in those issues, here is a simple Python script I used to fine-tune the numerical parameters of the double-bet issue so my numbers lead to the problem I was pointing to. Feel free to play with it ;)
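The original script is not reproduced in this text. What follows is only a hypothetical reconstruction, a minimal sketch of what such a parameter-tuning script could look like : the function names `analyse` and `shows_problem`, the parameter names and the strategy labels are all assumptions, and the coin of B1 is taken to be fair as in the tale above.

```python
from math import sqrt

def analyse(start=9, safe=7, prize=18, fee=9, jackpot=100, p_jackpot=0.2,
            utility=sqrt):
    """Expected utility of each strategy in the delayed double-bet."""
    def b2(clips):
        # Expected utility of entering B2 while holding `clips` paperclips.
        return (p_jackpot * utility(clips - fee + jackpot)
                + (1 - p_jackpot) * utility(clips - fee))

    return {
        "safe, no B2": utility(start + safe),
        "safe, then B2": b2(start + safe),
        "B1, no B2": 0.5 * utility(start + prize) + 0.5 * utility(start),
        # B2 is entered before the coin of B1 is flipped.
        "B1, then B2 blind": 0.5 * b2(start + prize) + 0.5 * b2(start),
    }

def shows_problem(values):
    """True when B1 wins alone, yet refusing B1 wins once B2 is known."""
    b1_alone = values["B1, no B2"] > values["safe, no B2"]
    b2_unsafe_after_b1 = values["B1, then B2 blind"] < values["B1, no B2"]
    safe_then_b2_best = values["safe, then B2"] == max(values.values())
    return b1_alone and b2_unsafe_after_b1 and safe_then_b2_best

# Strategy table for the parameters used in the article.
for name, value in analyse().items():
    print(f"{name:18s} u = {value:.2f}")

# Try a few jackpot sizes to see where the effect appears.
for jackpot in (60, 100, 140):
    print(jackpot, shows_problem(analyse(jackpot=jackpot)))
```

With the article's parameters the effect shows up ; under this sketch it disappears when the jackpot is too small (B2 is never worth entering) or too large (B2 is worth entering even blind), which is why the numbers need some tuning. The entry fee must not exceed the smallest possible holding, otherwise `sqrt` is called on a negative number.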

A hunter-gatherer tale

If you didn't like my Clippy, despite him being cute, and purring with happiness when he sees paperclips, let's shift to another tale.

Daneel is a young hunter-gatherer. He's smart, but his father committed a crime when he was still a baby, and was exiled from the tribe. Daneel doesn't know much about the crime - no one speaks about it, and he doesn't dare to bring up the topic himself. He has a low social status in the tribe because of that story. Nonetheless, he's attracted to Dors, the daughter of the chief. And he knows Dors likes him back, for she always smiles at him when she sees him, never makes fun of him, and gave him a nice knife after his coming-of-age ceremony.

According to the laws of the tribe, Dors can choose her husband freely, and the husband will become the new chief. But Dors also has to choose a husband who is accepted by the rest of the tribe : if the tribe doesn't accept his leadership, they could revolt, or fail to obey. And that could lead to disaster for the whole tribe. Daneel knows he has to raise his status in the tribe if he wants Dors to be able to choose him.

So Daneel wanders further and further in the forest. He wants to find something new to show the tribe his usefulness. That day, going a bit further than usual, he finds a place which is more humid than the forest the tribe usually wanders in. It has a new kind of tree, one he has never seen before. Lots of them. And they carry a yellow-red fruit which looks yummy. "I could tell the others about that place, and bring them a few fruits. But then, what if the fruit makes them sick ? They'll blame me, I'll lose all chances... they may even banish me. But I can do better. I'll eat one of the fruits myself. If tomorrow I'm not sick, then I'll bring fruits to the tribe, and show them where I found them. They'll praise me for it. And maybe Dors will then be able to take me more seriously... and if I get sick, well, everyone gets sick every now and then, just one fruit shouldn't kill me, it won't be a big deal". So Daneel makes his utility calculation (I told you he was smart !) and finds a positive outcome. So he takes the risk, he picks one fruit, and eats it. Sweet, a bit acidic but not too much. Nice !

Now, Daneel goes back to the tribe. On the way back, he got a rabbit, a few roots and plants for the shaman, an average day. But then, he sees the tribe gathered around the central totem. In the middle of the tribe, Dors with... no... not him... Eto ! Eto is the strongest lad of Daneel's age. He wants Dors too. And he's strong, and very skilled with the bow. The other hunters like him, he's a real man. And Eto's father died proudly, defending the tribe's stock of dried meat against hungry wolves two winters ago. But no ! Not that ! Eto is asking Dors to marry him. In public. Dors can refuse, but if she does so with no reason, she'll turn half of the tribe against her, and she can't afford it. Eto is way too popular.

"Hey, Daneel ! You want Dors ? Challenge Eto ! He's strong and good with the bow, but in unarmed combat, you can defeat him, I know it.", whispers Hari, one of the few friends of Daneel.

Daneel starts thinking faster than he ever did. "Ok, I can challenge Eto in unarmed combat. If I lose, I'll be wounded, Eto won't be nice with me. But he won't kill or cripple me, that would make half of the tribe hate him. If I lose, it'll confirm I'm physically weak, but I'll also win prestige for daring to defy the strong Eto, so it shouldn't change much. And if I win, Dors will be able to refuse Eto, since he lost a fight against someone weaker than him, that's a huge win. So I should take that gamble... but then, there is the fruit. If the fruit gets me sick, in addition to my wounds from Eto, I may die. Even if I win ! And if I lose, get beaten, and then get sick... they'll probably let me die. They won't take care of a fatherless lad who loses a fight and then gets sick. Too weak to be worth it. So... should I take the gamble ? If Eto waited just one day more... Or if only I knew if I'll get sick or not..."

The key : information loss

Until Clippy knows ? If Daneel knew ? That's the key of risk aversion, and why a perfect utility maximizer, if he has a concave utility function in at least some aspects, should still have some risk aversion. Because risk comes with information loss. That's the difference between the timeless double-bet and the one with one day of delay for Clippy. It is also the problem Daneel got stuck in.

If you take a bet, until you know the outcome of your bet, you'll have less information about the state of the world, and especially about the state that directly concerns you, than if you chose the safe situation (a situation with a lower deviation). Having less information means you're less free to optimize.

Even a perfect utility maximizer can't know what bets he'll be offered, and what decisions he'll have to take, unless he's omniscient (and then he wouldn't take bets or risks, but he would know the future - probability only reflects lack of information). So he has to consider the loss of information of taking a bet.

In real life, the most common case of it is the non-linearity of bad effects : you can lose 0.5L of blood without too many side-effects (drink lots of water, sleep well, and the next day you're ok, that's what happens when you give blood), but if you lose 2L, you'll likely die. Or if you lose some money, you'll be in trouble, but if you lose the same amount again, you may end up being kicked out of your house since you can't pay the rent - and that'll be more than twice as bad as the initial loss.

So once you have taken a bet, risking a bad effect, you can't afford to take another bet (even one with, in absolute terms, a higher gain expectancy) until you know if you won or lost the first bet - because losing them both means death, or being kicked out of your house, or the ultimate pain of not having any paperclip.

Taking a bet always has a cost : it costs you part of your ability to predict, and therefore to optimize.

A possible solution

A possible solution to that problem would be to consider all the decisions you may have to take during the period when you don't know whether you lost or won your first bet, weight them by the probability of being offered those decisions, and evaluate their possible outcomes both if you take the first bet and if you don't. But how do you compute "their possible outcomes" ? That requires considering all the possible bets you could be offered during the time required for the resolution of your second bet, and their possible outcomes. So you need to... stack overflow: maximal recursion depth exceeded.

Since taking a bet will affect your ability to evaluate possible outcomes in the future, you have a "strange loop to the meta-level", an infinite recursion. Your decision algorithm has to consider the impact the decision will have on the future instances of your decision algorithm.

I don't know if there is a mathematical solution to that infinite recursion that manages to make it converge (like you can in some cases). But the problem looks really hard, and may not be computable.

Just factoring in an average "risk aversion" that penalizes outcomes which involve a risk (and the longer you have to wait to know if you won or lost, the higher the penalty) sounds more like a way to fix that problem than like a bias.

Comments (65)

Comment author: Vaniver 01 February 2012 12:34:35AM 2 points [-]

But if you adjust the bets for the utility, then, if you're a perfect utilitarian, you should chose the highest expectancy, regardless of the risk involved. Between being sure of getting 10 utilons and having a 0.1 chance of getting 101 utilons (and 0.9 chance to get nothing), you should chose to take the bet. Or you're not rational, says dvasya.

It's not "or you're not rational." It's "or you haven't measured your utility function correctly." If you don't pick the option with higher expected utility, it's not actually utility.

We have a limited power for making computation. The first problem of taking a risk is that it'll make all further computations much harder.

So put that in your utility function. The certainty effect is not always a bias.

Comment author: Faber 01 February 2012 02:31:50PM 2 points [-]

If you don't pick the option with higher expected utility, it's not actually utility.

The point is that we may have utility functions where u(p1A+p2B) != p1u(A)+p2u(B). That is, the utility of a bet may not be equal to the expected value of the utility of the outcomes.

Comment author: Vaniver 01 February 2012 04:46:01PM 1 point [-]

The point is that we may have utility functions where u(p1A+p2B) != p1u(A)+p2u(B).

I am well aware. Equality holds only for linear, i.e. 'risk neutral', utility functions.

The thing is, utility is defined as the thing you are risk neutral with respect to. If you're not risk neutral with respect to it, then it's not utility.

Comment author: kilobug 01 February 2012 02:51:50PM 5 points [-]

So put that in your utility function.

There are two problems with that.

  1. A utility function is supposed to contain only terminal values. You're not supposed to factor instrumental values into your utility function. It's your optimization algorithm which is supposed to consider instrumental values insofar as they help to maximize utility, but they shouldn't be part of the utility function for themselves.

  2. What you want to "put in your utility function" is... the effect of choices on your ability to estimate and optimize your utility function. That's making the utility function recursive, building a "strange loop to the meta level" between your utility and the optimization algorithm which is supposed to maximize the utility function. And I don't see any reason (but maybe there are) why that recursion should converge and be computable in finite time.

Comment author: Vaniver 08 February 2012 05:57:27PM 1 point [-]

Utility function is supposed to contain only terminal values. You're not supposed to factor instrumental values into your utility function.

Utility functions are typically defined over expected futures. A feature of that future is how many seconds and calories you spent making decisions (and thus not doing other things). And so if a gamble will give you either zero or a hundred calories, but take fifty calories to recompute all of your plans that depend on whether or not you win the gamble, then it's actually a gamble between -50 and 50 calories, not 0 and 100.

In short, utility functions should take terminal values as inputs, but those terminal values depend on instrumental values, and your utility function should respond to that dependence.

Comment author: [deleted] 01 February 2012 10:41:02PM 2 points [-]

What you want to "put in your utility function" is... the effect of choices on your ability to estimate and optimize your utility function. That's making the utility function recursive, building a "strange loop to the meta level" between your utility and the optimization algorithm which is supposed to maximize the utility function. And I don't see any reason (but maybe there are) why that recursion should converge and be computable in finite time.

But (essentially to repeat a point) it would be a bias, since the adjustment is based on risk, whereas it should (assuming everything else equal) be based on uncertainty (risk multiplied by the length of time the result is unknown). But even if the adjustment were based on the relevant factor, it would still be a bias, because the adjustment should depend not only on the time but also on the chances that relevant decisions will be required in the interval.

A separate point—One topic that should be considered in evaluating the argument further is whether other decision problems introduce the same "strange loops."

Comment author: shminux 01 February 2012 01:49:05AM 1 point [-]

Nice stories, and this is my YAPS (yet another post summary): Risk aversion is rational when the (fully recursed) downside is unbounded. When your downside is bounded (you can evaluate the expected utility of a risky decision with high accuracy (how high is high enough?)), the rational choice is the one with the highest expected utility.

If considered this way, I definitely agree. As others said, in real life the fully-factored downside is hard to evaluate, and it tends to increase out of proportion with the first-order risk. Of course, the same happens to the upside, but the inability to calculate the fully-factored expected utility is a very rational reason to avoid risks.

Comment author: Luke_A_Somers 01 February 2012 03:59:27PM 1 point [-]

Missing the point - Clippy's downside wasn't unbounded. It was just larger because of lack of information.

Comment author: shminux 02 February 2012 01:10:54AM 0 points [-]

Please tell me what the bound was, then.

Comment author: Luke_A_Somers 02 February 2012 02:57:22AM 0 points [-]

0, at 0 paperclips. It's only 1 utilon worse than having 1 paperclip, which is in turn 1 utilon worse than having 4.

Comment author: prase 01 February 2012 07:41:44PM 2 points [-]

The best article on LW recently.

To pick a nit, I was able to guess the author is French from the punctuation being separated from preceding text by spaces.

Comment author: steven0461 31 January 2012 08:46:16PM 0 points [-]

Anyone willing to write a summary?

Comment author: kilobug 31 January 2012 10:09:10PM 9 points [-]

In summary : when you have to choose between two options, one with a certain (or almost certain) outcome, and another which involves more risk, there is always a cost in the gamble, even if in terms of utilons (paperclips, money, ...) the gamble has a higher expectancy : between the time when you make your decision and the time you learn whether your gamble failed or succeeded (between the time you bought your lottery ticket and the time the winning number is called), you have less precise information about the world than if you had taken the "safe" option. That uncertainty may force you to make suboptimal choices during that period of doubt, meaning that "risk aversion" is not totally irrational.

Even shorter : knowledge has value since it allows you to optimize, taking a risk lowers your knowledge.

Comment author: [deleted] 01 February 2012 12:42:01AM 2 points [-]

You're explaining aversion to uncertainty, not risk. What if you think of buying a lottery ticket, do so immediately, and are immediately informed of the outcome (immediate meaning within a few seconds)? You would then have endured potentially high risk (if the ticket was very expensive) with negligible uncertainty (only endured for a few seconds, where there's negligible likelihood that you would make contingent choices in the interval). The thought experiment shows that risk aversion obtains without uncertainty.

General comment: You and another poster supplied excellent summaries. What I wonder is why is it (often) seen as necessary to belabor the obvious, when the point can be stated succinctly with greater clarity?

Comment author: steven0461 01 February 2012 02:00:02AM 1 point [-]

General comment: You and another poster supplied excellent summaries. What I wonder is why is it (often) seen as necessary to belabor the obvious, when the point can be stated succinctly with greater clarity?

Seconded. I think many posts in the Main section could and should be cut down by a factor of say 2 or 3.

Comment author: drethelin 03 February 2012 06:44:15AM 0 points [-]

Hindsight bias. Just because something can be summarized relatively simply in hindsight doesn't mean the same summary would be as convincing or even be remembered if it's all you read. It's the same reason math class isn't just you reading a book that starts with axioms and then lists theorems until you know all of math.

Comment author: steven0461 03 February 2012 07:58:01AM 2 points [-]

It can't be hindsight bias in this particular case, because I didn't read the post.

Comment author: drethelin 03 February 2012 08:26:33AM 0 points [-]

Did you not read any of the posts? My point wasn't specific to this post.

Comment author: gjm 02 February 2012 02:30:40AM 2 points [-]

"I have only made this letter so long because I didn't have time to make it shorter." -- Blaise Pascal

Comment author: Viliam_Bur 01 February 2012 11:51:55AM *  3 points [-]

What I wonder is why is it (often) seen as necessary to belabor the obvious, when the point can be stated succinctly with greater clarity?

Summary: Because updates don't propagate automatically. A story with examples has higher chance of triggering a real update. Stories about people having significant losses and gains have the highest chance.

Long version:

I should insert here a story about how I did some faulty reasoning for years, even though I knew the correct theory, because I knew the theory only on a "teacher's password" level, and I didn't realise that my problem is an instance of the theory. I lost my money, my home, my girlfriend, and spent five years in a mental institution. Then I tried to kill myself and almost succeeded.

Lorem ipsum dolor sit amet, consectetuer adipiscing elit. Aenean commodo ligula eget dolor. Aenean massa. Cum sociis natoque penatibus et magnis dis parturient montes, nascetur ridiculus mus. Donec quam felis, ultricies nec, pellentesque eu, pretium quis, sem. Nulla consequat massa quis enim. Donec pede justo, fringilla vel, aliquet nec, vulputate eget, arcu. In enim justo, rhoncus ut, imperdiet a, venenatis vitae, justo. Nullam dictum felis eu pede mollis pretium. Integer tincidunt. Cras dapibus. Vivamus elementum semper nisi. Aenean vulputate eleifend tellus. Aenean leo ligula, porttitor eu, consequat vitae, eleifend ac, enim. Aliquam lorem ante, dapibus in, viverra quis, feugiat a, tellus. Phasellus viverra nulla ut metus varius laoreet. Quisque rutrum. Aenean imperdiet. Etiam ultricies nisi vel augue. Curabitur ullamcorper ultricies nisi. Nam eget dui.

And then I understood my mistake, and now I am a wealthy multi-billionaire, the president of my country, with a harem full of the most beautiful girls. All because I took this principle seriously. So, my dear friend, don't make the same foolish mistake that I did. You will thank me later.

And by the way all information in this "long version" is fictional, but it serves to prove my point.

Comment author: [deleted] 01 February 2012 07:28:41PM 4 points [-]

A story with examples has higher chance of triggering a real update.

An important point, deserving serious discussion. It's probably the correct answer as to why posters think examples are generally useful. My own view, which I'm prepared to update as necessary, is that the function of examples is to elucidate a claim that's otherwise hard to understand. When the examples are either tedious or harder to comprehend than the claim, I think examples are self-defeating.

One reason is that the comprehension or impact that triggers updates is often greater when readers construct their own examples. Then, there's also the obvious point that if a posting is overly time-consuming or boring, some people who would have read a more succinct posting won't read one that's bloated.

What actually triggers updates deserves a lot more explicit attention. I'm leery of the oft-heard response, "I've updated." How do you know you have? Or if you have, how do you know you've done it correctly, to the appropriate extent? These questions also raise the issue of whether belaboring an obvious point triggers excessive updating (overcompensation), by making the claim more compelling than it is. I notice that the examples in the lead post concealed the flaw in the reasoning.

Comment author: Viliam_Bur 02 February 2012 08:14:30AM *  0 points [-]

I think it's not an "either/or" situation. Examples help to understand the issue (understanding the summary requires that the writer's and reader's maps are sufficiently similar), to remember it better, and to think about one's own life (if the example is good). And yes, a bad example is bad, and a long example is inherently bad, unless the author has great storytelling talent.

Comment author: steven0461 31 January 2012 10:23:05PM *  5 points [-]

Thanks! (Also to cousin_it and Tordmor.)

This point applies even in some situations where there's no uncertainty about utility, right? If you're indifferent between soup and steak, you'd prefer either to a random choice of them, because then you'd know whether to get a spoon or a knife.

Comment author: cousin_it 31 January 2012 09:37:07PM *  3 points [-]

My tentative summary is "risk aversion can help avoid gambler's ruin", but maybe I've missed something.

Comment author: orthonormal 01 February 2012 03:21:49AM 5 points [-]

he's risking his absolute nightmare, his equivalent of the worst pain possible : no more paperclips, not even a single tiny one

You ruin the thought experiment by writing as if the difference between 0 and 1 paperclips is much worse than the difference between 1 and 4, in violation of the utility function you declared.

Comment author: kilobug 01 February 2012 11:36:08AM 2 points [-]

That was a pun to add some life to the experiment and make it less "boring" or "cold", but you're right that is just false. I changed it for a much lighter wording.

Comment author: Solvent 01 February 2012 11:22:16AM 1 point [-]

Yeah, the article would be far better if that were fixed. Utility functions are equivalent if you add any constant to them, or multiply them by any positive constant.

Comment author: Dan_Moore 01 February 2012 04:49:09PM 0 points [-]

Assuming Clippy can't owe paperclips (i.e., own a negative number), 0 paperclips minimizes his utility. So, I think the original sentence was OK, albeit overwrought.

Comment author: wedrifid 02 February 2012 02:36:22AM 5 points [-]

Assuming Clippy can't owe paperclips (i.e., own a negative number), 0 paperclips minimizes his utility. So, I think the original sentence was OK, albeit overwrought.

Orthonormal is correct based on the very nature of 'utility'. The difference between 1 and 4 utility is three times worse than the difference between 0 and 1. Every single time and no matter what. That '0' or a minimum is involved is irrelevant. You cannot get diminishing marginal utility on utility itself.

The place for assigning extra importance to low values and zero is in the paperclip to utility translation and not the evaluation of how much you care about the utility.

Comment author: Drahflow 03 February 2012 11:17:38PM 1 point [-]

The described effect seems strongly related to the concept of opportunity cost.

I.e. while a bet of yours is still open, the resources spent paying for entering the bet cannot be used again to enter a (better) bet.

Comment author: twanvl 02 February 2012 10:44:34PM 1 point [-]

He did take B1, and won it. He has 27 clips. If he doesn't take B2, he'll stay with 27 clips, and u=4.10.

This is wrong, since sqrt(27) = 5.20, not 4.10. The latter is the expected utility before winning B1. This doesn't seem to affect the rest of the article, though.

Comment author: kilobug 05 February 2012 11:58:02AM 0 points [-]

You're right, fixed.

Comment author: tut 02 February 2012 04:58:11PM *  2 points [-]

I liked this argument and I have upvoted your summary below. I will upvote the post as well, if you replace the word "utilitarian" with the phrase "utility maximizer". A utility maximizer is an agent that acts (as though) so as to maximize a utility function, which is what you appear to mean. A utilitarian is a person with an ethical outlook which tells them to maximize the sum of the utility of all persons (with different flavors of utilitarianism using different definitions of "utility", "sum" and "person").

Comment author: kilobug 03 February 2012 09:14:09AM 1 point [-]

The context made it clear, but better be precise, you're right, so I fixed it.

Comment author: tut 04 February 2012 05:58:15PM 1 point [-]

Thank you. Upvoted.

Comment author: Luke_A_Somers 01 February 2012 03:34:09PM 4 points [-]

ultimate pain of not having any paperclip.

I cried.

Comment author: Dmytry 01 February 2012 07:27:29AM *  5 points [-]

Excellent article.

Other issue is that even if bet outcome will be immediately available, to calculate the expected utilities of $ amounts you need to consider all possible future bets over two outcomes. E.g. if you have $100 at hand, the pay day (for your daily job) is in 2 days, and you expect a bet tomorrow where you can put on the table $110 and have a 10% chance of winning $1 million, and the only way to get extra $10 for the tomorrow's game is to play the betting game, you may have to enter a bet where you win $10 with probability 0.5 or lose $100 (the loss of $100 will make you go broke for 2 days which is entirely tolerable). Or the bet tomorrow will only need $100 , in which case you absolutely don't want to enter this $10 to $100 bet.

Bottom line is that it gets really complicated and recursive if the betting is to repeat, EVEN if the bet results are immediately available, and there can be all sorts of heuristics that will work better, possibly a lot better, than some straightforward calculation of expected utility (especially so if you can't calculate utility accurately anyway and haven't the faintest idea what the probabilities are, which is usually the case in the real world).

With regard to risk aversion in humans: it is not universal, and there are a lot of people who are slightly risk-seeking (gamblers).

Comment author: mfb 05 February 2012 08:30:08PM 1 point [-]

That is one point I noticed in the first scenario as well. If there is only B1, Clippy will accept it. But if Clippy knows about both bets before deciding, he will reject B1 and take B2, for an expected utility of 4.19 instead of (2+5.57)/2 = 3.78.

When offered B1, Clippy might try to predict future bets and include that in the utility calculations. I expect (though I have nothing but intuition to back it up) that a bit of risk aversion (for B1 only) is a good strategy for a large range of "expected bet probability density functions".

Gamblers either need some superlinear utility function for money (which is unlikely), have to assign a positive utility to the time gained during which they don't know whether they will win (which is likely), or just act irrationally (which is nearly certain).

Comment author: Dmytry 05 February 2012 09:24:31PM *  3 points [-]

The money themselves have very low utility. The items you can buy with money can have very high utilities in comparison. When you think in terms of items you want to buy, the utility function becomes not just non-linear but jagged, with steps and plateaus. It grows monotonically, that's the only thing about it (and even then, in a sufficiently stupid jurisdiction it does not even grow monotonically, due to taxation).

The heuristics for dealing with such a messy function, which you know is messy but can't calculate, are going to be weird.

Comment author: mfb 06 February 2012 08:11:30PM *  1 point [-]

The money themselves have very low utility.

In terms of "Hey, I have this nice-looking coin", of course. But as you can buy items with money (and you know it), money has a comparable utility.

Comment author: Dmytry 06 February 2012 10:58:35PM *  1 point [-]

Why would you buy an item you need for survival if the money itself had comparable utility to that item? Why would you exchange $3 for a toothbrush if you need a toothbrush, and not otherwise?

Survival has extremely high utility, and costs, I dunno, a few $ a day for the absolute bare minimum. Edit: for a healthy individual that's it, but up to millions for the sick.

I'll call the agent that assigns the utility of items that can be bought with money to the money itself the miserly knight. It's miserly because it is very under-motivated to purchase anything, as the purchase, for that agent, doesn't increase the utility much if at all, even if the item is essential and the agent has enough money to live a great life.

Comment author: mfb 11 February 2012 02:01:40PM *  0 points [-]

A bottle of water can save my life if I am in a desert. However, as long as I am not there (and there is no desert within 1000 km of my home), I don't carry a bottle of water with me all the time. In the desert, a bottle of water can have a very high utility. If you know you will go into one, buy a bottle of water. But standing in front of the shop (knowing you will be in a desert soon), the utility of a bottle of water is just a bit higher than the utility of the money to buy one (because you save the time needed to buy it).

Scenario: Let's remove the time to buy a bottle: we are right next to a vending machine for water. A bottle costs 2€ there and the machine works perfectly. You know that you will be in a desert soon, the vending machine is your only way to get water, and you are highly confident that you need one bottle to survive and do not need more than one.

Example 1: You bought a bottle of water. Would you sell me this bottle for 5€? I am quite sure the rational thing would be to do so and then buy a new bottle. This means that at the moment, 5€ have a utility which is a bit higher than that of a bottle of water.

Example 2: You have neither money nor water. In this situation, 2€ have a really high utility for you - you would give me a lot to get 2€, similar to what you would give for one bottle of water.

Note that your utility for both money and water is highly nonlinear here. The second bottle of water could be useful, but it does not save your life. With 2€ (or a bottle of water), you would not accept any (reasonable) gamble with a risk of losing.

If you do not assign equivalent utility to money, you should rush to some shops and sell all your money.

Comment author: Dmytry 11 February 2012 02:04:26PM *  2 points [-]

"If you do not assign equivalent utility to money, you should rush to some shops and sell all your money."

That would hold only if I knew with certainty what items I would need in the future, if the items were non-perishable, and if I could not obtain more items at a later date. Meanwhile I should run to the bank to sell all my money and purchase stocks and the like. Indeed most people do that.

Once again, what is it that makes you exchange the 2€ for the bottle if you go to the desert, but not if you don't go to the desert? When you aren't going to the desert you may be very unwilling to spend 2 euro on a bottle; when you are going to the desert you may be very willing to spend 2 euro on a bottle. The willingness would not be consistent with a small difference in utilities of 2 euros and a bottle.

Selling the bottle for 5 euro, you are only willing to do so because you can then buy a bottle again for 2 euro AND because you can use the 3 euro on something else that is actually useful. Take out either part, and you should no longer be interested in the exchange.

To a utility-driven agent that can exchange items, each item has its intrinsic utility and its exchange utility, and those are separate values; the exchange utility is determined in terms of the intrinsic utilities of the things it can ultimately be exchanged for (via various exchange chains). If the agent doesn't distinguish between those utilities, the agent is simply not going to work correctly, ending up in a circular loop of utility updates when trying to calculate the utility of anything. The agent must keep track of where the parts of utilities are coming from to avoid circular loops. The vast majority of the utility of money comes from utilities of some items to be bought with the money.
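A rough sketch of that bookkeeping, as one possible reading rather than a definitive model: exchange utility is taken here to be the best intrinsic utility reachable through a chain of trades, and the set of items already on the chain is carried along so that cycles (money for water, water for money, ...) terminate. Prices and quantities are ignored, and all names and numbers are hypothetical.

```python
def exchange_utility(item, intrinsic, trades, visited=frozenset()):
    """Best intrinsic utility reachable from `item` via chains of trades.

    `intrinsic` maps item -> intrinsic utility; `trades` maps item -> items
    it can be exchanged for. Both structures are illustrative assumptions.
    """
    visited = visited | {item}
    best = 0.0
    for other in trades.get(item, ()):
        if other in visited:
            continue  # already on this chain: skipping it breaks circular loops
        best = max(
            best,
            intrinsic.get(other, 0.0),
            exchange_utility(other, intrinsic, trades, visited),
        )
    return best


# Made-up numbers: money has no intrinsic utility of its own.
intrinsic = {"money": 0.0, "water": 10.0, "toothbrush": 3.0}
trades = {
    "money": ["water", "toothbrush"],
    "water": ["money"],
    "toothbrush": ["money"],
}

money_exchange = exchange_utility("money", intrinsic, trades)
water_exchange = exchange_utility("water", intrinsic, trades)
```

Here the intrinsic and exchange values are kept in separate places, so money can end up with a high exchange utility while its intrinsic utility stays at zero.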

Comment author: mfb 11 February 2012 03:18:58PM 0 points [-]

The willingness would not be consistent with a small difference in utilities of 2 euros and a bottle.

The expectations about the future influence the utility of money. If an oracle tells me "tomorrow you'll need 1€ more than you currently have to save your life" and I know that it is usually right, this 1€ more has a high utility for me and I will really try to get it. Without that oracle, it is just 1€ more or less.

The vast majority of the utility of money comes from utilities of some items to be bought with the money.

This is the only (interesting) utility. So what? As long as you always have the option to exchange money for something useful (or at least know that you will have the possibility when you need it), you are indifferent between getting the item or money to buy the item. A good utility function should assign the same utility for both in that case.

Comment author: Dmytry 11 February 2012 11:25:41PM *  1 point [-]

As long as you always have the option to exchange money for something useful (or at least know that you will have the possibility when you need it), you are indifferent between getting the item or money to buy the item.

I don't think this works correctly... your agent has money and is standing next to a soda machine, getting progressively more thirsty. How does that trigger the exchange if you tie utility of money to utility of water to drink?

The thing to remember is that utility is what you use to push your utility maximizing agent around, to make the agent do things. The agent has some foresight of how its actions are going to affect future utility, and picks the actions that result in larger future utility. For a human, the utility may be some integral of quality of life over the possible futures. Drinking water when thirsty, eating food when hungry, living in a better house vs living in a worse house, that sort of stuff. Use, not passive possession. If the agent has descendants, or cares for mankind, then the integral may include other people.

If you include possession of money over time into this integral, your agent will make trade-offs between possession of the money and the ultimate total comfort of his life (and those he cares about). The agent's behaviour will be that of a miser.

edit: and yes, as a heuristic, you can assign derived utilities to stuff that can be exchanged for usable stuff. But you can't include those utilities in the sum over the original ones. The derived utilities are a shortcut, a heuristic, and one needs to be careful not to sum together utilities of different types.

edit: to think about it, though, maybe a great deal of people have the kind of utility that you described here, the utility calculated from possession of items. I, personally, try to act according to the utility as I described above, calculated from the uses and comfort of life.

Comment author: TheOtherDave 12 February 2012 05:17:18PM *  1 point [-]

to think about it, though, maybe a great deal of people have the kind of utility that you described here, the utility calculated from possession of items.

I think this is a key point. Our brains seem prone to a dynamic where we assign some attribute to our representation of a thing in a way that makes sense in the short term (e.g., valuing money), but we then fail to entirely re-initialize that assignment when we're done doing whatever we were doing, so over time our instrumental goals take on a terminal value of their own. Theories involving clear-cut lines between terminal and instrumental goals consequently don't describe actual human behavior terribly well.

Comment author: mfb 15 February 2012 12:45:40PM 0 points [-]

your agent has money and is standing next to a soda machine, getting progressively more thirsty. How does that trigger the exchange if you tie utility of money to utility of water to drink?

In that case, he knows that at some time he has to buy water and a part of his money is equivalent to the soda. He is indifferent between buying it now and later (again assuming the machine works perfectly). When he begins to become thirsty, the utility of the soda increases and he will buy one.

If you don't assign utility to money, you have to violate one of the rules given here: http://en.wikipedia.org/wiki/Utility#Additive_von_Neumann.E2.80.93Morgenstern_utility - or you have to take choices which do not maximize your utility.

But maybe there is an additional thing to consider: Time. Money is an option to get stuff in the future. Having money as an intermediate step is useful to have food/house/whatever later.

How can we define a single number for a utility, if we want to have food all the time? Maybe like this: Define a current utility density. Integrate the (expected) utility density over the lifetime. In that case, having money increases the expected utility density in the future, and therefore the total utility.
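One way to write this down, as a sketch only: the symbols $u(t)$ for the utility density at time $t$, $t_0$ for the present and $T$ for the end of the lifetime are notation assumed here, not taken from the comment.

```latex
U = \int_{t_0}^{T} \mathbb{E}\left[ u(t) \right] \, dt
```

Under this reading, holding money at $t_0$ raises $\mathbb{E}[u(t)]$ for later $t$, and therefore raises $U$.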

Comment author: linas 09 March 2013 04:01:53AM 0 points [-]

There needs to be an exploration of addiction and rationality. Gamblers are addicted; we know some of the brain mechanisms of addiction -- some neurotransmitter A is released in brain region B, causing C to deplete, causing a dependency on the reward that A provides. This particular neuro-chemical circuit derives great utility from the addiction, thus driving the behaviour. By this argument, perhaps one might argue that addicts are "rational", because they derive a great utility from their addiction. But is this argument faulty?

A mechanistic explanation of addiction says the addict has no control, no free will, no ability to break the cycle. But is it fair to say that a "machine has a utility function"? Or do you need to have free will before you can discuss choice?

Comment author: Prismattic 01 February 2012 12:14:24AM *  2 points [-]

This post is forcing me to reconsider the post I was going to offer on risk-aversion, so I'll just leave it as a comment.

I suspect people will be more likely to act as if they understand expected payout if the situation appears as an iterated game. Consider the following two situations:

  1. I offer you the choice of pushing a button or not. Pushing the button will give you $100,000 with probability 0.9 and take from you $500,000 (or everything you own) with probability 0.1. This is a one-time deal. (Alternatively, the button either adds 5 years to your life with probability 0.9 or takes away 25 years with probability 0.1).

  2. I offer you the same deal, except you know that you will have the opportunity to push the button or not once per year every year for 10 years.

My hypothesis is that many more people would choose to push the button in scenario 2, and they would explicitly be doing something like calculating expected value over time. Whereas in situation 1, many will shy away from taking the risk, because being ruined permanently is a lot worse than gaining $100,000 once (which is basically risk aversion in the economic sense -- money does not have a linear relationship with utility, nor do years of life).
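The raw numbers behind the two scenarios can be sketched as follows, assuming the pushes are independent and measuring only dollars (not utility). The function names are hypothetical.

```python
def expected_gain_per_push(p_win=0.9, win=100_000, loss=500_000):
    """Expected dollar change from a single push of the button."""
    return p_win * win - (1 - p_win) * loss


def prob_never_losing(pushes, p_win=0.9):
    """Chance that every one of `pushes` independent pushes pays out."""
    return p_win ** pushes


def prob_at_least_one_loss(pushes, p_win=0.9):
    """Chance of hitting the bad outcome at least once in `pushes` pushes."""
    return 1 - prob_never_losing(pushes, p_win)


per_push = expected_gain_per_push()        # about $40,000 per push
ten_year_clean = prob_never_losing(10)     # roughly 0.35
ten_year_hit = prob_at_least_one_loss(10)  # roughly 0.65
```

In dollar terms each push is worth about $40,000 in expectation in both scenarios; what differs over 10 pushes is that the chance of hitting the bad outcome at least once rises from 0.1 to roughly 0.65.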

Comment author: Luke_A_Somers 01 February 2012 03:56:11PM 0 points [-]

You have to last that year. With the 25 years taken away, that's explicitly potentially false. With the money gone, that's a scary prospect too.

Comment author: Armok_GoB 31 January 2012 11:50:50PM 10 points [-]

This is extremely well written, to the point of making me go Wow. It has a funny example. It has a real world ancestral example. It has a piece of nifty code that lets you test things yourself. More articles should be written this way!

Comment author: Luke_A_Somers 01 February 2012 03:33:48PM 4 points [-]

It has a real world ancestral example.

More realistic fictional example.

Comment author: [deleted] 31 January 2012 10:04:01PM 11 points [-]

Summary of the article:

Whenever you take a risk in real life the outcome will take time to manifest. Until then some of the decisions you will have to face might depend on that outcome, which will increase the computational power needed to calculate expected utility and make the decision. Since our computational power is limited it might be rational to keep these complications at a minimum, therefore risk aversion.

Comment author: Viliam_Bur 01 February 2012 11:39:10AM *  6 points [-]

There is one important detail about the computational power. Risk today increases the computational power needed for evaluating tomorrow's risk. But also the possibility of risks tomorrow becoming entangled with today's decision increases the computational power needed for evaluating today's risk.

The real utility of today's risk is probably smaller than it seems when we evaluate the risk as an isolated event. Risk aversion may be a heuristic to improve our estimates.

Comment author: Solvent 01 February 2012 11:20:38AM 1 point [-]

Can we put that at the top of the article, please?

Comment author: kilobug 01 February 2012 11:32:30AM 2 points [-]

I added my own formulation on it (in my comment below), because it's not just the computing power, but also the loss of knowledge - even with lots of computing power, there is a cost (and yes it's higher for us humans with limited computing power).

But yes I should have put the summary/abstract on top, and I now did, thanks for the suggestion (from you and others).

Comment author: thelittledoctor 01 February 2012 12:04:12AM 0 points [-]

This is off the top of my head, so I apologize if it ends up being ill-conceived:

Imagine we take a lottery with 50% odds of winning (W) or losing (L), where W gives us lots of utility and L gives us very little or none (or negative!). But we don't find out for a couple weeks whether we won or not, so until we find out all of our decisions become more complex - we have to plan for both case W and case L. Since we have two possible cases with equal probability, this (at maximum) doubles the amount of planning we have to do - it adds one bit to the computational complexity of our plans. If we have ten million free bits of capacity, that's no big deal, but if we only have five bits, that's a pretty big chunk - it substantially decreases our ability to optimize. So then we should be able to plot the marginal utility of gaining or losing one bit of computational capacity and plug it in as a term in our overall utility function.

Did that make any sense, or have I just gone crazy?
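A toy illustration of the counting argument above, assuming each unresolved bet is an independent win/lose outcome and that a separate plan is needed for every combination of outcomes. The function name is hypothetical.

```python
from itertools import product


def scenarios(pending_bets):
    """Enumerate every combination of W/L outcomes for the unresolved bets."""
    return list(product("WL", repeat=pending_bets))


# Each additional unresolved bet doubles the number of cases to plan for,
# i.e. it costs one more bit of planning capacity.
counts = {n: len(scenarios(n)) for n in range(6)}
```

With five bits of capacity, five pending bets already use up all 32 distinguishable cases, which is the sense in which a single extra bet is "a pretty big chunk".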

Comment author: thomblake 31 January 2012 09:53:09PM 0 points [-]

This post could use some proofreading and cleanup.

Comment author: kilobug 31 January 2012 10:04:37PM 0 points [-]

I'm open to any suggestion. Is there any part in particular you would like cleaned up, or any correction you would like me to make?

Comment author: dbaupp 01 February 2012 05:17:01AM 0 points [-]

If you didn't like my Clippy, despite him being cute, and purring of happiness when he sees paperclips, lets shift to another late.

tale?

Comment author: kilobug 01 February 2012 09:02:05AM 0 points [-]

Fixed, thanks.

Comment author: Antisuji 02 February 2012 01:27:46AM *  2 points [-]

It's actually "let's shift".

Also:

  • "a 10-sided dice" should be "a 10-sided die"
  • "If the coin comes head, we give him 18 paperclips. If it comes tail, we give him nothing." should be "If the coin comes up heads, we give him 18 paperclips. If it comes up tails, we give him nothing."
  • "he's attracted by Dors" is more conventionally "he's attracted to Dors"

Great article overall!

Comment author: kilobug 02 February 2012 09:38:50AM 0 points [-]

Thanks, I fixed those issues. (Sorry, I'm not a native English speaker...)