Beware of Other-Optimizing

79 Eliezer_Yudkowsky 10 April 2009 01:58AM

Previously in series: Mandatory Secret Identities

I've noticed a serious problem in which aspiring rationalists vastly overestimate their ability to optimize other people's lives.  And I think I have some idea of how the problem arises.

You read nineteen different webpages advising you about personal improvement—productivity, dieting, saving money.  And the writers all sound bright and enthusiastic about Their Method, they tell tales of how it worked for them and promise amazing results...

But most of the advice rings so false as to not even seem worth considering.  So you sigh, mournfully pondering the wild, childish enthusiasm that people can seem to work up for just about anything, no matter how silly.  Pieces of advice #4 and #15 sound interesting, and you try them, but... they don't... quite... well, it fails miserably.  The advice was wrong, or you couldn't do it, and either way you're not any better off.

And then you read the twentieth piece of advice—or even more, you discover a twentieth method that wasn't in any of the pages—and STARS ABOVE IT ACTUALLY WORKS THIS TIME.

At long, long last you have discovered the real way, the right way, the way that actually works.  And when someone else gets into the sort of trouble you used to have—well, this time you know how to help them.  You can save them all the trouble of reading through nineteen useless pieces of advice and skip directly to the correct answer.  As an aspiring rationalist you've already learned that most people don't listen, and you usually don't bother—but this person is a friend, someone you know, someone you trust and respect to listen.

And so you put a comradely hand on their shoulder, look them straight in the eyes, and tell them how to do it.


A Voting Puzzle, Some Political Science, and a Nerd Failure Mode

88 ChrisHallquist 10 October 2013 02:10AM

In grade school, I read a series of books titled Sideways Stories from Wayside School by Louis Sachar, whom you may know as the author of the novel Holes, which was made into a movie in 2003. The series included two books of math problems, Sideways Arithmetic from Wayside School and More Sideways Arithmetic from Wayside School, the latter of which included the following problem (paraphrased):

The students in Mrs. Jewls's class have been given the privilege of voting on the height of the school's new flagpole. She has each of them write down what they think would be the best height for the flagpole. The votes are distributed as follows:

  • 1 student votes for 6 feet.
  • 1 student votes for 10 feet.
  • 7 students vote for 25 feet.
  • 1 student votes for 30 feet.
  • 2 students vote for 50 feet.
  • 2 students vote for 60 feet.
  • 1 student votes for 65 feet.
  • 3 students vote for 75 feet.
  • 1 student votes for 80 feet, 6 inches.
  • 4 students vote for 85 feet.
  • 1 student votes for 91 feet.
  • 5 students vote for 100 feet.

At first, Mrs. Jewls declares 25 feet the winning answer, but one of the students who voted for 100 feet convinces her there should be a runoff between 25 feet and 100 feet. In the runoff, each student votes for the height closest to their original answer. But after that round of voting, one of the students who voted for 85 feet wants their turn, so 85 feet goes up against the winner of the previous round of voting, and the students vote the same way, with each student voting for the height closest to their original answer. Then the same thing happens again with the 50 foot option. And so on, with each number, again and again, "very much like a game of tether ball."

Question: if this process continues until it settles on an answer that can't be beaten by any other answer, how tall will the new flagpole be?

Answer (rot13'd): fvkgl-svir srrg, orpnhfr gung'f gur zrqvna inyhr bs gur bevtvany frg bs ibgrf. Naq abj lbh xabj gur fgbel bs zl svefg rapbhagre jvgu gur zrqvna ibgre gurberz.

Why am I telling you this? There's a minor reason and a major reason. The minor reason is that this shows it is possible to explain little-known academic concepts, at least certain ones, in a way that grade schoolers will understand. It's a data point that fits nicely with what Eliezer has written about how to explain things. The major reason, though, is that a month ago I finished my systematic read-through of the sequences and while I generally agree that they're awesome (perhaps more so than most people; I didn't see the problem with the metaethics sequence), I thought the mini-discussion of political parties and voting was on reflection weak and indicative of a broader nerd failure mode.

TLDR (courtesy of lavalamp):

  1. Politicians probably conform to the median voter's views.
  2. Most voters are not the median, so most people usually dislike the winning politicians.
  3. But people dislike the politicians for different reasons.
  4. Nerds should avoid giving advice that boils down to "behave optimally". Instead, analyze the reasons for the current failure to behave optimally and give more targeted advice.


Three ways CFAR has changed my view of rationality

102 Julia_Galef 10 September 2013 06:24PM

The Center for Applied Rationality's perspective on rationality is quite similar to Less Wrong's. In particular, we share many of Less Wrong's differences from what's sometimes called "traditional" rationality, such as Less Wrong's inclusion of Bayesian probability theory and the science on heuristics and biases.

But after spending the last year and a half with CFAR as we've developed, tested, and attempted to teach hundreds of different versions of rationality techniques, I've noticed that my picture of what rationality looks like has shifted somewhat from what I perceive to be the most common picture of rationality on Less Wrong. Here are three ways I think CFAR has come to see the landscape of rationality differently than Less Wrong typically does – not disagreements per se, but differences in focus or approach. (Disclaimer: I'm not speaking for the rest of CFAR here; these are my own impressions.)

 

1. We think less in terms of epistemic versus instrumental rationality.

Formally, the methods of normative epistemic versus instrumental rationality are distinct: Bayesian inference and expected utility maximization. But methods like "use Bayes' Theorem" or "maximize expected utility" are usually too abstract and high-level to be helpful for a human being trying to take manageable steps towards improving her rationality. And when you zoom in from that high-level description of rationality down to the more concrete level of "What five-second mental habits should I be training?" the distinction between epistemic and instrumental rationality becomes less helpful.

Here's an analogy: epistemic rationality is like physics, where the goal is to figure out what's true about the world, and instrumental rationality is like engineering, where the goal is to accomplish something you want as efficiently and effectively as possible. You need physics to do engineering; or I suppose you could say that doing engineering is doing physics, but with a practical goal. However, there's plenty of physics that's done for its own sake, and doesn't have obvious practical applications, at least not yet. (String theory, for example.) Similarly, you need a fair amount of epistemic rationality in order to be instrumentally rational, though there are parts of epistemic rationality that many of us practice for their own sake, and not as a means to an end. (For example, I appreciate clarifying my thinking about free will even though I don't expect it to change any of my behavior.)

In this analogy, many skills we focus on at CFAR are akin to essential math, like linear algebra or differential equations, which compose the fabric of both physics and engineering. It would be foolish to expect someone who wasn't comfortable with math to successfully calculate a planet's trajectory or design a bridge. And it would be similarly foolish to expect you to successfully update like a Bayesian or maximize your utility if you lacked certain underlying skills. Like, for instance: Noticing your emotional reactions, and being able to shift them if it would be useful. Doing thought experiments. Noticing and overcoming learned helplessness. Visualizing in concrete detail. Preventing yourself from flinching away from a thought. Rewarding yourself for mental habits you want to reinforce. 

These and other building blocks of rationality are essential both for reaching truer beliefs, and for getting what you value; they don't fall cleanly into either an "epistemic" or an "instrumental" category. Which is why, when I consider what pieces of rationality CFAR should be developing, I've been thinking less in terms of "How can we be more epistemically rational?" or "How can we be more instrumentally rational?" and instead using queries like, "How can we be more metacognitive?"

 

2. We think more in terms of a modular mind.

The human mind isn't one coordinated, unified agent, but rather a collection of different processes that often aren't working in sync, or even aware of what each other is up to. Less Wrong certainly knows this; see, for example, discussions of anticipations versus professions, aliefs, and metawanting. But in general we gloss over that fact, because it's so much simpler and more natural to talk about "what I believe" or "what I want," even if technically there is no single "I" doing the believing or wanting. And for many purposes that kind of approximation is fine. 

But a rationality-for-humans usually can't rely on that shorthand. Any attempt to change what "I" believe, or optimize for what "I" want, forces a confrontation of the fact that there are multiple, contradictory things that could reasonably be called "beliefs," or "wants," coexisting in the same mind. So a large part of applied rationality turns out to be about noticing those contradictions and trying to achieve coherence, in some fashion, before you can even begin to update on evidence or plan an action.

Many of the techniques we're developing at CFAR fall roughly into the template of coordinating between your two systems of cognition: implicit-reasoning System 1 and explicit-reasoning System 2. For example, knowing when each system is more likely to be reliable. Or knowing how to get System 2 to convince System 1 of something ("We're not going to die if we go talk to that stranger"). Or knowing what kinds of questions System 2 should ask of System 1 to find out why it's uneasy about the conclusion at which System 2 has arrived.

This is all, of course, with the disclaimer that the anthropomorphizing of the systems of cognition, and imagining them talking to each other, is merely a useful metaphor. Even the classification of human cognition into Systems 1 and 2 is probably not strictly true, but it's true enough to be useful. And other metaphors prove useful as well – for example, some difficulties with what feels like akrasia become more tractable when you model your future selves as different entities, as we do in the current version of our "Delegating to yourself" class.

 

3. We're more focused on emotions.

There's relatively little discussion of emotions on Less Wrong, but they occupy a central place in CFAR's curriculum and organizational culture.

It used to frustrate me when people would say something that revealed they held a Straw Vulcan-esque belief that "rationalist = emotionless robot". But now when I encounter that misconception, it just makes me want to smile, because I'm thinking to myself: "If you had any idea how much time we spend at CFAR talking about our feelings…"

Being able to put yourself into particular emotional states seems to make a lot of pieces of rationality easier. For example, for most of us, it's instrumentally rational to explore a wider set of possible actions – different ways of studying, holding conversations, trying to be happy, and so on – beyond whatever our defaults happen to be. And for most of us, inertia and aversions get in the way of that exploration. But getting yourself into "playful" mode (one of the hypothesized primary emotional circuits common across mammals) can make it easier to branch out into a wider swath of Possible-Action Space. Similarly, being able to call up a feeling of curiosity or of "seeking" (another candidate for a primary emotional circuit) can help you conquer motivated cognition and learned blankness.  

And simply being able to notice your emotional state is rarer and more valuable than most people realize. For example, if you're in fight-or-flight mode, you're going to feel more compelled to reject arguments that feel like a challenge to your identity. Being attuned to the signs of sympathetic nervous system activation – that you're tensing up, or that your heart rate is increasing – means you get cues to double-check your reasoning, or to coax yourself into another emotional state.

We also use emotions as sources of data. You can learn to tap into feelings of surprise or confusion to get a sense of how probable you implicitly expect some event to be. Or practice simulating hypotheticals ("What if I knew that my novel would never sell well?") and observing your resultant emotions, to get a clearer picture of your utility function. 

And emotions-as-data can be a valuable check on your System 2's conclusions. One of our standard classes is "Goal Factoring," which entails finding some alternate set of actions through which you can purchase the goods you want more cheaply. So you might reason, "I'm doing martial arts for the exercise and self-defense benefits... but I could purchase both of those things for less time investment by jogging to work and carrying Mace." If you listened to your emotional reaction to that proposal, however, you might notice you still feel sad about giving up martial arts even if you were getting the same amount of exercise and self-defense benefits some other way.

Which probably means you've got other reasons for doing martial arts that you haven't yet explicitly acknowledged -- for example, maybe you just think it's cool. If so, that's important, and deserves a place in your decisionmaking. Listening for those emotional cues that your explicit reasoning has missed something is a crucial step, and to the extent that aspiring rationalists sometimes forget it, I suppose that's a Steel-Manned Straw Vulcan (Steel Vulcan?) that actually is worth worrying about.

Conclusion

I'll name one more trait that unites, rather than divides, CFAR and Less Wrong. We both diverge from "traditional" rationality in that we're concerned with determining which general methods systematically perform well, rather than defending some set of methods as "rational" on a priori criteria alone. So CFAR's picture of what rationality looks like, and how to become more rational, will and should change over the coming years as we learn more about the effects of our rationality training efforts. 

The Robots, AI, and Unemployment Anti-FAQ

47 Eliezer_Yudkowsky 25 July 2013 06:46PM

Q.  Are the current high levels of unemployment being caused by advances in Artificial Intelligence automating away human jobs?

A.  Conventional economic theory says this shouldn't happen.  Suppose it costs 2 units of labor to produce a hot dog and 1 unit of labor to produce a bun, and that 30 units of labor are producing 10 hot dogs in 10 buns.  If automation makes it possible to produce a hot dog using 1 unit of labor instead, conventional economics says that some people should shift from making hot dogs to buns, and the new equilibrium should be 15 hot dogs in 15 buns.  On standard economic theory, improved productivity - including from automating away some jobs - should produce increased standards of living, not long-term unemployment.
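
The arithmetic in that answer can be checked with a few lines of Python. This is only a sketch of the toy model as stated: it assumes labor moves freely between the two goods and that every hot dog needs exactly one bun. The function name `hot_dogs_in_buns` is invented for the illustration.

```python
def hot_dogs_in_buns(total_labor, labor_per_hot_dog, labor_per_bun):
    """Complete hot-dogs-in-buns producible from a fixed pool of labor.

    Assumption: labor is freely reallocated, and each hot dog needs one bun,
    so every finished unit costs the sum of the two labor inputs.
    """
    return total_labor // (labor_per_hot_dog + labor_per_bun)


TOTAL_LABOR = 30

before = hot_dogs_in_buns(TOTAL_LABOR, labor_per_hot_dog=2, labor_per_bun=1)
after = hot_dogs_in_buns(TOTAL_LABOR, labor_per_hot_dog=1, labor_per_bun=1)

# Labor spent on hot dogs before and after automation.
hot_dog_labor_before = before * 2
hot_dog_labor_after = after * 1

print(before, after, hot_dog_labor_before - hot_dog_labor_after)
```

In this toy model the same 30 units of labor stay fully employed; 5 of them move from hot dogs to buns, and output rises from 10 to 15.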

Q.  Sounds like a lovely theory.  As the proverb goes, the tragedy of science is a beautiful theory slain by an ugly fact.  Experiment trumps theory and in reality, unemployment is rising.

A.  Sure.  Except that the happy equilibrium with 15 hot dogs in buns is exactly what happened over the last four centuries, where we went from 95% of the population being farmers to 2% of the population being farmers (in agriculturally self-sufficient developed countries).  We don't live in a world where 93% of the people are unemployed because 93% of the jobs went away.  The first thought of automation removing a job, and thus the economy having one fewer job, has not been the way the world has worked since the Industrial Revolution.  The parable of the hot dog in the bun is how economies really, actually worked in real life for centuries.  Automation followed by re-employment went on for literally centuries in exactly the way that the standard lovely economic model said it should.  The idea that there's a limited amount of work which is destroyed by automation is known in economics as the "lump of labour fallacy".

Q.  But now people aren't being reemployed.  The jobs that went away in the Great Recession aren't coming back, even as the stock market and corporate profits rise again.

A.  Yes.  And that's a new problem.  We didn't get that when the Model T automobile mechanized the entire horse-and-buggy industry out of existence.  The difficulty with supposing that automation is producing unemployment is that automation isn't new, so how can you use it to explain this new phenomenon of increasing long-term unemployment?


Philosophical Landmines

84 [deleted] 08 February 2013 09:22PM

Related: Cached Thoughts

Last summer I was talking to my sister about something. I don't remember the details, but I invoked the concept of "truth", or "reality" or some such. She immediately spit out a cached reply along the lines of "But how can you really say what's true?".

Of course I'd learned some great replies to that sort of question right here on LW, so I did my best to sort her out, but everything I said invoked more confused slogans and cached thoughts. I realized the battle was lost. Worse, I realized she'd stopped thinking. Later, I realized I'd stopped thinking too.

I went away and formulated the concept of a "Philosophical Landmine".

I used to occasionally remark that if you care about what happens, you should think about what will happen as a result of possible actions. This is basically a slam dunk in everyday practical rationality, except that I would sometimes describe it as "consequentialism".

The predictable consequence of this sort of statement is that someone starts going off about hospitals and terrorists and organs and moral philosophy and consent and rights and so on. This may be controversial, but I would say that causing this tangent constitutes a failure to communicate the point. Instead of prompting someone to think, I invoked some irrelevant philosophical cruft. The discussion is now about Consequentialism, the Capitalized Moral Theory, instead of the simple idea of thinking through consequences as an everyday heuristic.

It's not even that my statement relied on a misused term or something; it's that an unimportant choice of terminology dragged the whole conversation in an irrelevant and useless direction.

That is, "consequentialism" was a Philosophical Landmine.

In the course of normal conversation, you passed through an ordinary spot that happened to conceal the dangerous leftovers of past memetic wars. As a result, an intelligent and reasonable human was reduced to a mindless zombie chanting prerecorded slogans. If you're lucky, that's all. If not, you start chanting counter-slogans and the whole thing goes supercritical.

It's usually not so bad, and no one is literally "chanting slogans". There may even be some original phrasings involved. But the conversation has been derailed.

So how do these "philosophical landmine" things work?

It looks like when a lot has been said on a confusing topic, usually something in philosophy, there is a large complex of slogans and counter-slogans installed as cached thoughts around it. Certain words or concepts will trigger these cached thoughts, and any attempt to mitigate the damage will trigger more of them. Of course they will also trigger cached thoughts in other people, which in turn... The result being that the conversation rapidly diverges from the original point to some useless yet heavily discussed attractor.

Notice that whether a particular concept will cause trouble depends on the person as well as the concept. Notice further that this implies that the probability of hitting a landmine scales with the number of people involved and the topic-breadth of the conversation.

Anyone who hangs out on 4chan can confirm that this is the approximate shape of most thread derailments.

Most concepts in philosophy and metaphysics are landmines for many people. The phenomenon also occurs in politics and other tribal/ideological disputes. The ones I'm particularly interested in are the ones in philosophy, but it might be useful to divorce the concept of "conceptual landmines" from philosophy in particular.

Here are some common ones in philosophy:

  • Morality
  • Consequentialism
  • Truth
  • Reality
  • Consciousness
  • Rationality
  • Quantum

Landmines in a topic make it really hard to discuss ideas or do work in these fields, because chances are, someone is going to step on one, and then there will be a big noisy mess that interferes with the rather delicate business of thinking carefully about confusing ideas.

My purpose in bringing this up is mostly to precipitate some terminology and a concept around this phenomenon, so that we can talk about it and refer to it. It is important for concepts to have verbal handles, you see.

That said, I'll finish with a few words about what we can do about it. There are two major forks of the anti-landmine strategy: avoidance, and damage control.

Avoiding landmines is your job. If it is a predictable consequence that something you could say will put people in mindless slogan-playback-mode, don't say it. If something you say makes people go off on a spiral of bad philosophy, don't get annoyed with them, just fix what you say. This is just being a communications consequentialist. Figure out which concepts are landmines for which people, and step around them, or use alternate terminology with fewer problematic connotations.

If it happens, which it does, then as far as I can tell my only effective damage-control strategy is to abort the conversation. I'll probably think that I can take those stupid ideas here and now, but that's just the landmine trying to go supercritical. Just say no. Of course letting on that you think you've stepped on a landmine is probably incredibly rude; keep it to yourself. Subtly change the subject or rephrase your original point without the problematic concepts or something.

A third prong could be playing "philosophical bomb squad", which means permanently defusing landmines by supplying satisfactory nonconfusing explanations of things without causing too many explosions in the process. Needless to say, this is quite hard. I think we do a pretty good job of it here at LW, but for topics and people not yet defused, avoid and abort.

ADDENDUM: Since I didn't make it very obvious, it's worth noting that this happens with rationalists, too, even on this very forum. It is your responsibility not to contain landmines as well as not to step on them. But you're already trying to do that, so I don't emphasize it as much as not stepping on them.

Pinpointing Utility

57 [deleted] 01 February 2013 03:58AM

Following Morality is Awesome. Related: Logical Pinpointing, VNM.

The eternal question, with a quantitative edge: A wizard has turned you into a whale. How awesome is this?

"10.3 Awesomes"

Meditate on this: What does that mean? Does that mean it's desirable? What does that tell us about how awesome it is to be turned into a whale? Explain. Take a crack at it for real. What does it mean for something to be labeled as a certain amount of "awesome" or "good" or "utility"?

What is This Utility Stuff?

Most of us agree that the VNM axioms are reasonable, and that they imply that we should be maximizing this stuff called "expected utility". We know that expectation is just a weighted average, but what's this "utility" stuff?

Well, to start with, it's a logical concept, which means we need to pin it down with the axioms that define it. For the moment, I'm going to conflate utility and expected utility for simplicity's sake. Bear with me. Here are the conditions that are necessary and sufficient to be talking about utility:

  1. Utility can be represented as a single real number.
  2. Each outcome has a utility.
  3. The utility of a probability distribution over outcomes is the expected utility.
  4. The action that results in the highest utility is preferred.
  5. No other operations are defined.

I hope that wasn't too esoteric. The rest of this post will be explaining the implications of those statements. Let's see how they apply to the awesomeness of being turned into a whale:

  1. "10.3 Awesomes" is a real number.
  2. We are talking about the outcome where "A wizard has turned you into a whale".
  3. There are no other outcomes to aggregate with, but that's OK.
  4. There are no actions under consideration, but that's OK.
  5. Oh. Not even taking the value?

Note 5 especially. You can probably look at the number without causing trouble, but if you try to treat it as meaningful for something other than conditions 3 and 4, even accidentally, that's a type error.

Unfortunately, you do not have a finicky compiler that will halt and warn you if you break the rules. Instead, your error will be silently ignored, and you will go on, blissfully unaware that the invariants in your decision system no longer pinpoint VNM utility. (Uh oh.)
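
To make the "type error" framing concrete, here is a rough Python sketch of what such a finicky checker might look like. It is an illustration rather than anything from the VNM formalism: the class `Utility` and the function `expected_utility` are invented names, and Python only catches the misuse at runtime, not at compile time. The wrapper allows exactly two things, comparison and taking expectations over a lottery, and refuses to hand over the raw number.

```python
from dataclasses import dataclass
from functools import total_ordering


@total_ordering
@dataclass(frozen=True)
class Utility:
    """A shielded utility: it can be compared and aggregated, nothing else."""
    _value: float  # internal only; not meant to be read or interpreted directly

    def __lt__(self, other):
        if not isinstance(other, Utility):
            return NotImplemented
        return self._value < other._value

    def __float__(self):
        # Condition 5: no other operations are defined.
        raise TypeError("unshielded utility: taking the raw value is not defined")


def expected_utility(lottery):
    """Aggregate a lottery given as a list of (probability, Utility) pairs."""
    total_probability = sum(p for p, _ in lottery)
    if abs(total_probability - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    return Utility(sum(p * u._value for p, u in lottery))


whale_day = Utility(10.3)   # the "10.3 Awesomes" from the text
normal_day = Utility(0.0)   # hypothetical reference outcome

coin_flip = expected_utility([(0.5, whale_day), (0.5, normal_day)])
print(normal_day < coin_flip < whale_day)
```

Because no arithmetic operators are defined on `Utility`, something like `whale_day + normal_day` or `float(whale_day)` raises a `TypeError` instead of being silently accepted.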

Unshielded Utilities, and Cautions for Utility-Users

Let's imagine that utilities are radioactive: if we are careful with our containment procedures, we can safely combine and compare them, but if we interact with an unshielded utility, it's over; we've committed a type error.

To even get a utility to manifest itself in this plane, we have to do a little ritual. We have to take the ratio between two utility differences. For example, if we want to get a number for the utility of being turned into a whale for a day, we might take the difference between that scenario and what we would otherwise expect to do, and then take the ratio between that difference and the difference between a normal day and a day where we also get a tasty sandwich. (Make sure you take the absolute value of your unit, or you will reverse your utility function, which is a bad idea.)

So the form that the utility of being a whale manifests as might be "500 tasty sandwiches better than a normal day". We have chosen "a normal day" for our datum, and "tasty sandwiches" for our units. Of course we could have just as easily chosen something else, like "being turned into a whale" as our datum, and "orgasms" for our units. Then it would be "0 orgasms better than being turned into a whale", and a normal day would be "-400 orgasms from the whale-day".

You say: "But you shouldn't define your utility like that, because then you are experiencing huge disutility in the normal case."

Wrong, and radiation poisoning, and type error. You tried to "experience" a utility, which is not in the defined operations. Also, you looked directly at the value of an unshielded utility (also known as numerology).

We summoned the utilities into the real numbers, but they are still utilities, and we still can only compare and aggregate them. The summoning only gives us a number that we can numerically do those operations on, which is why we did it. This is the same situation as time, position, velocity, etc, where we have to select units and datums to get actual quantities that mathematically behave like their ideal counterparts.

Sometimes people refer to this relativity of utilities as "positive affine structure" or "invariant up to a scale and shift", which confuses me by making me think of an equivalence class of utility functions with numbers coming out, which don't agree on the actual numbers, but can be made to agree with a linear transform, rather than making me think of a utility function as a space I can measure distances in. I'm an engineer, not a mathematician, so I find it much more intuitive and less confusing to think of it in terms of units and datums, even though it's basically the same thing. This way, the utility function can scale and shift all it wants, and my numbers will always be the same. Equivalently, all agents that share my preferences will always agree that a day as a whale is "400 orgasms better than a normal day", even if they use another basis themselves.
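
The units-and-datum view can be sketched in a few lines of Python. The raw numbers below are arbitrary placeholders, chosen only so that a whale day comes out 500 tasty sandwiches above a normal day as in the earlier example, and `measure` is a made-up helper name. The point of the sketch is that the measured quantity does not change when the underlying utility function is scaled and shifted.

```python
def measure(u, outcome, datum, unit_from, unit_to):
    """Utility of `outcome` relative to `datum`, in units of the step
    from `unit_from` to `unit_to` (a ratio of two utility differences)."""
    unit = abs(u[unit_to] - u[unit_from])  # absolute value keeps the sign sane
    return (u[outcome] - u[datum]) / unit


# Hypothetical raw utilities; only their differences carry meaning.
raw = {"normal day": 0.0, "sandwich day": 1.0, "whale day": 500.0}

# The same preferences after a positive affine transform (scale 2, shift 7).
rescaled = {name: 2.0 * value + 7.0 for name, value in raw.items()}

in_raw = measure(raw, "whale day", "normal day", "normal day", "sandwich day")
in_rescaled = measure(rescaled, "whale day", "normal day", "normal day", "sandwich day")

# Choosing the whale day as datum instead flips the sign of the reading.
from_whale = measure(raw, "normal day", "whale day", "normal day", "sandwich day")

print(in_raw, in_rescaled, from_whale)
```

Both utility functions report the whale day as 500 sandwiches better than a normal day, while switching the datum to the whale day gives a normal day as 500 sandwiches worse, with no change in the preferences being described.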

So what does it mean that being a whale for a day is 400 orgasms better than a normal day? Does it mean I would prefer 400 orgasms to a day as a whale? Nope. Orgasms don't add up like that; I'd probably be quite tired of it by 15. (Remember that "orgasms" were defined as the difference between a day without an orgasm and a day with one, not as the utility of a marginal orgasm in general.) What it means is that I'd be indifferent between a normal day with a 1/400 chance of being a whale, and a normal day with a guaranteed extra orgasm.

That is, utilities are fundamentally about how your preferences react to uncertainty. For example, you don't have to think that each marginal year of life is as valuable as the last, if you don't think you should take a gamble that will double your remaining lifespan with 60% certainty and kill you otherwise. After all, all that such a utility assignment even means is that you would take such a gamble. In the words of VNM:

We have practically defined numerical utility as being that thing for which the calculus of mathematical expectations is legitimate.
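
The lifespan gamble can be made concrete with a small Python sketch. The two utility curves are illustrative assumptions, not claims about anyone's real preferences, and death is modeled as zero remaining years. The name `accepts_gamble` is invented for the example.

```python
import math


def accepts_gamble(utility, remaining_years, p_win=0.6):
    """True if doubling remaining lifespan with probability p_win (and dying
    otherwise) has higher expected utility than keeping the status quo.

    Assumption: dying is modeled as 0 remaining years.
    """
    eu_gamble = p_win * utility(2 * remaining_years) + (1 - p_win) * utility(0)
    return eu_gamble > utility(remaining_years)


def linear(years):
    """Each marginal year is worth exactly as much as the last."""
    return years


def diminishing(years):
    """Hypothetical concave curve: later years add less than earlier ones."""
    return math.sqrt(years)


print(accepts_gamble(linear, 40), accepts_gamble(diminishing, 40))
```

An agent with the linear curve takes the gamble and one with the diminishing curve declines it, which is all the difference between the two utility assignments amounts to.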

But suppose there are very good arguments that have nothing to do with uncertainty for why you should value each marginal life-year as much as the last. What then?

Well, "what then" is that we spend a few weeks in the hospital dying of radiation poisoning, because we tried to interact with an unshielded utility again (utilities are radioactive, remember?). The specific error is that we tried to manipulate the utility function with something other than comparison and aggregation. Touching a utility directly is just as much an error as observing it directly.

But if the only way to define your utility function is with thought experiments about what gambles you would take, and the only use for it is deciding what gambles you would take, then isn't it doing no work as a concept?

The answer is no, but this is a good question because it gets us closer to what exactly this utility function stuff is about. The utility of utility is that defining how you would behave in one gamble puts a constraint on how you would behave in some other related gambles. As with all math, we put in some known facts, and then use the rules to derive some interesting but unknown facts.

For example, if we have decided that we would be indifferent between a tasty sandwich and a 1/500 chance of being a whale for tomorrow, and that we'd be indifferent between a tasty sandwich and a 30% chance of sun instead of the usual rain, then we should also be indifferent between a certain sunny day and a 1/150 chance of being a whale.
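
The 1/150 figure follows from the two stated indifferences, and the derivation is short enough to check with exact fractions. This sketch takes a normal day as the datum and one tasty sandwich as the unit; the variable names are just labels for the illustration.

```python
from fractions import Fraction

# Datum: a normal (rainy) day. Unit: one tasty sandwich above that day.
sandwich = Fraction(1)

# Indifferent between a sandwich and a 1/500 chance of a whale day:
#   sandwich == (1/500) * whale
whale = sandwich / Fraction(1, 500)

# Indifferent between a sandwich and a 30% chance of sun:
#   sandwich == (3/10) * sunny
sunny = sandwich / Fraction(3, 10)

# Probability p at which a certain sunny day matches a p chance of a whale day:
#   sunny == p * whale
p_whale_matching_sun = sunny / whale

print(whale, sunny, p_whale_matching_sun)
```

A whale day comes out at 500 sandwiches, a sunny day at 10/3 sandwiches, and the matching whale probability at 1/150, in agreement with the text.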

Monolithicness and Marginal (In)Dependence

If you are really paying attention, you may be a bit confused, because it seems to you that money or time or some other consumable resource can force you to assign utilities even if there is no uncertainty in the system. That issue is complex enough to deserve its own post, so I'd like to delay it for now.

Part of the solution is that as we defined them, utilities are monolithic. This is the implication of "each outcome has a utility". What this means is that you can't add and recombine utilities by decomposing and recombining outcomes. Being specific, you can't take a marginal whale from one outcome and staple it onto another outcome, and expect the marginal utilities to be the same. For example, maybe the other outcome has no oceans for your marginal whale.

For a bigger example, what we have said so far about the relative value of sandwiches and sunny days and whale-days does not necessarily imply that we are indifferent between a 1/250 chance of being a whale and any of the following:

  • A day with two tasty sandwiches. (Remember that a tasty sandwich was defined as a specific difference, not a marginal sandwich in general, which has no reason to have a consistent marginal value.)

  • A day with a 30% chance of sun and a certain tasty sandwich. (Maybe the tasty sandwich and the sun at the same time is horrifying for some reason. Maybe someone drilled into you as a child that "bread in the sun" was bad bad bad.)

  • etc. You get the idea. Utilities are monolithic and fundamentally associated with particular outcomes, not marginal outcome-pieces.

However, as in probability theory, where each possible outcome technically has its very own probability, in practice it is useful to talk about a concept of independence.

So for example, even though the axioms don't guarantee that it will ever be the case, it may work out in practice that, given some conditions (like there being nothing special about bread in the sun, and my happiness not being near saturation), the utility of a marginal tasty sandwich is independent of a marginal sunny day. That means sun+sandwich is as much better than just sun as just a sandwich is better than baseline, which ultimately means that I am indifferent between {50%: sunny+sandwich; 50%: baseline} and {50%: sunny; 50%: sandwich}, and other such bets. (We need a better solution for rendering probability distributions in prose.)
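Here is a rough sketch of that indifference check in Python. The utility numbers are invented for illustration, and `expected_utility` is a hypothetical helper, not anything defined in this post.

```python
def expected_utility(lottery, utility):
    """lottery is a list of (probability, outcome) pairs."""
    return sum(p * utility[outcome] for p, outcome in lottery)

# Invented utilities where sandwich and sun are marginally independent:
# sunny+sandwich is worth exactly sunny plus sandwich (baseline = 0).
additive = {"baseline": 0.0, "sandwich": 1.0, "sunny": 3.0, "sunny+sandwich": 4.0}

# Same numbers, except that "bread in the sun" is bad for some reason.
interacting = {**additive, "sunny+sandwich": 2.5}

bet_a = [(0.5, "sunny+sandwich"), (0.5, "baseline")]
bet_b = [(0.5, "sunny"), (0.5, "sandwich")]

# Zero gap means indifference between the two bets.
gap_additive = expected_utility(bet_a, additive) - expected_utility(bet_b, additive)
gap_interacting = expected_utility(bet_a, interacting) - expected_utility(bet_b, interacting)
print(gap_additive, gap_interacting)
```

With the additive numbers the two bets come out equal; once the combined outcome is worth less than the sum of its parts, the indifference disappears.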

Notice that the independence of marginal utilities can depend on conditions and that independence is with respect to some other variable, not a general property. The utility of a marginal tasty sandwich is not independent of whether I am hungry, for example.

There is a lot more to this independence thing (and linearity, and risk aversion, and so on), so it deserves its own post. For now, the point is that the monolithicness thing is fundamental, but in practice we can sometimes look inside the black box and talk about independent marginal utilities.

Dimensionless Utility

I liked this quote from the comments of Morality is Awesome:

Morality needs a concept of awfulness as well as awesomeness. In the depths of hell, good things are not an option and therefore not a consideration, but there are still choices to be made.

Let's develop that second sentence a bit more. If all your options suck, what do you do? You still have to choose. So let's imagine we are in the depths of hell and see what our theories have to say about it:

Day 78045. Satan has presented me with three options:

  1. Go on a date with Satan Himself. This will involve romantically torturing souls together, subtly steering mortals towards self-destruction, watching people get thrown into the lake of fire, and some very unsafe, very nonconsensual sex with the Adversary himself.

  2. Paperclip the universe.

  3. Satan's court wizard will turn me into a whale and release me into the lake of fire, to roast slowly for the next month, kept alive by twisted black magic.

Wat do?

They all seem pretty bad, but "pretty bad" is not a utility. We could quantify paperclipping as a couple hundred billion lives lost. Being a whale in the lake of fire would be awful, but a bounded sort of awful. A month of endless horrible torture. The "date" is having to be on the giving end of what would more or less happen anyway, and then getting savaged by Satan. Still, none of these are utilities.

Coming up with actual utility numbers for these in terms of tasty sandwiches and normal days is hard; it would be like measuring the microkelvin temperatures of your physics experiment with a Fahrenheit kitchen thermometer; in principle it might work, but it isn't the best tool for the job. Instead, we'll use a different scheme this time.

Engineers (and physicists?) sometimes transform problems into a dimensionless form that removes all redundant information from the problem. For example, for a heat conduction problem, we might define an isomorphic dimensionless temperature so that real temperatures between 78 and 305 C become dimensionless temperatures between 0 and 1. Transforming a problem into dimensionless form is nearly always helpful, often in really surprising ways. We can do this with utility too.
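As a minimal sketch of that transformation (the function name `dimensionless` is made up for illustration), the temperature range from the example maps onto 0 to 1 like this:

```python
def dimensionless(value, low, high):
    """Rescale linearly so that low maps to 0 and high maps to 1."""
    return (value - low) / (high - low)

T_LOW, T_HIGH = 78.0, 305.0  # the temperature range from the example, in C

for temp in (T_LOW, 191.5, T_HIGH):
    print(temp, "->", dimensionless(temp, T_LOW, T_HIGH))
```

The units and the arbitrary zero point both drop out; what is left is only where a value sits between the two ends of the range.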

Back to the depths of hell. The date with Satan is clearly the best option, so it gets dimensionless utility 1. The paperclipper gets 0. On that scale, I'd say roasting in the lake of fire is like 0.999 or so, but that might just be scope insensitivity. We'll take it for now.

The advantages of this approach are:

  1. The numbers are more intuitive. -5e12 QALYs, -1 QALY, and -50 QALYs from a normal day, or the equivalent in tasty sandwiches, just don't have the same feeling of clarity as 0, 1 and .999. (For me at least. And yes I know those numbers don't quite match.)

  2. Not having to relate the problem quantities to far-away datums or drastically inappropriate units (tasty sandwiches for this problem) makes the numbers easier and more direct to come up with. Also we have to come up with fewer of them. The problem is self-contained.

  3. If defined right, the connection between probability and utility becomes extra-clear. For example: What chance between a Satan-date and a paperclipper would make me indifferent with a lake-of-fire-whale-month? 0.999! Unitless magic!

  4. All confusing redundant information (like negative signs) is removed, which makes it harder to accidentally do numerology or commit a type error.

  5. All redundant information is removed, which means you find many more similarities between problems. The value of this in general cannot be overstated. Just look at the generalizations made about Reynolds number! "[vortex shedding] occurs for any fluid, size, and speed, provided that Re between ~40 and 10^3". What! You can just say that in general? Magic! I haven't actually done enough utility problems to know that we'll find stuff like that but I trust dimensionless form.
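Points 3 and 4 can be checked with a small sketch. The raw utility numbers below are invented, as are the function names; the only thing that matters is that the whale-month sits 999/1000 of the way from the paperclipper up to the date.

```python
from fractions import Fraction

def to_dimensionless(utilities):
    """Rescale so the worst outcome gets 0 and the best gets 1."""
    lo, hi = min(utilities.values()), max(utilities.values())
    return {name: (u - lo) / (hi - lo) for name, u in utilities.items()}

def indifference_probability(u_best, u_worst, u_middle):
    """Chance p of the best outcome (else the worst) that matches u_middle.

    Solves p * u_best + (1 - p) * u_worst == u_middle for p.
    """
    return (u_middle - u_worst) / (u_best - u_worst)

# Made-up raw utilities in some arbitrary unit; only their spacing matters.
raw = {"date": Fraction(0), "whale": Fraction(-1), "paperclip": Fraction(-1000)}

# The same preferences in a different unit with a different zero point
# (any positive scale factor and any offset would do).
shifted = {name: 7 * u + 300 for name, u in raw.items()}

print(to_dimensionless(raw))
print(to_dimensionless(shifted))
print(indifference_probability(raw["date"], raw["paperclip"], raw["whale"]))
```

Both raw scales collapse to the same dimensionless numbers, signs included, and the dimensionless utility of the whale-month is exactly the chance of a Satan-date (versus paperclips) at which I'd be indifferent.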

Anyways, it seems that going on that date is what I ought to do. So did we need a concept of awfulness? Did it matter that all the options sucked? Nope; the decision was isomorphic in every way to choosing lunch between a BLT, a turkey club, and a handful of dirt.

There are some assumptions in that lunch bit, and it's worth discussing. It seems counterintuitive, or even wrong, to say that your decision process when faced with lunch should be the same as when faced with a decision involving torture, rape, and paperclips. The latter seems somehow more important. Where does that come from? Is it right?

This may deserve a bigger discussion, but basically, if you have finite resources (thought-power, money, energy, stress) that are conserved or even related across decisions, you get coupling of "different" decisions in a way that we didn't have here. Your intuitions are calibrated for that case. Once you have decoupled the decision by coming up with the actual candidate options, the depths-of-hell decision and the lunch decision really are totally isomorphic. I'll probably address this properly later, if I discuss instrumental utility of resources.

Anyways, once you put the problem in dimensionless form, a lot of decisions that seemed very different become almost the same, and a lot of details that seemed important or confusing just disappear. Bask in the clarifying power of a good abstraction.

Utility is Personal

So far we haven't touched the issue of interpersonal utility. That's because that topic isn't actually about VNM utility! There was nothing in the axioms above about there being a utility for each {person, outcome} pair, only for each outcome.

It turns out that if you try to compare utilities between agents, you have to touch unshielded utilities, which means you get radiation poisoning and go to type-theory hell. Don't try it.

And yet, it seems like we ought to care about what others prefer, and not just our own self-interest. But it seems like that belongs inside the utility function, in moral philosophy, not out here in decision theory.

VNM has nothing to say on the issue of utilitarianism besides the usual preference-uncertainty interaction constraints, because VNM is about the preferences of a single agent. If that single agent cares about the preferences of other agents, that goes inside the utility function.

Conversely, because VNM utility is out here, axiomatized for the sovereign preferences of a single agent, we don't much expect it to show up in there, in a discussion of utilitarian preference aggregation. In fact, if we do encounter it in there, it's probably a sign of a failed abstraction.

Living with Utility

Let's go back to how much work utility does as a concept. I've spent the last few sections hammering on the work that utility does not do, so you may ask "It's nice that utility theory can constrain our bets a bit, but do I really have to define my utility function by pinning down the relative utilities of every single possible outcome?"

Sort of. You can take shortcuts. We can, for example, wonder all at once whether, for all possible worlds where such is possible, you are indifferent between saving n lives and {50%: saving 2*n; 50%: saving 0}.
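To see what such a sweeping rule commits you to, here is a hedged sketch comparing two made-up candidate utility functions over lives saved. Neither is claimed to be anyone's actual utility function, and `gamble_gap` is a name invented for this example.

```python
import math

def gamble_gap(utility, n):
    """Expected utility of {50%: save 2n; 50%: save 0} minus saving n for sure.

    Zero means indifference; negative means the sure thing is preferred.
    """
    return 0.5 * utility(2 * n) + 0.5 * utility(0) - utility(n)

def linear(lives):
    # Every additional life counts the same.
    return float(lives)

def concave(lives):
    # Diminishing marginal utility of lives (purely illustrative).
    return math.sqrt(lives)

for n in (1, 10, 1000):
    print(n, gamble_gap(linear, n), gamble_gap(concave, n))
```

A utility that is linear in lives saved passes the indifference test for every n tried, while the concave one prefers the sure thing every time, so accepting the rule across the board rules out that kind of diminishing returns.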

If that seems reasonable and doesn't break in any case you can think of, you might keep it around as a heuristic in your ad-hoc utility function. But then maybe you find a counterexample where you don't actually prefer the implications of such a rule. So you have to refine it a bit to respond to this new argument. This is OK; the math doesn't want you to do things you don't want to.

So you can save a lot of small thought experiments by doing the right big ones, like above, but the more sweeping a generalization you make, the more probable it is that it contains an error. In fact, conceptspace is pretty huge, so trying to construct a utility function without inside information is going to take a while no matter how you approach it. Something like disassembling the algorithms that produce your intuitions would be much more efficient, but that's probably beyond science right now.

In any case, in the near term, before we figure out how to formally reason the whole thing out in advance, we have to get by with some good heuristics and our current intuitions, with a pinch of last-minute sanity checking against the VNM rules. Ugly, but better than nothing.

The whole project is made quite a bit harder in that we are not just trying to reconstruct an explicit utility function from revealed preference; we are trying to construct a utility function for a system that doesn't even currently have consistent preferences.

At some point, either the concept of utility isn't really improving our decisions, or it will come into conflict with our intuitive preferences. In some cases it's obvious how to resolve the conflict; in others, not so much.

But if VNM contradicts our current preferences, why do we think it's a good idea at all? Surely it's not wise to be tampering with our very values?

The reason we like VNM is that we have a strong meta-intuition that our preferences ought to be internally consistent, and VNM seems to be the only way to satisfy that. But it's good to remember that this is just another intuition, to be weighed against the rest. Are we ironing out garbage inconsistencies, or losing valuable information?

At this point I'm dangerously out of my depth. As far as I can tell, the great project of moral philosophy is an adult problem, not suited for mere mortals like me. Besides, I've rambled long enough.

Conclusions

What a slog! Let's review:

  • Maximize expected utility, where utility is just an encoding of your preferences that ensures a sane reaction to uncertainty.

  • Don't try to do anything else with utilities, or demons may fly out of your nose. This especially includes looking at the sign or magnitude, and comparing between agents. I call these things "numerology" or "interacting with an unshielded utility".

  • The default for utilities is that utilities are monolithic and inseparable from the entire outcome they are associated with. It takes special structure in your utility function to be able to talk about the marginal utility of something independently of particular outcomes.

  • We have to use the difference-and-ratio ritual to summon the utilities into the real numbers. Record utilities using explicit units and datum, and use dimensionless form for your calculations, which will make many things much clearer and more robust.

  • If you use a VNM basis, you don't need a concept of awfulness, just awesomeness.

  • If you want to do philosophy about the shape of your utility function, make sure you phrase it in terms of lotteries, because that's what utility is about.

  • The desire to use VNM is just another moral intuition in the great project of moral philosophy. It is conceivable that you will have to throw it out if it causes too much trouble.

  • VNM says nothing about your utility function. Consequentialism, hedonism, utilitarianism, etc. are up to you.

Best of Rationality Quotes, 2012 Edition

31 DanielVarga 26 January 2013 03:03AM

I finished creating the 2012 edition of the Best of Rationality Quotes collection. (Here is last year's.)

Best of Rationality Quotes 2012 (500kB page, 434 quotes)
and Best of Rationality Quotes 2009-2012 (1200kB page, 1140 quotes)

The page was built by a short script (source code here) from all the LW Rationality Quotes threads so far. (We had such a thread each month since April 2009.) The script collects all comments with karma score 10 or more, and sorts them by score. Replies are not collected, only top-level comments.

As is now usual, I provide various statistics and top-lists based on the data. (Source code for these is also at the above link, see the README.) I added these as comments to the post:

Morality is Awesome

86 [deleted] 06 January 2013 03:21PM

(This is a semi-serious introduction to the metaethics sequence. You may find it useful, but don't take it too seriously.)

Meditate on this: A wizard has turned you into a whale. Is this awesome?

Is it?

"Maybe? I guess it would be pretty cool to be a whale for a day. But only if I can turn back, and if I stay human inside and so on. Also, that's not a whale.

"Actually, a whale seems kind of specific, and I'd be surprised if that was the best thing the wizard can do. Can I have something else? Eternal happiness maybe?"

Meditate on this: A wizard has turned you into orgasmium, doomed to spend the rest of eternity experiencing pure happiness. Is this awesome?

...

"Kind of... That's pretty lame actually. On second thought I'd rather be the whale; at least that way I could explore the ocean for a while.

"Let's try again. Wizard: maximize awesomeness."

Meditate on this: A wizard has turned himself into a superintelligent god, and is squeezing as much awesomeness out of the universe as it could possibly support. This may include whales and starships and parties and jupiter brains and friendship, but only if they are awesome enough. Is this awesome?

...

"Well, yes, that is awesome."


What we just did there is called Applied Ethics. Applied ethics is about what is awesome and what is not. Parties with all your friends inside superintelligent starship-whales are awesome. ~666 children dying of hunger every hour is not.

(There is also normative ethics, which is about how to decide if something is awesome, and metaethics, which is about something or other that I can't quite figure out. I'll tell you right now that those terms are not on the exam.)

"Wait a minute!" you cry, "What is this awesomeness stuff? I thought ethics was about what is good and right."

I'm glad you asked. I think "awesomeness" is what we should be talking about when we talk about morality. Why do I think this?

  1. "Awesome" is not a philosophical landmine. If someone encounters the word "right", all sorts of bad philosophy and connotations send them spinning off into the void. "Awesome", on the other hand, has no philosophical respectability, hence no philosophical baggage.

  2. "Awesome" is vague enough to capture all your moral intuition by the well-known mechanisms behind fake utility functions, and meaningless enough that this is no problem. If you think "happiness" is the stuff, you might get confused and try to maximize actual happiness. If you think awesomeness is the stuff, it is much harder to screw it up.

  3. If you do manage to actually implement "awesomeness" as a maximization criterion, the results will be actually good. That is, "awesome" already refers to the same things "good" is supposed to refer to.

  4. "Awesome" does not refer to anything else. You think you can just redefine words, but you can't, and this causes all sorts of trouble for people who overload "happiness", "utility", etc.

  5. You already know that you know how to compute "Awesomeness", and it doesn't feel like it has a mysterious essence that you need to study to discover. Instead it brings to mind concrete things like starship-whale math-parties and not-starving children, which is what we want anyways. You are already enabled to take joy in the merely awesome.

  6. "Awesome" is implicitly consequentialist. "Is this awesome?" engages you to think of the value of a possible world, as opposed to "Is this right?" which engages you to think of virtues and rules. (Those things can be awesome sometimes, though.)

I find that the above is true about me, and is nearly all I need to know about morality. It handily inoculates against the usual confusions, and sets me in the right direction to make my life and the world more awesome. It may work for you too.

I would append the additional facts that if you wrote it out, the dynamic procedure to compute awesomeness would be hellishly complex, and that right now, it is only implicitly encoded in human brains, and nowhere else. Also, if the great procedure to compute awesomeness is not preserved, the future will not be awesome. Period.

Also, it's important to note that what you think of as awesome can be changed by considering things from different angles and being exposed to different arguments. That is, the procedure to compute awesomeness is dynamic and created already in motion.

If we still insist on being confused, or if we're just curious, or if we need to actually build a wizard to turn the universe into an awesome place (though we can leave that to the experts), then we can see the metaethics sequence for the full argument, details, and finer points. I think the best post (and the one to read if only one) is joy in the merely good.

How To Have Things Correctly

57 Alicorn 17 October 2012 06:10AM

I think people who are not made happier by having things either have the wrong things, or have them incorrectly.  Here is how I get the most out of my stuff.

Money doesn't buy happiness.  If you want to try throwing money at the problem anyway, you should buy experiences like vacations or services, rather than purchasing objects.  If you have to buy objects, they should be absolute and not positional goods; positional goods just put you on a treadmill and you're never going to catch up.

Supposedly.

I think getting value out of spending money, owning objects, and having positional goods are all three skills that people often don't have naturally but can develop.  I'm going to focus mostly on the middle skill: how to have things correctly1.

continue reading »

My Algorithm for Beating Procrastination

81 lukeprog 10 February 2012 02:48AM

Part of the sequence: The Science of Winning at Life

After three months of practice, I now use a single algorithm to beat procrastination most of the times I face it.1 It probably won't work for you quite like it did for me, but it's the best advice on motivation I've got, and it's a major reason I'm known for having the "gets shit done" property. There are reasons to hope that we can eventually break the chain of akrasia; maybe this post is one baby step in the right direction.

How to Beat Procrastination explained our best current general theory of procrastination, called "temporal motivation theory" (TMT). As an exercise in practical advice backed by deep theories, this post explains the process I use to beat procrastination — a process implied by TMT.

As a reminder, here's a rough sketch of how motivation works according to TMT:

[Image: the procrastination equation]

Or, as Piers Steel summarizes:

Decrease the certainty or the size of a task's reward — its expectancy or its value — and you are unlikely to pursue its completion with any vigor. Increase the delay for the task's reward and our susceptibility to delay — impulsiveness — and motivation also dips.
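The equation itself appears here only as an image caption. Assuming it has the form usually quoted from Steel, Motivation = (Expectancy × Value) / (Impulsiveness × Delay), the summary above can be sketched as a toy function. The input numbers are invented and carry no units; only the direction of each effect is meant to be illustrative.

```python
def motivation(expectancy, value, impulsiveness, delay):
    """Assumed TMT form: (expectancy * value) / (impulsiveness * delay)."""
    return (expectancy * value) / (impulsiveness * delay)

# Invented baseline numbers, just to compare directions of change.
base = motivation(expectancy=0.8, value=10.0, impulsiveness=2.0, delay=5.0)

# Less certain reward, smaller reward, more impulsive, longer delay.
less_certain = motivation(0.4, 10.0, 2.0, 5.0)
smaller_reward = motivation(0.8, 5.0, 2.0, 5.0)
more_impulsive = motivation(0.8, 10.0, 4.0, 5.0)
more_delayed = motivation(0.8, 10.0, 2.0, 10.0)

print(base, less_certain, smaller_reward, more_impulsive, more_delayed)
```

Each of the four changes lowers the result relative to the baseline, which matches Steel's description.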

Of course, my motivation system is more complex than that. P.J. Eby likens TMT (as a guide for beating procrastination) to the "fuel, air, ignition, and compression" plan for starting your car: it might be true, but a more useful theory would include details and mechanism.

That's a fair criticism. Just as an fMRI captures the "big picture" of brain function at low resolution, TMT captures the big picture of motivation. This big picture helps us see where we need to work at the gears-and-circuits level, so we can become the goal-directed consequentialists we'd like to be.

So, I'll share my four-step algorithm below, and tackle the gears-and-circuits level in later posts.

continue reading »
