Related: Pinpointing Utility
Let's go for lunch at the Hypothetical Diner; I have something I want to discuss with you.
We will pick our lunch from the set of possible orders, and we will receive a meal drawn from the set of possible meals, O. In general, each possible order has an associated probability distribution over O. The Hypothetical Diner takes care to simplify your analysis: the probability distribution is trivial; you always get exactly what you ordered.
Again to simplify your lunch, the Hypothetical Diner offers only two choices on the menu: the Soup, and the Bagel.
To then complicate things so that we have something to talk about, suppose there is some set M of ways other things could be that may affect your preferences. Perhaps you have sore teeth on some days.
Suppose for the purposes of this hypothetical lunch date that you are VNM rational. Shocking, I know, but the hypothetical results are clear: you have a utility function, U. The domain of the utility function is the product of all the variables that affect your preferences (which meal, and whether your teeth are sore): U: M x O -> utility.
In our case, if your teeth are sore, you prefer the soup, as it is less painful. If your teeth are not sore, you prefer the bagel, because it is tastier:
U(sore & soup) > U(sore & bagel)
U(~sore & soup) < U(~sore & bagel)
Your global utility function can be partially applied to some m in M to get an "object-level" utility function U_m: O -> utility. Note that the restrictions of U made in this way need not have any resemblance to each other; they are completely separate.
It is convenient to think about and define these restricted "utility function patches" separately. Let's pick some units and datums so we can get concrete numbers for our utilities:
U_sore(soup) = 1 ; U_sore(bagel) = 0
U_unsore(soup) = 0 ; U_unsore(bagel) = 1
Those are separate utility functions now, so we could pick units and datums separately. Because of this, the sore numbers are totally incommensurable with the unsore numbers. Don't try to compare them between the utility functions or you will get type-poisoning. The actual numbers are just a straightforward encoding of the preferences mentioned above.
What if we are unsure about where we fall in M? Say you won't know whether your teeth are sore until you take the first bite; that is, we have a probability distribution over M. Maybe we are 70% sure that your teeth won't hurt you today. What should you order?
Well, it's usually a good idea to maximize expected utility:
EU(soup) = 30%*U(sore&soup) + 70%*U(~sore&soup) = ???
EU(bagel) = 30%*U(sore&bagel) + 70%*U(~sore&bagel) = ???
Suddenly we need those utility function patches to be commensurable so that we can actually compute these, but we went and defined them separately. Darn. All is not lost, though: recall that they are just restrictions of a global utility function to a particular soreness-circumstance, with some (positive) linear transforms, f_m, thrown in to make the numbers nice:
f_sore(U(sore&soup)) = 1 ; f_sore(U(sore&bagel)) = 0
f_unsore(U(~sore&soup)) = 0 ; f_unsore(U(~sore&bagel)) = 1
At this point, it's just a bit of clever function-inverting and all is dandy. We can pick some linear transform g to be canonical, and transform all the utility function patches into that basis. So for all m, we can get g(U(m & o)) by inverting the f_m and then applying g:

g.U(sore & x) = (g.inv(f_sore).f_sore)(U(sore & x)) = k_sore*U_sore(x) + c_sore
g.U(~sore & x) = (g.inv(f_unsore).f_unsore)(U(~sore & x)) = k_unsore*U_unsore(x) + c_unsore

(I'm using . to represent composition of those transforms. I hope that's not too confusing.)
Linear transforms are really nice; all the inverting and composing collapses down to a scale k and an offset c for each utility function patch. Now we've turned our bag of utility function patches into a utility function quilt! One more bit of math before we get back to deciding what to eat:
EU(x) = P(sore) *(k_sore *U_sore(x) + c_sore) + (1-P(sore))*(k_unsore*U_unsore(x) + c_unsore)
Notice that the terms involving c_m do not involve x, meaning that the c_m terms don't affect our decision, so we can cancel them out and forget they ever existed! This is only true because I've implicitly assumed that P(m) does not depend on our actions. If it did, like if we could go to the dentist or take some painkillers, then it would be P(m | x), and c_m would be relevant in the whole joint decision.
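To make the offset-cancellation concrete, here is a quick numerical check in Python. The patch utilities are the illustrative lunch numbers from this post; the offsets are arbitrary made-up values:

```python
# Expected utility with per-patch scale k_m and offset c_m.
# Claim: the offsets c_m shift every option's EU by the same amount,
# so they never change which option maximizes EU (as long as P(m)
# does not depend on our action).

def eu(option, p_sore, k, c, u):
    return (p_sore       * (k["sore"]   * u["sore"][option]   + c["sore"]) +
            (1 - p_sore) * (k["unsore"] * u["unsore"][option] + c["unsore"]))

u = {"sore":   {"soup": 1, "bagel": 0},
     "unsore": {"soup": 0, "bagel": 1}}
k = {"sore": 1, "unsore": 1/5}

for c in [{"sore": 0, "unsore": 0}, {"sore": 42, "unsore": -7}]:
    diff = eu("soup", 0.3, k, c, u) - eu("bagel", 0.3, k, c, u)
    print(round(diff, 10))  # same EU gap regardless of the offsets
```

Both iterations print the same gap, so the choice of datum for each patch really is irrelevant to the decision.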
We can define the canonical utility basis g to be whatever we like (among positive linear transforms); for example, we can make it equal to f_sore so that we can at least keep the simple numbers from U_sore. Then we throw all the c_m's away, because they don't matter. Then it's just a matter of getting the remaining scaling constants k_m.
Ok, sorry, those last few paragraphs were rather abstract. Back to lunch. We just need to define these mysterious scaling constants and then we can order lunch. There is only one left: k_unsore. In general there will be n - 1 of them, where n is the size of M. I think the easiest way to approach this is to let k_unsore = 1/5 and see what that implies:
g.U(sore & soup) = 1 ; g.U(sore & bagel) = 0
g.U(~sore & soup) = 0 ; g.U(~sore & bagel) = 1/5
EU(soup) = (1-P(~sore))*1 = 0.3
EU(bagel) = P(~sore)*k_unsore = 0.14
EU(soup) > EU(bagel)
After all the arithmetic, it looks like if k_unsore = 1/5, then even though we expect you to have non-sore teeth with P(sore) = 0.3, we are unsure enough, and the relative importance is big enough, that we should play it safe and go with the soup anyway. In general we would choose soup if P(~sore) < 1/(k_unsore + 1), or equivalently, if k_unsore < (1 - P(~sore))/P(~sore).
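If you want to sanity-check that algebra, a brute-force sweep in Python (using the same 0/1 patch utilities as above) confirms the threshold:

```python
# Check the decision rule: prefer soup iff P(~sore) < 1/(k_unsore + 1).
# Brute-force over a grid of probabilities and a few scales.

def prefers_soup(p_unsore, k_unsore):
    eu_soup  = (1 - p_unsore) * 1      # soup only pays off when sore
    eu_bagel = p_unsore * k_unsore     # bagel only pays off when not sore
    return eu_soup > eu_bagel

for k in [0.2, 1.0, 5.0]:
    for i in range(1, 100):
        p = i / 100
        assert prefers_soup(p, k) == (p < 1 / (k + 1))
print("decision rule checks out")
```

At the lunch numbers (P(~sore) = 0.7, k_unsore = 0.2), the rule indeed comes out in favor of the soup.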
k is somehow the relative importance of possible preference structures under uncertainty. A smaller k in this lunch example means that the tastiness of a bagel over a soup is small relative to the pain saved by eating the soup instead. With this intuition, we can see that 1/5 is a somewhat reasonable value for this scenario, while, for example, 1 would not be, and neither would a value very much smaller.
What if we are uncertain about k? Are we simply pushing the problem up some meta-chain? It turns out that no, we are not. Because k is linearly related to utility, you can simply use its expected value if it is uncertain.
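A small numerical illustration of why this works: expected utility is linear in k, so averaging the expected utilities over possible values of k gives exactly the same answer as plugging in E[k]. (The candidate k values below are made up.)

```python
# If k_unsore is itself uncertain, EU is linear in k, so averaging
# over possible k values equals using the single number E[k].
import random

random.seed(0)
ks = [random.uniform(0.1, 0.5) for _ in range(10)]  # possible values of k_unsore
pk = [1 / len(ks)] * len(ks)                        # uniform belief over them

def eu_bagel(p_unsore, k):
    return p_unsore * k

avg_over_k   = sum(p * eu_bagel(0.7, k) for p, k in zip(pk, ks))
using_mean_k = eu_bagel(0.7, sum(p * k for p, k in zip(pk, ks)))
assert abs(avg_over_k - using_mean_k) < 1e-12
print("E[k] substitution is exact")
```

This is the sense in which uncertainty about k does not push the problem up a meta-chain: the expectation collapses back into a single effective scale.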
It's kind of ugly to have these k_m's and these U_m's, so we can just reason over the product K x M instead of over M and K separately. This is nothing weird; it just means we have more utility function patches (many of which encode the exact same object-level preferences).
In the most general case, the utility function patches in K x M are the space of all functions O -> RR, with offset equivalence but not scale equivalence. (Sovereign utility functions have full linear-transform equivalence, but these patches are only equivalent under offset.) Remember, though, that these are just restricted patches of a single global utility function.
So what is the point of all this? Are we just playing in the VNM sandbox, or is this result actually interesting for anything besides sore teeth?
Perhaps Moral/Preference Uncertainty? I didn't mention it until now because it's easier to think about lunch than a philosophical minefield, but it is the point of this post. Sorry about that. Let's conclude with everything restated in terms of moral uncertainty.
If we have:
- A set of object-level outcomes, O,
- A set of "epiphenomenal" (outside of O) "moral" outcomes, M,
- A probability distribution over M, possibly correlated with uncertainty about O, but not in a way that allows our actions to influence uncertainty over M (that is, assuming moral facts cannot be changed by your actions),
- A utility function over O for each possible value of M (these can be arbitrary VNM-rational moral theories, as long as they share the same object-level),
- And we wish to be VNM rational over whatever uncertainty we have,
then we can quilt together a global utility function U: (M, K, O) -> RR, where U(m, k, o) = k*U_m(o), so that EU(o) is the sum over all m of P(m)*E(k | m)*U_m(o).
Somehow this all seems like legal VNM.
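Here is the whole quilt as a short, illustrative Python sketch. The probabilities, scales, and patch utilities are just the lunch numbers from earlier, standing in for moral theories:

```python
# A sketch of the "quilted" utility function from the summary above:
# EU(o) = sum over m of P(m) * E[k | m] * U_m(o).
# All names and numbers are illustrative, not from any canonical theory.

def quilted_eu(option, p_m, e_k, u_m):
    return sum(p_m[m] * e_k[m] * u_m[m][option] for m in p_m)

p_m = {"sore": 0.3, "unsore": 0.7}   # belief over "moral" outcomes
e_k = {"sore": 1.0, "unsore": 0.2}   # expected scale for each patch
u_m = {"sore":   {"soup": 1, "bagel": 0},
       "unsore": {"soup": 0, "bagel": 1}}

best = max(["soup", "bagel"], key=lambda o: quilted_eu(o, p_m, e_k, u_m))
print(best)  # soup (EU 0.3 vs. the bagel's 0.14)
```

Swapping in different patch utilities and scales is all it takes to model different sets of candidate moral theories.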
So. Just the possible object-level preferences and a probability distribution over those is not enough to define our behaviour. We need to know the scale for each so we know how to act when uncertain. This is analogous to the switch from ordinal preferences to interval preferences when dealing with object-level uncertainty.
Now we have a well-defined framework for reasoning about preference uncertainty, if all our possible moral theories are VNM rational, moral facts are immutable, and we have a joint probability distribution over K x M.
In particular, updating your moral beliefs upon hearing new arguments is no longer a mysterious dynamic; it is just a Bayesian update over possible moral theories.
This requires a "moral prior" that correlates moral outcomes and their relative scales to the observable evidence. In the lunch example, we implicitly used such a moral prior to update on observable thought experiments and conclude that 1/5 was a plausible value for k_unsore.
Moral evidence is probably things like preference thought-experiments, neuroscience and physics results, etc. The actual model for this, and discussion about the issues with defining and reasoning on such a prior are outside the scope of this post.
This whole argument couldn't prove its way out of a wet paper bag, and is merely suggestive. Bits and pieces may be found incorrect, and formalization might change things a bit.
This framework requires that we have already worked out the outcome-space O (which we haven't), have limited our moral confusion to a set of VNM-rational moral theories over O (which we haven't), and have defined a "moral prior" so we can have a probability distribution over moral theories and their weights (which we haven't).
Nonetheless, we can sometimes get those things in special limited cases, and even in the general case, having a model for moral uncertainty and updating is a huge step up from the terrifying confusion I (and everyone I've talked to) had before working this out.
Keith Stanovich is a leading expert on the cogsci of rationality, but he has also written on a problem related to CEV, that of the "rational integration" of our preferences. Here he is on pages 81-86 of Rationality and the Reflective Mind (currently my single favorite book on rationality, out of the dozens I've read):
All multiple-process models of mind capture a phenomenal aspect of human decision making that is of profound importance — that humans often feel alienated from their choices. We display what folk psychology and philosophers term weakness of will. For example, we continue to smoke when we know that it is a harmful habit. We order a sweet after a large meal, merely an hour after pledging to ourselves that we would not. In fact, we display alienation from our responses even in situations that do not involve weakness of will — we find ourselves recoiling from the sight of a disfigured person even after a lifetime of dedication to diversity and inclusion.
This feeling of alienation — although emotionally discomfiting when it occurs — is actually a reflection of a unique aspect of human cognition: the use of Type 2 metarepresentational abilities to enable a cognitive critique of our beliefs and our desires. Beliefs about how well we are forming beliefs become possible because of such metarepresentation, as does the ability to evaluate one's desires — to desire to desire differently...
...There is a philosophical literature on the notion of higher-order evaluation of desires... For example, in a classic paper on second-order desires, Frankfurt (1971) speculated that only humans have such metarepresentational states. He evocatively termed creatures without second-order desires (other animals, human babies) wantons... A wanton simply does not reflect on his/her goals. Wantons want — but they do not care what they want.
Nonwantons, however, can represent a model of an idealized preference structure — perhaps, for example, a model based on a superordinate judgment of long-term lifespan considerations... So a human can say: I would prefer to prefer not to smoke. This second-order preference can then become a motivational competitor to the first-order preference. At the level of second-order preferences, I prefer to prefer to not smoke; nevertheless, as a first-order preference, I prefer to smoke. The resulting conflict signals that I lack what Nozick (1993) terms rational integration in my preference structures. Such a mismatched first-/second-order preference structure is one reason why humans are often less rational than bees in an axiomatic sense (see Stanovich 2004, pp. 243-247). This is because the struggle to achieve rational integration can destabilize first-order preferences in ways that make them more prone to the context effects that lead to the violation of the basic axioms of utility theory (see Lee, Amir, & Ariely 2009).
The struggle for rational integration is also what contributes to the feeling of alienation that people in the modern world often feel when contemplating the choices that they have made. People easily detect when their high-order preferences conflict with the choices actually made.
Of course, there is no limit to the hierarchy of higher-order desires that might be constructed. But the representational abilities of humans may set some limits — certainly three levels above seems a realistic limit for most people in the nonsocial domain (Dworkin 1988). However, third-order judgments can be called upon to help achieve rational integration at lower levels. So, for example, imagine that John is a smoker. He might realize the following when he probes his feelings: He prefers his preference to prefer not to smoke over his preference for smoking.
We might in this case say that John's third-order judgment has ratified his second-order evaluation. Presumably this ratification of his second-order judgment adds to the cognitive pressure to change the first-order preference by taking behavioral measures that will make change more likely (entering a smoking cessation program, consulting his physician, staying out of smoky bars, etc.).
On the other hand, a third-order judgment might undermine the second-order preference by failing to ratify it: John might prefer to smoke more than he prefers his preference to prefer not to smoke.
In this case, although John wishes he did not want to smoke, the preference for this preference is not as strong as his preference for smoking itself. We might suspect that this third-order judgment might not only prevent John from taking strong behavioral steps to rid himself of his addiction, but that over time it might erode his conviction in his second-order preference itself, thus bringing rational integration to all three levels.
Typically, philosophers have tended to bias their analyses toward the highest level desire that is constructed — privileging the highest point in the regress of higher-order evaluations, using that as the foundation, and defining it as the true self. Modern cognitive science would suggest instead a Neurathian project in which no level of analysis is uniquely privileged. Philosopher Otto Neurath... employed the metaphor of a boat having some rotten planks. The best way to repair the planks would be to bring the boat ashore, stand on firm ground, and replace the planks. But what if the boat could not be brought ashore? Actually, the boat could still be repaired but at some risk. We could repair the planks at sea by standing on some of the planks while repairing others. The project could work — we could repair the boat without being on the firm foundation of ground. The Neurathian project is not guaranteed, however, because we might choose to stand on a rotten plank. For example, nothing in Frankfurt's (1971) notion of higher-order desires guarantees against higher-order judgments being infected by memes... that are personally damaging.
Preferences are important both for rationality and for Friendly AI, so preferences are a major topic of discussion on Less Wrong. We've discussed preferences in the context of economics and decision theory, but I think AI has a more robust set of tools for working with preferences than either economics or decision theory has, so I'd like to introduce Less Wrong to some of these tools. In particular, I think AI's toolset for working with preferences may help us think more clearly about CEV.
In AI, we can think of working with preferences in four steps:
- Preference acquisition: In this step, we aim to extract preferences from a user. This can occur either by preference learning or by preference elicitation. Preference learning occurs when preferences are acquired from data about the user's past behavior or past preferences. Preference elicitation occurs as a result of an interactive process with the user, e.g. a question-answer process.
- Preference modeling: Our next step is to mathematically express these acquired preferences as preferences between pairwise choices. The properties of a preference model are important. For example, is the relation transitive? (If the model tells us that choice c1 is preferred to c2, and c2 is preferred to c3, can we conclude that c1 is preferred to c3?) And is the relation complete? (Is any choice comparable to any other choice, or are there some incomparabilities?)
- Preference representation: Assuming we want to capture and manipulate the user's preferences robustly, we'll next want to represent the preferences model in a preference representation language.
- Preference reasoning: Once a user's preferences are represented in a preference representation language, we can do cool things like preference aggregation (involving the preferences of multiple agents) and preference revision (a user's new preferences being added to her old preferences). We can also perform the usual computations of decision theory, game theory, and more.
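As a toy illustration of the modeling step above, here is a minimal Python sketch that checks transitivity and completeness of a pairwise preference relation. The choices and preferences are made up:

```python
# Check two properties of a preference model: given pairwise preferences
# as a set of (preferred, dispreferred) pairs, test whether the relation
# is transitive and complete over a set of choices.

prefers = {("c1", "c2"), ("c2", "c3"), ("c1", "c3")}
choices = ["c1", "c2", "c3"]

def is_transitive(rel):
    # Every a > b and b > c must imply a > c.
    return all((a, c) in rel
               for (a, b) in rel for (b2, c) in rel if b == b2)

def is_complete(rel, items):
    # Every pair of distinct choices must be comparable one way or the other.
    return all((a, b) in rel or (b, a) in rel
               for i, a in enumerate(items) for b in items[i+1:])

print(is_transitive(prefers), is_complete(prefers, choices))  # True True
```

Dropping the ("c1", "c3") pair makes the relation both intransitive and incomplete, which is exactly the kind of defect a preference model has to surface.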
Preference learning is typically an application of supervised machine learning (classification). Throw the algorithm at a database containing a user's preferences, and it will learn that user's preferences and make predictions about the preferences not listed in the database, including preferences about pairwise choices the user may never have faced before.
Preference elicitation involves asking a user a series of questions, and extracting their preferences from the answers they give. Chen & Pu (2004) survey some of the methods used for this.
In studying CEV, I am interested in methods built for learning a user's utility function from inconsistent behavior (because humans make inconsistent choices). Nielsen & Jensen (2004) provided two computationally tractable algorithms which handle the problem by interpreting inconsistent behavior as random deviations from an underlying "true" utility function. As far as I know, however, nobody in AI has tried to solve the problem with an algorithm informed by the latest data from neuroeconomics on how human choice is the product of at least three valuation systems, only one of which looks anything like an "underlying true utility function."
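To be clear, what follows is not Nielsen & Jensen's algorithm — just a toy sketch of the general idea: treat inconsistent choices as noisy comparisons and recover an approximate utility ordering from win rates. The observed choices are invented:

```python
# Toy illustration of learning utilities from inconsistent choices:
# each observation is (chosen, rejected); an option's score is the
# fraction of its appearances in which it was chosen.
from collections import Counter

observed = [("bagel", "soup"), ("bagel", "soup"), ("soup", "bagel"),
            ("bagel", "donut"), ("soup", "donut"), ("donut", "soup")]

wins = Counter(w for w, _ in observed)
trials = Counter()
for w, l in observed:
    trials[w] += 1
    trials[l] += 1

utility = {o: wins[o] / trials[o] for o in trials}
ranking = sorted(utility, key=utility.get, reverse=True)
print(ranking)  # bagel first, despite the occasional contrary choice
```

Real approaches do much better than raw win rates (e.g. maximum-likelihood fits of a noise model around a latent utility function), but the shape of the problem is the same.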
A model of a user's preferences describes one of three relations between any two choices ("objects"): a strict preference relation which says that one choice is preferred to another, an indifference relation, and an incomparability relation. Kaci (2011), chapter 2 provides a brief account of preference modeling.
In decision theory, a preference relation is represented by a numerical function which associates a utility value with each choice. But this may not be the best representation. We face an exponential number of choices whose explicit enumeration and evaluation is time-consuming. Moreover, users can't compare all pairwise choices and evaluate how satisfactory each choice is.
Luckily, choices are often made on the basis of a set of attributes, e.g. cost, color, price, etc. You can use a preference representation language to represent partial descriptions of preferences and rank-order possible choices. The challenge of a preference representation language is that it should (1) cope with a user's preferences, (2) faithfully represent the user's preferences such that it rank-orders choices in a way similar to how the user would specify choices if they were able to provide preferences for every pairwise comparison, (3) cope with possibly inconsistent preferences, and (4) offer attractive complexity properties, i.e. the spatial cost of representing partial descriptions of preferences and the time cost of comparing pairwise choices or computing the best choices.
One popular method of preference representation is with the graphical representation language of conditional preference networks or "CP-nets." They look like this.
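The linked picture isn't reproduced here, but the flavor of a CP-net can be sketched in a few lines: each variable carries a preference order conditioned on the values of its parent variables. This is a hypothetical hand-rolled structure, not a real CP-net library:

```python
# A toy CP-net-style representation: each variable's preference order
# over its values depends on the values of its parent variables.

cpnet = {
    "main":  {"parents": (),        "order": {(): ["soup", "bagel"]}},
    "drink": {"parents": ("main",), "order": {("soup",):  ["water", "wine"],
                                              ("bagel",): ["wine", "water"]}},
}

def preferred(var, assignment):
    # Look up this variable's preference order given its parents' values.
    key = tuple(assignment[p] for p in cpnet[var]["parents"])
    return cpnet[var]["order"][key][0]

# Sweep through the variables in an order consistent with the parent
# structure, picking each variable's most-preferred value in context.
choice = {"main": preferred("main", {})}
choice["drink"] = preferred("drink", choice)
print(choice)  # {'main': 'soup', 'drink': 'water'}
```

The conditional tables are what make CP-nets compact: preferences over drinks need only be stated per value of "main", not per complete outcome.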
There are a multitude of ways in which one might want to reason algorithmically about preferences. I point the reader to Part II of Kaci (2011) for a very incomplete overview.
Domshlak et al. (2011). Preferences in AI: An Overview. Artificial Intelligence 175: 1037-1052.
Fürnkranz & Hüllermeier (2010). Preference Learning. Springer.
Kaci (2011). Working with Preferences: Less is More. Springer.
I've recently been thinking about future job prospects and ways that I might alter my preferences to increase the likelihood that I'll be happy with my future career. I have read some of the LessWrong resources about this issue, but they don't seem to address my particular concerns. I think there is a high relative importance in selecting a career with a high capacity for making me happy. It will consume at least 8 prime daylight hours of my work days, and in many cases also some of the weekend. In all likelihood I will also be forced to sit in front of a computer for extended periods of time. The tasks I am assigned may have nothing to do with the things that I happen to find intellectually interesting or of ethical importance. And the work will likely sap me of most of the energy that I could use to pursue hobbies or other more "intrinsically worthwhile endeavors" (intrinsic to my personal preference ordering). Given that I believe these factors will largely determine whether I feel happy in many future situations, and also whether I feel generically happy about the content of my life as a whole, I think it is worthwhile to seek advice from other rationalists on how to choose an appropriate career goal and take steps to pursue it.
What I have found on LessWrong, however, is that ambiguous and open-ended pleas for advice generally steer off course, even if the tangential issues are very interesting and insightful. Rather than query everyone for open advice about preference hacking, vague goal achievement, and wisdom for properly assigning value to some of the factors I have listed above, I propose a simpler informal job survey.
If you are interested, please briefly list the job you have or the job of someone you know very well (well enough that you feel you know relevant details about the job, details that may be hard to gather in less than 1 hour of internet searching). You don't have to reveal the location or name of the employer or anything like that, just the type of job. Optionally, please also include a sentence stating whether you (or your friend, etc.) seem to enjoy the job and why. For example, my entry would be like this:
I am a graduate student studying applied mathematics. I enjoy the access to educational resources and the flexible schedule that my current job offers, but I think my personal displeasure with computer programming and my perception that future jobs doing mathematical theory are scarce cause me to dislike the job overall.
If enough people are willing to participate, my hope is that the stream of small anecdotal remarks will serve as a brainstorming session. I hope to hear about jobs I may never have thought of, and also reasons for liking or disliking a job that I may never have thought of. The goal is to spark additional search on my own and also to gauge my current preferences in light of preferences that others have experienced with specific jobs. Such a survey would be a very helpful resource allowing me to synthesize data about job directions where the initial search will have a higher probability of being helpful for me.
I was inspired by the recent post discussing self-hacking for the purpose of changing a relationship perspective to achieve a goal. Despite my feeling inspired, though, I also felt like life hacking was not something I could ever want to do even if I perceived benefits to doing it. It seems to me that the place where I would need to begin is hacking myself in order to cause myself to want to be hacked. But then I started contemplating whether this is a plausible thing to do.
In my own case, there are two concrete examples in mind. I am a graduate student working on applied math and probability theory in the field of machine vision. I was one of those bright-eyed, bushy-tailed dolts as an undergrad who just sort of floated to grad school believing that as long as I worked sufficiently hard, it was a logical conclusion that I would get a tenure-track faculty position at a desirable university. Even though I am a fellowship award winner and I am working with a well-known researcher at an Ivy League school, my experience in grad school (along with some noted articles) has forced me to re-examine a lot of my priorities. Tenure-track positions are just too difficult to achieve, and achieving them is based on networking, politics, and whether the popularity of your research happens to have a peak at the same time that your productivity in that area also has a peak.
But the alternatives that I see are: join the consulting/business/startup world, become a programmer/analyst for a large software/IT/computer company, work for a government research lab. I worked for two years at MIT's Lincoln Laboratory as a radar analyst and signal processing algorithm developer prior to grad school. The main reason I left that job was because I (foolishly) thought that graduate school was where someone goes to specifically learn the higher-level knowledge and skills to do theoretical work that transcends the software development / data processing work that is so common. I'm more interested in creating tools that go into the toolbox of an engineer than with actually using those tools to create something that people want to pay for.
I have been deeply thinking about these issues for more than two years now, almost every day. I read everything that I can and I try to be as blunt and to-the-point about it as I can be. Future career prospects seem bleak to me. Everyone is getting crushed by data right now. I was just talking with my adviser recently about how so much of the mathematical framework for studying vision over the last 30 years is just being flushed down the tubes because of the massive amount of data processing and large scale machine learning we can now tractably perform. If you want to build a cup-detector, for example, you can do lots of fancy modeling, stochastic texture mapping, active contour models, fancy differential geometry, occlusion modeling, etc. Or... you can just train an SVM on 50,000,000 weakly labeled images of cups you find on the internet. And that SVM will utterly crush the performance of the expert system based on 30 years of research from amazing mathematicians. And this crushing effect only stands to get much much worse, and at an increasing pace.
In light of this, it seems to me that I should be learning as much as I can about large-scale data processing, GPU computing, advanced parallel architectures, and the gross details of implementing bleeding edge machine learning. But, currently, this is exactly the sort of thing I hate and went to graduate school to avoid. I wanted to study Total Variation minimization, or PDE-driven diffusion models in image processing, etc. And these are things that are completely crushed by large data processing.
So anyway, long story short: suppose that I really like "math theory and teaching at a respected research university" but I see the coming data steamroller and believe that this preference will cause me to feel unhappy in the future when many other preferences I have (and some I don't yet know about) are affected negatively by pursuit of a phantom tenure-track position. But suppose also that another preference I have is that I really hate "writing computer code to build widgets for customers," which can include large scale data analyses, and thus I feel an aversion to even trying to *want* to hack myself and orient myself to a more practical career goal.
How does one hack one's self to change one's preferences when the preference in question is "I don't want to hack myself?"
Some people see never-existed people as moral agents, and claim that we can talk about their preferences. Generally this means their personal preference for existing versus non-existing. Formulations such as "it is better for someone to have existed than not" reflect this way of thinking.
But if the preferences of the never-existed are relevant, then their non-personal preferences are also relevant. Do they prefer a blue world or a pink one? Would they want us to change our political systems? Would they want us to not bring into existence some never-existent people they don't like?
It seems that those who are advocating bringing never-existent people into being in order to satisfy those people's preferences should be focusing their attention on their non-personal preferences instead. After all, we can only bring into being so many trillions of trillions of trillions; but there is no theoretical limit to the number of never-existent people whose non-personal preferences we can satisfy. Just get some reasonable measure across the preferences of never-existent people, and see if there's anything that sticks out from the mass.