Less Wrong is a community blog devoted to refining the art of human rationality.

Applying utility functions to humans considered harmful

25 Post author: Kaj_Sotala 03 February 2010 07:22PM

There's a lot of discussion on this site that seems to be assuming (implicitly or explicitly) that it's meaningful to talk about the utility functions of individual humans. I would like to question this assumption.

To clarify: I don't dispute that you could, in principle, model a human's preferences by building some insanely complex utility function. But there are infinitely many ways to model a human's preferences. The question is which model is the most useful, and which models carry the fewest underlying assumptions that will lead your intuitions astray.

Utility functions are a good model to use if we're talking about designing an AI. We want an AI to be predictable, to have stable preferences, and to do what we want. Utility functions are also a good tool for building agents that are immune to Dutch book tricks. They are a bad model for beings that do not meet these criteria.

To quote Van Gelder (1995):

Much of the work within the classical framework is mathematically elegant and provides a useful description of optimal reasoning strategies. As an account of the actual decisions people reach, however, classical utility theory is seriously flawed; human subjects typically deviate from its recommendations in a variety of ways. As a result, many theories incorporating variations on the classical core have been developed, typically relaxing certain of its standard assumptions, with varying degrees of success in matching actual human choice behavior.

Nevertheless, virtually all such theories remain subject to some further drawbacks:

(1) They do not incorporate any account of the underlying motivations that give rise to the utility that an object or outcome holds at a given time.
(2) They conceive of the utilities themselves as static values, and can offer no good account of how and why they might change over time, and why preferences are often inconsistent and inconstant.
(3) They offer no serious account of the deliberation process, with its attendant vacillations, inconsistencies, and distress; and they have nothing to say about the relationships that have been uncovered between time spent deliberating and the choices eventually made.

Curiously, these drawbacks appear to have a common theme; they all concern, one way or another, temporal aspects of decision making. It is worth asking whether they arise because of some deep structural feature inherent in the whole framework which conceptualizes decision-making behavior in terms of calculating expected utilities.

One model that attempts to capture actual human decision making better is called decision field theory. (I'm no expert on this theory, having encountered it two days ago, so I can't vouch for how good it actually is. Still, even if it's flawed, it's useful for getting us to think about human preferences in what seems to be a more realistic way.) Here's a brief summary of how it's constructed from traditional utility theory, based on Busemeyer & Townsend (1993). See the article for the mathematical details, fuller justifications, and the specific failures of classical rationality that each stage explains.

Stage 1: Deterministic Subjective Expected Utility (SEU) theory. Basically classical utility theory. Suppose you can choose between two different alternatives, A and B. If you choose A, there is a payoff of 200 utilons with probability S1, and a payoff of -200 utilons with probability S2. If you choose B, the payoffs are -500 utilons with probability S1 and +500 utilons with probability S2. You'll choose A if the expected utility of A, S1 * 200 + S2 * (-200), is higher than the expected utility of B, S1 * (-500) + S2 * 500, and B otherwise.
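
As a minimal sketch of the Stage 1 rule: the function names and the example probabilities (0.7 and 0.3) are my own illustrative assumptions, not something taken from Busemeyer & Townsend.

```python
def expected_utility(payoffs, probabilities):
    """Probability-weighted sum of payoffs over the possible events."""
    return sum(p * u for p, u in zip(probabilities, payoffs))


def choose(s1, s2):
    """Deterministic SEU: pick A if its expected utility is higher, else B."""
    eu_a = expected_utility([200, -200], [s1, s2])
    eu_b = expected_utility([-500, 500], [s1, s2])
    return "A" if eu_a > eu_b else "B"


# Hypothetical probabilities for the two events.
print(choose(0.7, 0.3))  # A: EU(A) is about 80, EU(B) is about -200
print(choose(0.3, 0.7))  # B: EU(A) is about -80, EU(B) is about 200
```

Given the same probabilities, this agent makes the same choice every time, which is exactly the property the later stages relax.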

Stage 2: Random SEU theory. In stage 1, we assumed that the probabilities S1 and S2 stay constant across many trials. Now, we assume that sometimes the decision maker might focus on S1, producing a preference for action A. On other trials, the decision maker might focus on S2, producing a preference for action B. According to random SEU theory, the attention weight for variable Si is a continuous random variable, which can change from trial to trial because of attentional fluctuations. Thus, the SEU for each action is also a random variable, called the valence of an action. Deterministic SEU is a special case of random SEU, one where the trial-by-trial fluctuation of valence is zero.
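
A rough sketch of what a fluctuating attention weight does to choices. The Beta distribution, the `concentration` parameter and the mean weight of 0.7 are assumptions I picked for illustration; the theory only requires that the weight be a continuous random variable.

```python
import random

PAYOFFS_A = (200, -200)   # payoffs of action A under events S1, S2
PAYOFFS_B = (-500, 500)   # payoffs of action B under events S1, S2


def sample_weight(mean, concentration, rng):
    """Draw the attention weight for S1 on one trial.

    Assumption: a Beta distribution with the given mean. A larger
    concentration means smaller trial-to-trial fluctuation; None means
    no fluctuation at all (the deterministic special case).
    """
    if concentration is None:
        return mean
    return rng.betavariate(mean * concentration, (1 - mean) * concentration)


def choose_once(mean_w1, concentration, rng):
    """One trial of random SEU: compare the valences of A and B."""
    w1 = sample_weight(mean_w1, concentration, rng)
    w2 = 1 - w1
    valence_a = w1 * PAYOFFS_A[0] + w2 * PAYOFFS_A[1]
    valence_b = w1 * PAYOFFS_B[0] + w2 * PAYOFFS_B[1]
    return "A" if valence_a > valence_b else "B"


def fraction_choosing_a(mean_w1, concentration, trials=10_000, seed=0):
    """Share of trials on which A is chosen."""
    rng = random.Random(seed)
    picks = [choose_once(mean_w1, concentration, rng) for _ in range(trials)]
    return picks.count("A") / trials


print(fraction_choosing_a(0.7, None))  # no fluctuation: always A
print(fraction_choosing_a(0.7, 10))    # fluctuating attention: mostly A, sometimes B
```

With the fluctuation switched off the model collapses back to Stage 1; with it switched on, the same person facing the same gamble sometimes picks B.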

Stage 3: Sequential SEU theory. In stage 2, we assumed that one's decision was based on just one sample of a valence difference on any trial. Now, we allow a sequence of one or more samples to be accumulated during the deliberation period of a trial. The attention of the decision maker shifts between different anticipated payoffs, accumulating weight to the different actions. Once the weight of one of the actions reaches some critical threshold, that action is chosen. Random SEU theory is a special case of sequential SEU theory, where the number of samples is one.

Consider a scenario where you're trying to make a very difficult, but very important decision. In that case, your inhibitory threshold for any of the actions is very high, so you spend a lot of time considering the different consequences of the decision before finally arriving at the (hopefully) correct decision. For less important decisions, your inhibitory threshold is much lower, so you pick one of the choices without giving it too much thought.
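
The accumulate-until-threshold idea can be sketched as follows. As before, the Beta-distributed attention weight and the specific thresholds are illustrative assumptions rather than values from the article.

```python
import random


def valence_difference(rng, mean_w1=0.7, concentration=10):
    """One sample of V_A - V_B under a momentary attention weight.

    Assumption: the attention weight for S1 is Beta-distributed, and the
    payoffs are those of the running example (A: +200/-200, B: -500/+500).
    """
    w1 = rng.betavariate(mean_w1 * concentration, (1 - mean_w1) * concentration)
    w2 = 1 - w1
    valence_a = w1 * 200 + w2 * -200
    valence_b = w1 * -500 + w2 * 500
    return valence_a - valence_b


def deliberate(threshold, rng):
    """Accumulate valence differences until one action crosses the threshold.

    Returns the chosen action and the number of samples taken.
    """
    preference = 0.0   # positive favours A, negative favours B
    samples = 0
    while True:
        preference += valence_difference(rng)
        samples += 1
        if abs(preference) >= threshold:
            return ("A" if preference > 0 else "B"), samples


def mean_samples(threshold, trials=2_000, seed=0):
    """Average number of samples needed to reach a decision."""
    rng = random.Random(seed)
    return sum(deliberate(threshold, rng)[1] for _ in range(trials)) / trials


print(mean_samples(0))      # threshold 0: one sample per decision (random SEU)
print(mean_samples(500))    # low threshold: quick decisions
print(mean_samples(5000))   # high threshold: much longer deliberation
```

Raising the threshold is the model's way of saying "this decision matters, think longer", and a threshold of zero recovers Stage 2.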

Stage 4: Random Walk SEU theory. In stage 3, we assumed that we begin to consider each decision from a neutral point, without any of the actions being the preferred one. Now, we allow prior knowledge or experiences to bias the initial state. The decision maker may recall previous preference states that are influenced in the direction of the mean difference. Sequential SEU theory is a special case of random walk theory, where the initial bias is zero.

Under this model, decisions favoring the status quo tend to be chosen more frequently under a short time limit (low threshold), but a superior decision is more likely to be chosen as the threshold grows. Also, if previous outcomes have already biased decision A very strongly over B, then the mean time to choose A will be short while the mean time to choose B will be long.
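
Here is a small simulation of that status-quo effect. Everything numerical in it is a made-up assumption: the valence differences are Gaussian with a small positive mean (so A is the better action on average), and the starting point is biased toward B, standing in for the status quo.

```python
import random


def decide(threshold, initial_bias, rng, drift=0.1, noise=1.0):
    """Random walk from a biased starting point to one of two thresholds.

    Positive preference favours A, negative favours B. Assumption: each
    sampled valence difference is Gaussian with mean `drift` (A is the
    better action on average) and standard deviation `noise`.
    Returns the chosen action and the number of samples taken.
    """
    preference = initial_bias
    steps = 0
    while abs(preference) < threshold:
        preference += rng.gauss(drift, noise)
        steps += 1
    return ("A" if preference > 0 else "B"), steps


def fraction_choosing(action, threshold, initial_bias, trials=1_000, seed=0):
    """Share of simulated decisions that end in `action`."""
    rng = random.Random(seed)
    picks = [decide(threshold, initial_bias, rng)[0] for _ in range(trials)]
    return picks.count(action) / trials


# Hypothetical setup: past experience biases the start toward B (the status
# quo), although A has the higher mean valence.
print(fraction_choosing("B", threshold=2, initial_bias=-1.5))   # short deliberation
print(fraction_choosing("A", threshold=20, initial_bias=-1.5))  # long deliberation
```

With the low threshold the walk usually hits the nearby B boundary before the drift has time to matter; with the high threshold the initial bias gets swamped and A wins most of the time. Setting `initial_bias=0` gives back the Stage 3 model.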

Stage 5: Linear System SEU theory. In stage 4, we assumed that previous experiences all contribute equally. Now, we allow the impact of a valence difference to vary depending on whether it occurred early or late (a primacy or recency effect). Each previous experience is assigned a weight determined by a growth-decay rate parameter. Random walk SEU theory is a special case of linear system SEU theory, where the growth-decay rate is set to zero.
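
To make the primacy/recency idea concrete, here is a sketch that assumes a simple one-parameter update, P(n) = (1 - s) * P(n-1) + V(n), with s as the growth-decay rate. I believe this matches the general shape of the linear system stage, but treat the exact form as my assumption and check the article for the real equations.

```python
def preference_after(valence_differences, growth_decay_rate):
    """Linear-system update: P(n) = (1 - s) * P(n-1) + V(n).

    Assumption: this single-parameter form, with s the growth-decay rate.
    s = 0 gives a plain sum (the random walk of stage 4); 0 < s < 1 makes
    old samples fade (recency); s < 0 makes them grow (primacy).
    """
    preference = 0.0
    for v in valence_differences:
        preference = (1 - growth_decay_rate) * preference + v
    return preference


early = [1, 0, 0]   # the only nonzero sample comes first
late = [0, 0, 1]    # the only nonzero sample comes last

print(preference_after(early, 0.0), preference_after(late, 0.0))    # 1.0 1.0
print(preference_after(early, 0.5), preference_after(late, 0.5))    # 0.25 1.0
print(preference_after(early, -0.5), preference_after(late, -0.5))  # 2.25 1.0
```

With s = 0 it doesn't matter when a sample arrived; with a positive s the early sample has mostly faded by the end, and with a negative s it has been amplified.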

Stage 6: Approach-Avoidance Theory. In stage 5, we assumed that, for example, the average amount of attention given to the payoff (+500) only depended on event S2. Now, we allow the average weight to be affected by another variable, called the goal gradient. The basic idea is that the attractiveness of a reward or the aversiveness of a punishment is a decreasing function of distance from the point of commitment to an action. If there is little or no possibility of taking an action, its consequences are ignored; as the possibility of taking an action increases, the attention to its consequences increases as well. Linear system theory is a special case of approach-avoidance theory, where the goal gradient parameter is zero.

There are two different goal gradients, one for gains and rewards and one for losses or punishments. Empirical research suggests that the gradient for rewards tends to be flatter than that for punishments. One of the original features of approach-avoidance theory was the distinction between rewards versus punishments, closely corresponding to the distinction of positively versus negatively framed outcomes made by more recent decision theorists.
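
A toy illustration of two gradients with different slopes. The linear shape and all the numbers here are hypothetical; the only thing taken from the text is that the reward gradient is flatter than the punishment gradient.

```python
def reward_pull(distance, peak=400.0, slope=20.0):
    """Attractiveness of a gain; falls off slowly with distance."""
    return max(0.0, peak - slope * distance)


def punishment_push(distance, peak=500.0, slope=50.0):
    """Aversiveness of a loss; falls off steeply with distance."""
    return max(0.0, peak - slope * distance)


def net_attraction(distance):
    """Positive means approach, negative means avoid."""
    return reward_pull(distance) - punishment_push(distance)


# Distance from the point of commitment, from far away down to zero.
for distance in (10, 5, 2, 0):
    print(distance, net_attraction(distance))
```

Under these assumed numbers the option looks attractive from far away, because the punishment is barely attended to, and turns aversive as the point of commitment gets close.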

Stage 7: Decision Field Theory. In stage 6, we assumed that the time taken to process each sampling is the same. Now, we allow this to change by introducing into the theory a time unit h, representing the amount of time it takes to retrieve and process one pair of anticipated consequences before shifting attention to another pair of consequences. If h is allowed to approach zero in the limit, the preference state evolves in an approximately continuous manner over time. Approach-avoidance is a spe... you get the picture.
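
One way to see what shrinking h does is to simulate the walk with smaller and smaller time steps. The scaling below (mean increment proportional to h, standard deviation proportional to the square root of h) is the usual diffusion-style assumption and is my choice for this sketch, as are the drift, noise and threshold values.

```python
import random


def decision_time(h, rng, drift=1.0, noise=1.0, threshold=1.0):
    """Time until the preference state first crosses +/- threshold.

    Assumption: a diffusion-style scaling, where each step of length h adds
    a Gaussian increment with mean drift * h and standard deviation
    noise * sqrt(h), so that the process has a sensible limit as h -> 0.
    """
    preference = 0.0
    elapsed = 0.0
    while abs(preference) < threshold:
        preference += rng.gauss(drift * h, noise * h ** 0.5)
        elapsed += h
    return elapsed


def mean_decision_time(h, trials=500, seed=0):
    """Average simulated decision time for a given time unit h."""
    rng = random.Random(seed)
    return sum(decision_time(h, rng) for _ in range(trials)) / trials


for h in (0.1, 0.02, 0.005):
    print(h, mean_decision_time(h))
```

The number of steps per decision grows as h shrinks, but the elapsed time stays in the same ballpark, which is what lets the theory talk about deliberation time rather than just sample counts.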

Now, you could argue that all of the steps above are just artifacts of being a bounded agent without enough computational resources to calculate all the utilities precisely. And you'd be right. And maybe it's meaningful to talk about the "utility function of humanity" as the outcome that occurs when a CEV-like entity calculates what we'd decide if we could collapse Decision Field Theory back into Deterministic SEU Theory. Or maybe you just say that all of this is low-level mechanical stuff that gets included in the "probability of outcome" computation of classical decision theory. But which approach do you think gives us more useful conceptual tools in talking about modern-day humans?

You'll also note that even DFT (or at least the version of it summarized in a 1993 article) assumes that the payoffs themselves do not change over time. Attentional considerations might lead us to attach a low value to some outcome, but if we were to actually end up in that outcome, we'd always value it the same amount. This we know to be untrue. There's probably some even better way of looking at human decision making, one which I suspect might be very different from classical decision theory.

So be extra careful when you try to apply the concept of a utility function to human beings.

Comments (114)

Comment author: Kaj_Sotala 03 February 2010 07:25:27PM 1 point [-]

Question: do people think this post was too long? In the beginning, I thought that it would be a good idea to give a rough overview of DFT to give an idea of some of the ways by which pure utility functions could be made more reflective of actual human behavior. Near the end, though, I was starting to wonder if it would've been better to just sum it up in, say, three paragraphs.

Comment author: Nick_Tarleton 03 February 2010 07:42:42PM *  2 points [-]

I do think that it's longer than necessary, and that the central point as stated in the title is far more important than the details of the seven theories. Still, I wish I could upvote it more than once, since that central point is really important. (Or at least it really annoys me when people talk as if humans did have utility functions.)

Comment author: djcb 03 February 2010 09:33:19PM -1 points [-]

Agreed, but I'd say that people do have a utility function -- it's just that it may be so complex that it's better seen as a kind of metaphor than as a mathematical construct you can actually do something with.

I share your annoyance -- there seems to be a bias among some to use maths-derived language where it is not very helpful.

Comment author: RichardKennaway 05 February 2010 12:02:11AM -2 points [-]

If utility isn't a mathematical construct you can do something with, then it's an empty concept.

Comment author: djcb 06 February 2010 02:24:54PM 0 points [-]

You might still be able to determine a manageable utility function for a lower animal. For humans it's simply too complex -- at least in 2010, just like the function that predicts next week's weather.

Comment author: RichardKennaway 06 February 2010 06:50:13PM 0 points [-]

You might still be able to determine a manageable utility function for a lower animal.

I will believe this only when I see it done.

I do not expect to see it done, no matter how low the animal.

Comment author: Splat 03 February 2010 07:56:00PM 1 point [-]

I found the detail helpful. Even more detail might have been good, but you'd have had to write a sequence.

Comment author: Dagon 03 February 2010 09:30:42PM 1 point [-]

I found it a bit long. I wish you'd done both: a short description followed by more detail.

Comment author: Bo102010 04 February 2010 02:40:39AM 0 points [-]

Not too long. The buildup between the theories was key in keeping my attention.

Comment author: cousin_it 03 February 2010 07:50:48PM *  2 points [-]

There are many alternatives to expected utility if you want to model actual humans. For example, Kahneman and Tversky's prospect theory. The Wikipedia page for Expected utility hypothesis contains many useful links.

Comment author: pjeby 03 February 2010 07:50:49PM 1 point [-]

Curiously, these drawbacks appear to have a common theme; they all concern, one way or another, temporal aspects of decision making.

Ainslie and Powers are certainly two who've taken up this question; Ainslie from the perspective of discounted prediction, and Powers from the perspective of correcting time-averaged perceptions.

I think both are required to fully understand human decisionmaking. Powers fills in the gap of Ainslie's vague notion of "appetites", while Ainslie fills in for the lack of any sort of foresight or prediction in Powers' model.

IOW, I think human beings derive "motivation to act" ("appetite" in Ainslie's terms) from the difference between the current value and the reference value of a time-averaged measurement (per Powers), but choose which action to take, based on a hyperbolically-discounted prediction of how their actions will affect the variable whose value is being adjusted (per Ainslie).

This combination of two non-timeless ways of measuring "utility" seems to much better describe what humans actually do.
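
For readers who haven't met hyperbolic discounting, a minimal sketch of why it is "non-timeless". The formula amount / (1 + k * delay) is the common one-parameter form; the value of k and the two rewards are hypothetical, and this is not meant as a rendering of Ainslie's or Powers' full models.

```python
def hyperbolic_value(amount, delay, k=1.0):
    """Present value under hyperbolic discounting: amount / (1 + k * delay).

    Assumption: this one-parameter form and k = 1.0 are illustrative choices.
    """
    return amount / (1 + k * delay)


def preferred(small_soon, large_late, wait):
    """Compare two (amount, delay) rewards as seen `wait` time units earlier."""
    v_small = hyperbolic_value(small_soon[0], small_soon[1] + wait)
    v_large = hyperbolic_value(large_late[0], large_late[1] + wait)
    return "small-soon" if v_small > v_large else "large-late"


small_soon = (50, 1)    # hypothetical: 50 units after 1 time step
large_late = (100, 5)   # hypothetical: 100 units after 5 time steps

print(preferred(small_soon, large_late, wait=10))  # large-late
print(preferred(small_soon, large_late, wait=0))   # small-soon
```

The same pair of rewards is ranked differently depending on when the comparison is made, which a single fixed utility function over outcomes does not capture.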

Comment author: RichardKennaway 04 February 2010 11:57:35PM -1 points [-]

Ainslie and Powers are certainly two who've taken up this question; Ainslie from the perspective of discounted prediction, and Powers from the perspective of correcting time-averaged perceptions.

Presumably this Ainslie. But if Powers is William (PCT) Powers then I don't know what you're referring to by "correcting time-averaged perceptions".

Comment author: Eliezer_Yudkowsky 03 February 2010 08:52:06PM 6 points [-]

Research like this seems very hopeful to me. It breaks down into a nice component describing what people actually want and a lot of other components describing shifts of attention and noise. If anything, that seems too optimistic compared to, say, prospect theory, in which the basic units of motivation are shifts from a baseline and there's no objective baseline or obvious way to translate shift valuations into fixed-level valuations.

Comment author: mattnewport 04 February 2010 07:40:06AM 1 point [-]

I'm a little surprised you haven't commented on the randomization aspects of this model. As you've convincingly argued, if your intention is accurate prediction then you can't improve your results by introducing randomness into your model. This model claims to improve its accuracy by introducing randomness in steps 2 and 4 which is a claim I am highly suspicious of after reading your sequence on the topic.

Comment author: prase 04 February 2010 08:38:25AM *  1 point [-]

Depends on what you want to predict. I throw dice and have a model which says that number 5 is the result, deterministically. Now I will be right in 1/6 of cases. If I am rewarded for each correct guess, then by introducing randomness into the model I will gain nothing - this is what Eliezer was arguing for. But if I am rewarded for correctly predicting the distribution of results after many throws, any random model is clearly superior to the five-only one.
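
The two reward schemes can be compared in a quick simulation. This is only a sketch: the random model is represented by the uniform distribution it induces, and total variation distance is my assumed stand-in for "correctly predicting the distribution".

```python
import random
from collections import Counter

rng = random.Random(0)
throws = [rng.randint(1, 6) for _ in range(60_000)]

# Reward 1: points for guessing each individual throw.
always_five_hits = sum(t == 5 for t in throws) / len(throws)
random_guess_hits = sum(t == rng.randint(1, 6) for t in throws) / len(throws)


# Reward 2: closeness of the predicted distribution to the observed one,
# measured here by total variation distance (0 is a perfect match).
def total_variation(predicted, observed):
    return 0.5 * sum(abs(predicted.get(f, 0) - observed.get(f, 0)) for f in range(1, 7))


counts = Counter(throws)
observed = {face: counts[face] / len(throws) for face in range(1, 7)}
five_only = {5: 1.0}
uniform = {face: 1 / 6 for face in range(1, 7)}

print(always_five_hits, random_guess_hits)   # both close to 1/6
print(total_variation(five_only, observed))  # close to 5/6
print(total_variation(uniform, observed))    # close to 0
```

On per-throw guessing the two models tie; on matching the distribution the five-only model is about as wrong as it can be.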

Comment author: mattnewport 04 February 2010 08:43:12AM 0 points [-]

The random model is better than the five-only one but a non-random model that directly predicts the distribution would be better still. If your goal is to predict the distribution then a model that does so by simulating random dice throws is inferior to one that simply predicts the distribution.

Comment author: prase 04 February 2010 08:51:52AM *  0 points [-]

And if you want to do both, i.e. predict both the individual throws and the overall distribution? The "model" which directly states that the distribution is uniform doesn't say anything about the individual events. Of course we can have a model which says that the sequence will be e.g. 1 4 2 5 6 3 2 5 1 6 4 3 and then repeated, or that the sequence will follow the decimal expansion of pi. Both these models predict the distribution correctly, but they seem to be more complex than the random one and moreover they can produce false predictions of correlations (like 5 is always preceded by 2 in the first case).

Or do I misunderstand you somehow?

Comment author: mattnewport 04 February 2010 05:07:38PM 0 points [-]

A model that uses a sequence is simpler than one that uses a random number, as anyone who has implemented a pseudo random number generator will tell you. PRNGs are generally either simple or good, rarely both.

Comment author: prase 05 February 2010 12:31:13PM *  2 points [-]

Depends on what hardware you have got. With a computer that has access to some quantum system (decaying nuclei, spin measurement in orthogonal directions), there is no need to specify in a complicated way the meaning of "random". Or, of course, there is no need for the randomness to be "fundamental", whatever that means. You can just as well throw dice (it would be a bit circular to use dice to explain dice, but it seems all right to use dice as the random generator for making predictions in economics).

Comment author: mattnewport 05 February 2010 05:23:41PM 0 points [-]

A hardware random number generator isn't part of an algorithm, it's an input to an algorithm. You can't argue that your model is algorithmically simpler by replacing part of the algorithm with a new input.

Comment author: prase 07 February 2010 07:27:19PM *  0 points [-]

So, should quantum mechanics be modified by removing the randomness from it?

Now, having a two level spin system in state ( |0> + |1> ) /sqrt[2], QM says that the result of measurement is random and so we'll find the particle in state |1> with probability 1/2.

A modified QM would say, that the first measurement reveals 1, the second (after recreating the original initial state, of course) 1, the third 0, etc., with sequence 110010010110100010101010010101011110010101...

I understand that you say that the second version of quantum mechanics would be simpler, and disagree.

Comment author: Kaj_Sotala 04 February 2010 10:33:16AM *  5 points [-]

The model doesn't incorporate randomness in the sense of saying "to predict the behavior of humans, roll a die and predict behavior X on a result of 1-3 and predict behavior Y on a result of 4-6", which is what Eliezer was objecting to. Instead, it says there is randomness involved in the subjects it's modeling, and says the behavior of the subjects can be best modeled using a certain (deterministically derived) probability distribution.

Comment author: mattnewport 04 February 2010 04:57:06PM 0 points [-]

Instead, it says there is randomness involved in the subjects it's modeling

Does it say that? I didn't get the impression they were making that claim. It seems highly likely to be false if they are. They model changes in attentional focus as a random variable but presumably those changes in attention are driven largely by complex events in the brain responding to complex features of the environment, not by random quantum fluctuation. They are using a random variable because the actual process is too complex to model and they have no better simple idea for how to model it than pure randomness.

Comment author: Kaj_Sotala 04 February 2010 05:27:33PM 0 points [-]

Well, yes, "so complex and chaotic that you might as well call it random" is what I meant. That's what's usually meant by the term - the results of dice rolls aren't mainly driven by quantum randomness either.

Comment author: mattnewport 04 February 2010 06:24:12PM 2 points [-]

Complex yes, chaotic I doubt. I'm reasonably confident that there is some kind of meaningful pattern to attentional shifts that is correlated with features of the environment and that is adaptive to improve outcomes in our evolutionary environment. Randomness in this model reflects a lack of sufficient information about the environment or the process that drives attention rather than a belief that attention shifts do not have a meaningful correlation with the environment.

Comment author: mattnewport 03 February 2010 09:42:17PM 7 points [-]

A model is not terribly useful if it does not do a better job of prediction than alternative models. (Micro)economics does quite a good job of predicting human behaviour based on a very simple model of predictable rationality. It is not clear to me that this model offers a better approach to making meaningful predictions about real world human behaviour. I've only skimmed the article but it appears the tests are limited to rather artificial lab tests. That's better than nothing but I'm skeptical that this model's real world predictive power justifies its complexity.

The 'true' utility function of any particular human is no doubt an intractable beast of a computation but don't be too quick to dismiss the value of assuming a much simpler utility function and assuming that people do a reasonably good job of (boundedly) optimizing for it.

Comment author: bgrah449 03 February 2010 09:54:51PM 1 point [-]

I think it's useful inasmuch as it turns "unknown unknowns" into "known unknowns." Knowing what you're ignoring in your approximation seems valuable.

Comment author: mattnewport 03 February 2010 10:29:17PM 1 point [-]

I think they are claiming that their model more closely matches observed behaviour in certain specific controlled environments. It is a big leap from there to assume that the features of the model map in any useful way to actual features of human reasoning.

Comment author: Kaj_Sotala 04 February 2010 10:40:23AM 0 points [-]

Yes, I admit that it can sometimes be useful to think of humans as having utility functions, and this can be a useful model. I should have said that in the post, now that you mention it. But one should then always keep in mind that that's just a simplified model that's appropriate for certain situations, not something that can be indiscriminately employed in every case.

Comment author: timtyler 03 February 2010 10:28:04PM *  -1 points [-]

It seems simple to convert any computable agent-based input-transform-output model into a utility-based model - provided you are allowed utility functions with Turing complete languages.

Simply wrap the I/O of the non-utility model, and then assign the (possibly compound) action the agent will actually take in each timestep utility 1 and assign all other actions a utility 0 - and then take the highest utility action in each timestep.

That neatly converts almost any practical agent model into a utility-based model.

So: there is nothing "wrong" with utility-based models. A good job too - they are economics 101.

Comment author: whpearson 03 February 2010 11:22:06PM 1 point [-]

You get plenty of absurdities following this route. Like atoms are utility maximising agents that want to follow Brownian motion and are optimal!

Comment author: Mitchell_Porter 04 February 2010 01:07:57AM 1 point [-]

Or they want to move in straight lines forever but are suboptimal.

Comment author: timtyler 04 February 2010 09:16:58AM *  1 point [-]

You mean like the principle of least action...? ...or like the maximum entropy principle...?

Comment author: Cyan 04 February 2010 03:12:45PM 6 points [-]

Slapping the label "utility" on any quantity optimized in any situation adds zero content.

Comment author: timtyler 04 February 2010 08:40:52PM *  0 points [-]

It is not supposed to. "Utility" in such contexts just means "that which is optimized". It is terminology.

"That which is optimized" is a mouthful - "utility" is shorter.

Comment author: Cyan 04 February 2010 09:04:19PM *  1 point [-]

There's already a word for that: "optimand". The latter is the better terminology because (i) science-y types familiar with the "-and" suffix will instantly understand it and (ii) it's not in a name collision with another concept.

If "utility" is just terminology for "that which is optimized", then

It is this simplicity that makes the utility-based framework such an excellent general purpose model of goal-directed agents

is vacuous: goal-directed agents attempt to optimize something by definition.

Comment author: timtyler 04 February 2010 09:13:12PM *  0 points [-]

There's already a word for that: "optimand".

Right - but you can't say "expected optimand maximiser". There is a loooong history of using the term "utility" in this context in economics. Think you have better terminology? Go for it - but so far, I don't see much of a case.

Comment author: Cyan 04 February 2010 09:17:48PM *  0 points [-]

That would be the "other concept" (link edited to point to specific subsection of linked article) referred to in the grandparent.

Comment author: timtyler 04 February 2010 09:14:18PM *  -1 points [-]

Not "vacuous" - true. We have people saying that utility-based frameworks are "harmful". That needs correcting, is all.

Comment author: Cyan 04 February 2010 09:20:00PM *  0 points [-]

I suspect that by "utility-based frameworks" they mean something more specific than you do.

Comment author: timtyler 04 February 2010 09:45:15PM -1 points [-]

Maybe - but if suspicions are all you have, then someone is not being clear - and I don't think it is me.

Comment author: Cyan 04 February 2010 10:00:53PM *  2 points [-]

I find it hilarious that you think you're being perfectly clear and yet cannot be bothered to employ standard terminology.

Comment author: Johnicholas 03 February 2010 11:29:15PM 1 point [-]

Is this an argument in favor of using utility functions to model agents, or against?

Comment author: timtyler 04 February 2010 09:23:54AM 0 points [-]

It is just saying that you can do it - without much in the way of fuss or mess - contrary to the thesis of this post.

Comment author: Kaj_Sotala 04 February 2010 10:19:54AM 0 points [-]

Did you miss the second paragraph of the post?

To clarify: I don't dispute that you could, in principle, model a human's preferences by building some insanely complex utility function. But there are infinitely many ways to model a human's preferences. The question is which model is the most useful, and which models carry the fewest underlying assumptions that will lead your intuitions astray.

Comment author: timtyler 04 February 2010 10:25:13AM *  -1 points [-]

Did you miss the second paragraph of the post?

No, I didn't. My construction shows that the utility function need not be "insanely complex". Instead, a utility based model can be constructed that is only slightly more complex than the simplest possible model.

It is partly this simplicity that makes the utility-based framework such an excellent general purpose model of goal-directed agents - including, of course, humans.

Comment author: Kaj_Sotala 04 February 2010 10:45:28AM 1 point [-]

Wait, do you mean that your construction is simply acting as a wrapper on some underlying model, and converting the outputs of that model into a different format?

If that's what you mean, then well, sure. You could do that without noticeably increasing the complexity. But in that case the utility wrapping doesn't really give us any useful additional information, and it'd still be the underlying model we'd be mainly interested in.

Comment author: timtyler 04 February 2010 08:51:54PM *  -2 points [-]

The outputs from the utility based model would be the same as from the model it was derived from - a bunch of actuator/motor outputs. The difference would be the utility-maximizing action "under the hood".

Utility based models are most useful when applying general theorems - or comparing across architectures. For example when comparing the utility function of a human with that of a machine intelligence - or considering the "robustness" of the utility function to environmental perturbations.

If you don't need a general-purpose model, then sure - use a specific one, if it suits your purposes.

Please don't "bash" utility-based models, though. They are great! Bashers simply don't appreciate their virtues. There are a lot of utility bashers out there. They make a lot of noise - and AFAICS, it is all pointless and vacuous hot air.

My hypothesis is that they think that their brain being a mechanism-like expected utility maximiser somehow diminishes their awe and majesty. It's the same thing that makes people believe in souls - just one step removed.

Comment author: timtyler 04 February 2010 09:01:04PM *  -1 points [-]

Incidentally, I do not like writing "utility-based model" over and over again. These models should be called "utilitarian". We should hijack that term away from the ridiculous and useless definition used by the ethicists. They don't have the rights to this term.

Comment author: Kaj_Sotala 05 February 2010 07:48:23PM 0 points [-]

I don't think I understand what you're trying to describe here. Could you give an example of a scenario where you usefully transform a model into a utility-based one the way you describe?

I'm not bashing utility-based models, I'm quite aware of their good sides. I'm just saying they shouldn't be used universally and without criticism. That's not bashing any more than it's bashing to say that integrals aren't the most natural way to do matrix multiplication with.

Comment author: timtyler 05 February 2010 08:08:25PM *  0 points [-]

Could you give an example of a scenario where you usefully transform a model into a utility-based one the way you describe?

Call the original model M.

"Wrap" the model M - by preprocessing its sensory inputs and post-processing its motor outputs.

Then, post-process M's motor outputs - by enumerating its possible actions at each moment, assign utility 1 to the action corresponding to the action M output, and assign utility 0 to all other actions.

Then output the action with the highest utility.

I'm not bashing utility-based models, I'm quite aware of their good sides.

Check with your subject line. There are plenty of good reasons for applying utility functions to humans. A rather obvious one is figuring out your own utility function - in order to clarify your goals to yourself.
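
For concreteness, the wrapping construction described in the comment above can be sketched like this. The function names and the toy thermostat-like model are hypothetical, chosen only to have something small to wrap.

```python
def wrap_as_utility_maximiser(model, possible_actions):
    """Wrap an arbitrary policy `model(observation) -> action` as a
    utility maximiser, following the construction in the comment above.
    """
    def utility(observation, action):
        # Utility 1 for whatever the wrapped model would do, 0 otherwise.
        return 1 if action == model(observation) else 0

    def wrapped(observation):
        # Pick the action with the highest utility.
        return max(possible_actions, key=lambda a: utility(observation, a))

    return wrapped, utility


# Hypothetical toy model M: a thermostat-like rule.
def toy_model(temperature):
    return "heat" if temperature < 20 else "idle"


agent, utility = wrap_as_utility_maximiser(toy_model, ["heat", "idle"])
print(agent(15), agent(25))                       # heat idle
print(utility(15, "heat"), utility(15, "idle"))   # 1 0
```

Note that `utility` has to call `toy_model` to produce its answer, so in this sketch the wrapper's behavior is entirely determined by the model it wraps.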

Comment author: Kaj_Sotala 05 February 2010 08:37:03PM 0 points [-]

Okay, I'm with you so far. But what I was actually asking for was an example of a scenario where this wrapping gives us some benefit that we wouldn't have otherwise.

I don't think utility functions are a very good tool to use when seeking to clarify one's goals to oneself. Things like PJ Eby's writings have given me rather powerful insights into my goals, content which would be pointless to try to convert to the utility function framework.

Comment author: Jonathan_Graehl 03 February 2010 11:32:24PM 3 points [-]

I don't think that's the right wrapping.

Utilities are over outcomes, not decisions.

Decisions change the distribution of outcomes but rarely force a single absolutely predictable outcome. At the very least, your outcome is contingent on other actors' unpredictable effects.

Maybe you have some way of handling this in your wrapping; it's not clear to me.

This reminds me: often it seems like people think they can negotiate outcomes by combining personal utility functions in some way. Your quirky utility function is just one example of how it's actually in general impossible to do so without normalizing and weighting in some fair way the components of each person's claimed utility.

Comment author: timtyler 04 February 2010 09:22:46AM *  0 points [-]

Utilities are over outcomes, not decisions.

Utilities are typically scalars calculated from sensory inputs and memories - which are the sum total of everything the agent knows at the time.

Each utility is associated with one of the agent's possible actions at each moment.

The outcome is that the agent performs the "best" action (according to the utility function) - and then the rest of the world responds to it according to physical law. The agent can only control its actions. Outcomes are determined from them by physics and the rest of the world.

Decisions change the distribution of outcomes but rarely force a single absolutely predictable outcome. At the very least, your outcome is contingent on other actors' unpredictable effects.

...but an agent only takes one action at any moment (if you enumerate its possible actions appropriately). So this is a non-issue from the perspective of constructing a utility-based "wrapper".

Comment author: Jonathan_Graehl 04 February 2010 11:18:08PM *  0 points [-]

I personally feel happy or sad about the present state of affairs, including expectation of future events ("Oh no, my parachute won't deploy! I sure am going to hit the ground fast."). I can call how satisfied I am with the current state of things as I perceive it "utility". Of course, by using that word, it's usually assumed that my preferences obey some axioms, e.g. von Neumann-Morgenstern, which I doubt your wrapping satisfies in any meaningful way.

Perhaps there's some retrospective sense in which I'd talk about the true utility of the actual situation at the time (in hindsight I have a more accurate understanding of how things really were and what the consequences for me would be), but as for my current assessment it is indeed entirely a function of my present mental state (including perceptions and beliefs about the state of the universe salient to me). I think we agree on that.

I'm still not entirely sure I understand the wrapping you described. It feels like it's too simple to be used for anything.

Perhaps it's this: given the life story of some individual (call her Ray), you can vacuously (in hindsight) model her decisions with the following story:

1) Ray always acts so that the immediately resulting state of things has the highest expected utility. Ray can be thought of as moving through time and having a utility at each time, which must include some factor for her expectation of her future e.g. health, wealth, etc.

2) Ray is very stupid and forms some arbitrary belief about the result of her actions, expecting with 100% confidence that the predicted future of her life will come to pass. Her expectation in the next moment will usually turn out to revise many things she previously wrongly expected with certainty, i.e. she's not actually predicting the future exactly.

3) Whatever Ray believed the outcome would be at each choice, she assigned utility 1. To all other possibilities she assigned utility 0.

That's the sort of fully-described scenario that your proposal evoked in me. If you want to explain how she's forecasting more than a singleton expectation set, and yet the expected utility for each decision she takes magically works out to be 1, I'd enjoy that.

In other words, I don't see any point modeling intelligent yet not omniscient+deterministic decision making unless the utility at a given state includes an anticipation of expectation of future states.

Comment author: timtyler 05 February 2010 07:22:19AM *  0 points [-]

Of course, by using that word, it's usually assumed that my preferences obey some axioms, e.g. von Neumann-Morgenstern, which I doubt your wrapping satisfies in any meaningful way.

I certainly did not intend any such implication. Which set of axioms is using the word "utility" supposed to imply?

Perhaps check with the definition of "utility". It means something like "goodness" or "value". There isn't an obvious implication of any specific set of axioms.

Comment author: timtyler 05 February 2010 07:23:44AM *  0 points [-]

In other words, I don't see any point modeling intelligent yet not omniscient+deterministic decision making unless the utility at a given state includes an anticipation of expectation of future states.

There's no point in discussing "utility maximisers" - rather than "expected utility maximisers"?

I don't really agree - "utility maximisers" is a simple generalisation of the concept of "expected utility maximiser". Since there are very many ways of predicting the future, this seems like a useful abstraction to me.

...anyway, if you were wrapping a model of a human, the actions would clearly be based on predictions of future events. If you mean you want the prediction process to be abstracted out in the wrapper, obviously there is no easy way to do that.

You could claim that a human - while a "utility maximiser" - was not clearly an "expected utility maximiser". My wrapper doesn't disprove such a claim. I generally think that the "expected utility maximiser" claim is highly appropriate for a human as well - but there is not such a neat demonstration of this.

Comment author: RichardKennaway 04 February 2010 11:46:12PM -1 points [-]

The outcome is that the agent performs the "best" action (according to the utility function) - and then the rest of the world responds to it according to physical law. The agent can only control its actions. Outcomes are determined from them by physics and the rest of the world.

This is backwards. Agents control their perceptions, not their actions. They vary their actions in such a manner as to produce the perceptions they desire. There is a causal path from action to perception outside the agent, and another from perception (and desired perception) to action inside the agent.

It is only by mistakenly looking at those paths separately and ignoring their connection that one can maintain the stimulus-response model of an organism (whether of the behaviourist or cognitive type), whereby perceptions control actions. But the two are bound together in a loop, whose properties are completely different: actions control perceptions. The loop as a whole operates in such a way that the perception takes on whatever value the agent intends it to. The action varies all over the place, while the perception hardly changes. The agent controls its perceptions by means of its actions; the environment does not control the agent's actions by means of the perceptions it supplies.
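A toy closed loop may make the claim easier to examine. The sketch below is an assumption-laden illustration, not a model of any real organism: the environment is reduced to `perception = action + disturbance`, the agent is a simple integrating controller, and the names `simulate`, `setpoint` and `gain` are invented for the example.

```python
def simulate(setpoint, disturbances, gain=0.5):
    """Run a minimal perception-action loop and record both variables."""
    action = 0.0
    actions, perceptions = [], []
    for disturbance in disturbances:
        # Path outside the agent: the action (plus disturbance) produces the perception.
        perception = action + disturbance
        actions.append(action)
        perceptions.append(perception)
        # Path inside the agent: the gap between desired and actual perception
        # adjusts the action for the next step.
        action += gain * (setpoint - perception)
    return actions, perceptions


# Three phases with different (hypothetical) disturbances, 50 steps each.
disturbances = [0.0] * 50 + [5.0] * 50 + [-3.0] * 50
actions, perceptions = simulate(setpoint=10.0, disturbances=disturbances)

# Compare the settled values at the end of each phase.
for end in (49, 99, 149):
    print(round(actions[end], 6), round(perceptions[end], 6))
```

In this sketch the settled action differs from phase to phase, because it has to cancel whatever the disturbance is, while the settled perception returns to the setpoint each time. The perception does jump briefly whenever the disturbance changes, before the loop pulls it back.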

Comment author: Cyan 05 February 2010 12:17:53AM *  2 points [-]

The agent can only control its actions.

Agents control their perceptions, not their actions.

"Control" is being used in two different senses in the above two quotes. In control theory parlance, timtyler is saying that actions are the manipulated variable, and you're saying that perceptions are the process variable.

Comment author: timtyler 05 February 2010 07:18:15AM *  2 points [-]

"This is backwards. Agents control their perceptions, not their actions."

Um. Agents do control their actions.

I am well aware of the perception-action feedback - but what does it have to do with this discussion?

Comment author: RichardKennaway 05 February 2010 06:33:02PM -1 points [-]

I am well aware of the perception-action feedback - but what does it have to do with this discussion?

It renders wrong the passage that I quoted above. You have described agents as choosing an outcome (from utility calculations, which I'd dispute, but that's not the point at issue here) deciding on an action which will produce that outcome, and emitting that action, whereupon the world then produces the chosen outcome. Agents, that is, in the grip of the planning fallacy.

Planning plays a fairly limited role in human activity. An artificial agent designed to plan everything will do nothing useful. "No plan of battle survives contact with the enemy." "What you do changes who you are." "Life is what happens when you're making other plans." Etc.

Comment author: timtyler 05 February 2010 06:47:40PM *  0 points [-]

I don't know what you are thinking - but it seems fairly probable that you are still misinterpreting me - since your first paragraph contains:

You have described agents as choosing an outcome [...] deciding on an action which will produce that outcome, and emitting that action

...which appears to me to have rather little to do with what I originally wrote.

Rather, agents pick an action to execute, enumerate their possible actions, have a utility (1 or 0) assigned to each action by the I/O wrapper I described, select the highest utility action and then pass that on to the associated actuators.

Notice the lack of mention of outcomes here - in contrast to your description.

I stand by the passage that you quoted above, which you claim is wrong.

Comment author: RichardKennaway 05 February 2010 09:42:59PM 0 points [-]

In that case, I disagree even more. The perceived outcome is what matters to an agent. The actions it takes to get there have no utility attached to them; if utility is involved, it attaches to the perceived outcomes.

I continue to be perplexed that you take seriously the epiphenomenal utility function you described in these words:

Simply wrap the I/O of the non-utility model, and then assign the (possibly compound) action the agent will actually take in each timestep utility 1 and assign all other actions a utility 0 - and then take the highest utility action in each timestep.

and previously here. These functions require you to know what action the agent will take in order to assign it a utility. The agent is not using the utility to choose its action. The utility function plays no role in the agent's decision process.

Comment author: timtyler 05 February 2010 09:59:58PM *  0 points [-]

The utility function plays no role in the agent's decision process.

The utility function determines what the agent does. It is the agent's utility function.

Utilities are numbers. They are associated with actions - that association is what allows utility-based agents to choose between their possible actions.

The actions produce outcomes - so, the utilities are also associated with the relevant outcomes.

Comment author: RichardKennaway 04 February 2010 11:38:38PM 0 points [-]

This does not work. The trivial assignment of 1 to what happens and 0 to what does not happen is not a model of anything. A real utility model would enable you to evaluate the utility of various actions in order to predict which one will be performed. Your fake utility model requires you to know the action that was taken in order to evaluate its utility. It enables no predictions. It is not a model at all.

Comment author: timtyler 05 February 2010 07:15:49AM -3 points [-]

No. You just didn't understand it. Perhaps re-read.

Comment author: Matt_Simpson 03 February 2010 11:01:23PM 3 points [-]

Are you questioning that we can model human behavior using a utility function (i.e. microeconomics) or that we can model human values using a utility function? Or both? The former is important if you're trying to predict what a human would do, the second is important if you're trying to figure out what humans should do - or what you want an AGI to do.

Comment author: Kaj_Sotala 04 February 2010 10:30:14AM 0 points [-]

I was mainly thinking about values, but behavior is suspect as well. (Though I gather that some of the use of utility functions for modeling human behavior has been relatively successful in economics.)

Comment author: Matt_Simpson 04 February 2010 03:56:28PM *  4 points [-]

I spent a minute trying to think of a reply arguing for utility functions as models of human values, but I think that's wrong. I'm really agnostic about the type of preference structure human values have, and I think I'm going to stop saying "utility function" and start saying "preferences" or the more awkward "something like a utility function" to indicate this agnosticism.

When it comes to econ, utility theory is clearly a false model of human behavior (how many models aren't false?), but its simplicity is appealing. As mattnewport alludes to, alternative theories usually don't improve predictions enough to be worth the substantial increase in complexity they typically entail. At least that's my impression.

Comment author: thomblake 04 February 2010 04:09:02PM 3 points [-]

how many models aren't false

I'm wondering how a model can be "false". It seems like simply "bad" would be more appropriate.

Perhaps if the model gets you less accurate results than some naive model, or guessing.

I've been thinking a lot lately of treating ethical theories as models... I might have to write a paper on this, including some unpacking of "model". Perhaps I'll start with some top-level posts.

Comment author: bgrah449 04 February 2010 04:39:30PM 3 points [-]
Comment author: Matt_Simpson 04 February 2010 05:13:13PM *  5 points [-]

By a false model, all I mean is a model that isn't exactly the same as the reality it's supposed to model. It's probably a useless notion (except for maybe in theoretical physics?), but some people see textbook econ and think "people aren't rational, therefore textbook economics is wrong, therefore my favorite public policy will work." The last step isn't always there or just a single step, but it's typically the end result. I've gotten into the habit of making the "all models are false" point when discussing economic models just to combat this mindset.

In general, it distresses me that so few people understand that scientists create maps, not exact replicas of the territory.

Treating ethical theories as models seems so natural now that you mention it. We have some preference structure that we know very little about. What should we do? The same thing we did with all sorts of phenomena that we knew very little about - model it!

Comment author: Kaj_Sotala 04 February 2010 05:57:48PM *  0 points [-]

I've been thinking a lot lately of treating ethical theories as models...

Any relation to my thoughts of ethical theories as models?

http://lesswrong.com/lw/18l/ethics_as_a_black_box_function/

http://lesswrong.com/lw/18l/ethics_as_a_black_box_function/14ha

Comment author: thomblake 04 February 2010 06:05:37PM 0 points [-]

Sure.

The three-tier way of looking at it is interesting, but I'll definitely be approaching it from the perspective of someone taking a theoretical approach to the study of ethics. The end result, hopefully, will be something written for such people.

Comment author: Jonathan_Graehl 03 February 2010 11:38:11PM 1 point [-]

What's the risk in using a more static view of utility or preference in computing CEV?

My initial thought: fine, some people will be less pleased at various points in the future than they would have been. But a single dominant FAI effectively determining our future is already a compromise from what people would most prefer.

Comment author: Johnicholas 03 February 2010 11:42:23PM 7 points [-]

There's a gap between the general applicability of utility functions in theory, and their general inapplicability in practice. Indeed, there's a general gap between theory and practice.

I would argue that this gap is a reason to do FAI research in a practical way - writing code, building devices, performing experiments. Dismissing gritty practicality as "too risky" or "not relevant yet" (which is what I hear SIAI doing) seems to lead to becoming a group without experience and skill at executing practical tasks.

Disclaimer: I'm aware that many FAI enthusiasts fall into the "Striving with all my hacker strength to build a self-improving friendly AI is FAI research, right?" error. That's NOT what I'm advocating.

Comment author: loqi 04 February 2010 08:38:01AM -1 points [-]

Any such "experiments" that allow for effective outbound communication from a proto-AI seem unacceptably risky. I'm curious what you think of the "oh crap, what if it's right?" scenario I commented on over on the AI box post.

Comment author: Johnicholas 04 February 2010 10:44:37PM 1 point [-]

I didn't SAY try to build a self-improving AI! That's what the disclaimer was for!

Also, your claim of "unacceptably risky" needs actual arguments and reasoning to support it. As I see it, the only choice that is clearly unacceptably risky is inaction. Carefully confining your existential risk reduction activity to raising awareness about potential AI risks isn't in any sense safe - for example, it could easily cause more new uFAI projects than it prevents.

Comment author: loqi 05 February 2010 04:12:00AM *  1 point [-]

Raising awareness about the problem isn't just about getting would-be uFAI'ers to mend their sinful ways, you know. It's absolutely necessary if you're convinced you need help with it. As you said, inaction is untenable. If you're certain that a goal of this magnitude is basically impossible given the status quo, taking some initial risks is a trivial decision. It doesn't follow that additional risks share the same justification.

I'm also not convinced we understand the boundaries between "intelligent" and "self-improving" well enough to assume we can experiment with one and not the other. What sort of "practical tasks" do you have in mind that don't involve potentially intelligent information-processing systems, and why do you think they'll be at all relevant to the "real" work ahead?

Comment author: Nick_Tarleton 04 February 2010 11:12:33PM 4 points [-]

What sort of code, devices, experiments do you have in mind?

Comment author: Johnicholas 05 February 2010 12:02:38PM 2 points [-]

MBlume's article "Put It To The Test" is pretty much what I have in mind.

If you think you understand a decision theory, can you write a test suite for an implementation of it? Can your test suite pass a standard implementation, and fail mutations of that standard implementation? Can you implement it? Is the performance of your implementation within a factor of ten-thousand of the standard implementation? Is it competitive? Can you improve the state of the art?
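As a very small illustration of the first two questions, here is a sketch of a test suite that passes a reference implementation and fails a mutation of it. Everything here is an assumption made for the example: the "decision theory" is plain expected-utility maximisation over finite lotteries, and the names `expected_utility_choice`, `mutant_choice` and `run_suite` are hypothetical.

```python
def expected_utility_choice(lotteries, utility):
    """Reference implementation: pick the lottery with the highest expected utility."""
    def expected_utility(lottery):
        return sum(p * utility(outcome) for outcome, p in lottery)
    return max(lotteries, key=lambda name: expected_utility(lotteries[name]))


def mutant_choice(lotteries, utility):
    """Mutation: ignores probabilities and picks the lottery with the best single outcome."""
    return max(lotteries, key=lambda name: max(utility(o) for o, _ in lotteries[name]))


def identity(x):
    """Risk-neutral utility used by the test cases."""
    return x


def run_suite(choose):
    """Return True if `choose` gives the expected answer on every case."""
    cases = [
        # Each lottery is a list of (outcome, probability) pairs.
        ({"safe": [(10, 1.0)], "gamble": [(100, 0.05), (0, 0.95)]}, "safe"),
        ({"safe": [(10, 1.0)], "gamble": [(100, 0.5), (0, 0.5)]}, "gamble"),
    ]
    return all(choose(lotteries, identity) == expected for lotteries, expected in cases)


print(run_suite(expected_utility_choice), run_suite(mutant_choice))
```

The first case is what separates the two implementations: the gamble has the larger best outcome but the smaller expected value. A serious suite would need many more cases, and more interesting mutations, than this.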

If you believe that the safe way to write code is to spend a long time in front of whiteboards, getting the design right, and then only a very short time developing (using a few high-IQ programmers) - How many times have you built projects according to this development process? What is your safety record? How does it compare to other development processes?

If you believe that writing machine-checkable proofs about code is important - Can you download and install one of the many tools (e.g. Coq) for writing proofs about code? Can you prove anything correct? What projects have you proved correct? What is their safety record?

What opportunities have you given reality to throw wrenches into your ideas - how carefully have you looked for those wrenches?

Comment author: RichardKennaway 04 February 2010 11:24:14PM 0 points [-]

Utility functions are a good model to use if we're talking about designing an AI. We want an AI to be predictable, to have stable preferences, and do what we want.

Why would these desirable features be the result? It reads to me as if you're saying that this is a solution to the Friendly AI problem. Surely not?

Comment author: PhilGoetz 08 October 2011 05:01:36PM 0 points [-]

I am afraid he probably does. That's the Yudkowskian notion of "friendly". Not a very good word to describe it, IMHO.

Comment author: Qiaochu_Yuan 29 November 2012 01:50:01AM 6 points [-]

When I read the beginning of this post I asked myself, "if people don't have utility functions, why haven't LWers gotten rich by constructing Dutch books against people?"

I answered myself, "in practice, most people will probably ignore clever-looking bets because they'll suspect that they're being tricked. One way to avoid Dutch books is to avoid bets in general."
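To make the two halves of that exchange concrete, here is a sketch of a money pump, a close relative of the Dutch book, run against an agent with cyclic preferences. The preference cycle, the fee, and the names `PREFERS` and `money_pump` are all assumptions invented for the example.

```python
# (x, y) means the agent prefers holding x to holding y. The cycle A > B > C > A
# cannot be represented by any utility function.
PREFERS = {("A", "B"), ("B", "C"), ("C", "A")}


def money_pump(start_item, wealth, fee, rounds, accepts_trades=True):
    """Repeatedly offer the agent an item it prefers to its current one, for a fee."""
    item = start_item
    for _ in range(rounds):
        if not accepts_trades:
            # An agent that refuses all offers cannot be pumped.
            break
        # The item the agent prefers to what it currently holds.
        item = next(x for x, y in PREFERS if y == item)
        wealth -= fee
    return item, wealth


print(money_pump("A", wealth=100, fee=1, rounds=3))
print(money_pump("A", wealth=100, fee=1, rounds=3, accepts_trades=False))
```

After three trades the willing agent holds the item it started with and is three fees poorer, while the agent that declines every offer keeps both its item and its money.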