
The Human's Hidden Utility Function (Maybe)

Post author: lukeprog 23 January 2012 07:39PM 44 points

Suppose it turned out that humans violate the axioms of VNM rationality (and therefore don't act like they have utility functions) because there are three valuation systems in the brain that make conflicting valuations, and all three systems contribute to choice. And suppose that upon reflection we would clearly reject the outputs of two of these systems, whereas the third system looks something more like a utility function we might be able to use in CEV.

What I just described is part of the leading theory of choice in the human brain.

Recall that human choices are made when certain populations of neurons encode expected subjective value (in their firing rates) for each option in the choice set, with the final choice being made by an argmax or reservation price mechanism.
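As a minimal sketch of that final stage (purely illustrative - the option names, values and threshold below are invented, and the real work of computing the values happens upstream in the systems described next):

    # Sketch of the final choice stage only: values arrive pre-computed,
    # and the choice circuit applies argmax or a reservation-price rule.

    def argmax_choice(values):
        """Pick the option with the highest subjective value."""
        return max(values, key=values.get)

    def reservation_price_choice(options_in_order, values, threshold):
        """Accept the first option whose value clears the reservation price."""
        for option in options_in_order:
            if values[option] >= threshold:
                return option
        return None  # nothing clears the threshold

    subjective_values = {"apple": 0.7, "cheese": 1.2, "lever_press": 0.4}
    print(argmax_choice(subjective_values))                                       # cheese
    print(reservation_price_choice(["apple", "cheese"], subjective_values, 1.0))  # cheese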

Today's news is that our best current theory of human choices says that at least three different systems compute "values" that are then fed into the final choice circuit:

  • The model-based system "uses experience in the environment to learn a model of the transition distribution, outcomes and motivationally-sensitive utilities." (See Sutton & Barto 1998 for the meanings of these terms in reinforcement learning theory.) The model-based system also "infers choices by... building and evaluating the search decision tree to work out the optimal course of action." In short, the model-based system is responsible for goal-directed behavior. However, making all choices with a goal-directed system using something like a utility function would be computationally prohibitive (Daw et al. 2005), so many animals (including humans) first evolved much simpler methods for calculating the subjective values of options (see below).

  • The model-free system also learns a model of the transition distribution and outcomes from experience, but "it does so by caching and then recalling the results of experience rather than building and searching the tree of possibilities. Thus, the model-free controller does not even represent the outcomes... that underlie the utilities, and is therefore not in any position to change the estimate of its values if the motivational state changes. Consider, for instance, the case that after a subject has been taught to press a lever to get some cheese, the cheese is poisoned, so it is no longer worth eating. The model-free system would learn the utility of pressing the lever, but would not have the informational wherewithal to realize that this utility had changed when the cheese had been poisoned. Thus it would continue to insist upon pressing the lever. This is an example of motivational insensitivity."

  • The Pavlovian system, in contrast, calculates values based on a set of hard-wired preparatory and consummatory "preferences." Rather than calculate value based on what is likely to lead to rewarding and punishing outcomes, the Pavlovian system calculates values consistent with automatic approach toward appetitive stimuli, and automatic withdrawal from aversive stimuli. Thus, "animals cannot help but approach (rather than run away from) a source of food, even if the experimenter has cruelly arranged things in a looking-glass world so that the approach appears to make the food recede, whereas retreating would make the food more accessible (Hershberger 1986)."

Or, as Jandila put it:

  • Model-based system: Figure out what's going on, and what actions maximize returns, and do them.
  • Model-free system: Do the thingy that worked before again!
  • Pavlovian system: Avoid the unpleasant thing and go to the pleasant thing. Repeat as necessary.

In short:

We have described three systems that are involved in making choices. Even in the case that they share a single, Platonic, utility function for outcomes, the choices they express can be quite different. The model-based controller comes closest to being Platonically appropriate... The choices of the model-free controller can depart from current utilities because it has learned or cached a set of values that may no longer be correct. Pavlovian choices, though determined over the course of evolution to be appropriate, can turn out to be instrumentally catastrophic in any given experimental domain...

[Having multiple systems that calculate value] is [one way] of addressing the complexities mentioned, but can lead to clashes between Platonic utility and choice. Further, model-free and Pavlovian choices can themselves be inconsistent with their own utilities.

We don't yet know how choice results from the inputs of these three systems, nor how the systems might interact before they deliver their value calculations to the final choice circuit, nor whether the model-based system really uses anything like a coherent utility function. But it looks like humans might have a "hidden" utility function that would reveal itself if they weren't also using the computationally cheaper model-free and Pavlovian systems to help determine choice.

At a glance, it seems that upon reflection I might embrace an extrapolation of the model-based system's preferences as representing "my values," and I would reject the outputs of the model-free and Pavlovian systems as the outputs of dumb systems that evolved for their computational simplicity, and can be seen as ways of trying to approximate the full power of a model-based system responsible for goal-directed behavior.

On the other hand, as Eliezer points out, perhaps we ought to be suspicious of this, because "it sounds like the correct answer ought to be to just keep the part with the coherent utility function in CEV which would make it way easier, but then someone's going to jump up and say: 'Ha ha! Love and friendship were actually in the other two!'"

Unfortunately, it's too early to tell whether these results will be useful for CEV. But it's a little promising. This is the kind of thing that sometimes happens when you hack away at the edges of hard problems. This is also a repeat of the lesson that "you can often out-pace most philosophers simply by reading what today's leading scientists have to say about a given topic instead of reading what philosophers say about it."

(For pointers to the relevant experimental data, and for an explanation of the mathematical role of each valuation system in the brain's reinforcement learning system, see Dayan (2011). All quotes in this post are from that chapter, except for the last one.)

Comments (87)

Comment author: Yvain 24 January 2012 06:03:05PM *  23 points [-]

This is also a repeat of the lesson that "you can often out-pace most philosophers simply by reading what today's leading scientists have to say about a given topic instead of reading what philosophers say about it."

On the other hand, rationality can be faster than science. And I'm feeling pretty good about positing three different forms of motivation, divided between model-free tendencies based on conditioning, and model-based goals, then saying we could use transhumanism to focus on the higher-level rational ones, without having read the particular neuroscience you're citing...

...actually, wait. I read as much of the linked paper as I could (Google Books hides quite a few pages) and I didn't really see any strong neuroscientific evidence. It looked like they were inferring the existence of the three systems from psychology and human behavior, and then throwing in a bit of neuroscience by mentioning some standard results like the cells that represent error in reinforcement learning. What I didn't see was a description of how three separate systems naturally fall out of brain studies. But I missed a lot of the paper - is there anything like that in there?

Comment author: lukeprog 25 January 2012 09:50:35PM 8 points [-]

What I didn't see was a description of how three separate systems naturally fall out of brain studies. But I missed a lot of the paper - is there anything like that in there?

Some, yes. I've now updated the link in the OP so it points to a PDF of the full chapter.

Comment author: Nick_Beckstead 07 February 2012 05:35:23PM 9 points [-]

What's the evidence that this is the "leading theory of choice in the human brain"? (I am not saying I have evidence that it isn't, but it's important for this post that some large relevant section of the scientific community thinks this theory is awesome.)

Comment author: Vladimir_Nesov 23 January 2012 08:58:34PM *  8 points [-]

Suppose it turned out that humans violate the axioms of VNM rationality (and therefore don't act like they have utility functions) because there are three valuation systems in the brain that make conflicting valuations

Humans violate any given set of axioms simply because they are not formally flawless, so such explanations only start being relevant when discussing an idealization, in this case a descriptive one. But properties of descriptive idealizations don't easily translate into properties of normative idealizations.

Comment author: Alicorn 23 January 2012 07:59:13PM 7 points [-]

The quoted summaries of each of the three systems are confusing and I don't feel like I have an understanding of them, except insofar as the word "Pavlovian" gives a hint. Can you translate more clearly, please?

Comment author: [deleted] 23 January 2012 08:11:25PM 29 points [-]

Or, to put it more simply:

  1. Figure out what's going on, and what actions maximize returns, and do them.
  2. Do the thingy that worked before again!
  3. Avoid the unpleasant thing and go to the pleasant thing. Repeat as necessary.
Comment author: shminux 23 January 2012 09:22:15PM 5 points [-]

Maybe give Luke a lesson or two on C^3 (clear, concise and catchy) summaries.

Comment author: lukeprog 23 January 2012 10:05:34PM 0 points [-]

Note that I wrote this post in two hours flat and made little attempt to optimize presentation in this case.

Comment author: shminux 23 January 2012 10:58:36PM *  10 points [-]

Sorry, I did not intend my comment to rub you the wrong way (or any of my previous comments that might have). FWIW, I think that you are doing a lot of good stuff for the SIAI, probably most of it invisible to an ordinary forum regular. I realize that you cannot afford to spend an extra two hours per post on polishing the message. Hopefully one of the many skills of your soon-to-be-hired executive assistant will be that of "optimizing presentation".

Comment author: lukeprog 23 January 2012 10:59:51PM 5 points [-]

No worries!

Comment author: MACHISMO 26 January 2012 09:49:01PM 4 points [-]

Indeed. Much invisible work is required before optimization can occur. Invisible forging of skills precedes their demonstration.

Comment author: lukeprog 26 January 2012 05:56:18PM 3 points [-]
Comment author: TheOtherDave 26 January 2012 07:01:43PM 1 point [-]

It might be an interesting exercise to record predictions, in a hidden-but-reliable form, about the karma of posts six months out, by way of calibrating one's sense of how well-received those posts will be by their target community.

Comment author: Swimmer963 23 January 2012 11:29:08PM 0 points [-]

It's still better than the posts I write in 2 hours! Did that 2 hours include the time spent researching, or were you just citing sources you'd already read for other reasons? In either case...not bad.

Comment author: lukeprog 23 January 2012 08:25:30PM 5 points [-]

Added to the original post, credit given.

Comment author: JoachimSchipper 24 January 2012 12:11:02PM 4 points [-]

Could you put it before the hard-to-parse explanations? It was nice to confirm my understanding, but it would have saved me a minute or two of effort if you'd put those first.

Comment author: Yvain 24 January 2012 04:02:27AM *  2 points [-]

Is 2 operant/Skinnerian conditioning, and 3 classical/Pavlovian conditioning?

Comment author: [deleted] 24 January 2012 07:44:46AM 3 points [-]

If by "is" you mean "Do these correspond the underlying cognitive antecedents used in...", then my answer is "it would seem so."

Comment author: [deleted] 23 January 2012 08:08:39PM 6 points [-]

The first one incorporates information about past experiences into simplified models of the world, and then uses the models to steer decisions through search-space based upon a sort of back-of-the-envelope, hazy calculation of expected value. It's a utility function, basically, as implemented by the brain.

The second one also incorporates information about past experiences, but rather than organizing that data into a model and performing searches over it, it derives expectations directly from what's remembered, and is insensitive to things like probability or shifting subjective values.

The third one is sort of like the first in its basic operations (incorporate information, analyze it, make models) -- but instead of calculating expected values, it aims to satisfy various inbuilt "drives", and sorts paths through search space based upon approach/avoid criteria linked to those drives.

Comment author: lukeprog 23 January 2012 08:19:18PM *  0 points [-]

I like Jandila's explanations.

Comment author: BrianNachbar 27 January 2012 03:04:40PM 5 points [-]

Where do the model-based system's terminal goals come from?

Comment author: Eliezer_Yudkowsky 23 January 2012 10:32:28PM 12 points [-]

Um, objection, I didn't actually say that and I would count the difference as pretty significant here. I said, "I would be suspicious of that for the inverse reason my brain wants to say 'but there has to be a different way to stop the train' in the trolley problem - it sounds like the correct answer ought to be to just keep the part with the coherent utility function in CEV which would make it way easier, but then someone's going to jump up and say: 'Ha ha! Love and friendship were actually in the other two!'"

Comment author: lukeprog 23 January 2012 10:42:52PM *  6 points [-]

What? You said that? Sorry, I didn't mean to misquote you so badly. I'll blame party distractions or something. Do you remember the line about a gift basket and it possibly making CEV easier?

Anyway, I'll edit the OP immediately to remove the misquote.

For reference, the original opening to this post was:

Me: "Suppose it turned out that humans violate the axioms of VNM rationality (and therefore don't act like they have utility functions) because there are three valuation systems in the brain that make conflicting valuations, and all three systems contribute to choice. And suppose that upon reflection we would clearly reject the outputs of two of these systems, whereas the third system looks something more like a utility function. How would you feel?"

Eliezer: "I would feel like someone had left an enormous gift basket at my front door. That could make CEV easier."

Me: "Okay, well, what I just described is part of the leading theory of choice in the human brain."

Comment author: cousin_it 23 January 2012 09:44:19PM *  11 points [-]

Congratulations on continuing this line of inquiry!

One thing that worries me is that it seems to focus on the "wanting" part to the exclusion of the "liking" part, so we may end up in a world we desire today but won't enjoy tomorrow. In particular, I suspect that a world built according to our publicly stated preferences (which is what many people seem to think when they hear "reflective equilibrium") won't be very fun to live in. That might happen if we get much of our fun from instinctive and Pavlovian actions rather than planned actions, which seems likely to be true for at least some people. What do you think about that?

Comment author: lukeprog 23 January 2012 10:02:16PM 9 points [-]

I think that upon reflection, we would desire that our minds be designed in such a way that we get pleasure from getting the things we want, or pleasure whenever we want, or something — instead of how the system is currently set up, where we can't always choose when we feel good and we only sometimes feel good as a result of getting what we want.

Comment author: Multiheaded 24 January 2012 05:34:35PM 0 points [-]

Yeah, I agree. I said that we should, in principle, rewire ourselves for this very reason in Bakkot's (in)famous introduction thread, but Konkvistador replied he's got reasons to be suspicious and fearful about such an undertaking.

Comment author: Bugmaster 25 January 2012 04:28:02AM 3 points [-]

This might be a silly question, but still:

Are the three models actually running on three different sets of wetware within the brain, or are they merely a convenient abstraction of human behavior?

Comment author: BrianNachbar 27 January 2012 07:39:32PM 0 points [-]

I think what matters is whether they're concurrent—which it sounds like they are. Basically, whether they're more or less simultaneous and independent. If you were emulating a brain on a computer, they could all be on one CPU, or on different ones, and I don't think anyone would suggest that the em on the single CPU should get a different CEV than an identical one on multiple CPUs.

Comment author: Bugmaster 27 January 2012 07:56:36PM 3 points [-]

I was really more interested in whether or not we can observe these models running independently in real, currently living humans (or chimps or rats, really). This way, we could gather some concrete evidence in favor of this three-model approach; and we could also directly measure how strongly the three models are weighted relative to each other.

Comment author: MaoShan 16 February 2012 03:24:50AM -1 points [-]

If you could reduce the cognitive cost of the model-based system by designing a "decision-making app", you could directly test whether it was beneficial and actually (subjectively or otherwise) improved subjects' lives. If it was successful, you'd have a good chance of beta-testing a real CEV.

Comment author: jimmy 24 January 2012 12:35:29AM *  3 points [-]

I'm skeptical of any clear divide between the systems. Of course, there are more abstract and more primitive information paths, but they talk to each other, and I don’t buy that they can be cleanly separated.

Plans can be more or less complicated, and can involve “I don’t know how this part works, but it worked last time, so let’s do this” - and what worked last time can be very pleasurable and rewarding - so it doesn’t seem to break down cleanly into any one category.

I’d also argue that, to the extent that abstract planning is successful, it is because it propagates top down and affects the lower Pavlovian systems. If your thoughts about your project aren’t associated with motivation and wanting to actually do something, then your abstract plans aren’t of much use. It just isn’t salient that this is happening unless the process is disrupted and you find yourself not doing what you “want” to do.

Another point that is worth stating explicitly is that algorithms for maximizing utility are not utility functions. In theory, you could have 3 different optimizers that all maximize the same utility function, or 3 different utility functions that all use the same optimizer - or any other combination.

I don’t think this is a purely academic distinction either - I think that we have conflicts at the same level all the time (multiple personality disorder being an extreme case). Conflicts between systems with no crosstalk between levels look like someone saying they want one thing and then doing another without looking bothered at all. When someone is obviously pained by the conflict, then both sides are clearly operating on an emotional level, even if the signals originated in different places. Or I could create a Pavlovian conflict in my dog by throwing a steak on the wet tile, and watching as his conditioned fear of wet tile fights his conditioned desire for the steak.

Comment author: Vladimir_Nesov 23 January 2012 08:42:40PM *  3 points [-]

It seems to me that the actual situation is that upon reflection we would clearly reject (most of) the outputs of all three systems. What the human brain actually computes, in any of its modules or in all of them together, is not easily converted into considerations about how the decisions should be made.

In other words, the valuations made by human valuation systems are irrelevant, even though the only plausible solution involves valuations based on human valuation systems. And converting brains into definitions of value will likely break any other abstractions about the brains that theorize them as consisting of various modules with various purposes.

Comment author: lukeprog 23 January 2012 09:02:22PM *  1 point [-]

I said that " it seems that upon reflection I would embrace an extrapolation of the model-based system's preferences as representing 'my values'."

Which does, in fact, mean that I would reject "most of the outputs of all three systems."

Note: I've since changed "would" to "might" in that sentence.

Comment author: Vladimir_Nesov 23 January 2012 09:14:00PM *  1 point [-]

I said that " it seems that upon reflection I would embrace an extrapolation of the model-based system's preferences as representing 'my values'."

OK, didn't notice that; I was referring more to the opening dialog. Though "extrapolation" still doesn't seem to fit, because brain "modules" are not the same kind of thing as goals. A two-step process where first you extract "current preferences" and then "extrapolate" them is likely not how this works, so positing that you get the final preferences somehow, starting from the brains, is weaker (and correspondingly better, in the absence of knowledge of how this is done).

Comment author: lukeprog 23 January 2012 10:03:16PM 0 points [-]

I agree that the two-step process may very well not work. This is an extremely weak and preliminary result. There's a lot more hacking at the edges to be done.

Comment author: Vladimir_Nesov 23 January 2012 10:43:59PM 1 point [-]

I agree that the two-step process may very well not work. This is an extremely weak and preliminary result.

What are you referring to by "this" in the second sentence? I don't think there is a good reason to posit the two-step process, so if this is what you refer to, what's the underlying result, however weak and preliminary?

Comment author: lukeprog 23 January 2012 10:49:46PM 0 points [-]

By "this" I meant the content of the OP about the three systems that contribute to choice.

Comment author: Vladimir_Nesov 23 January 2012 10:55:29PM 0 points [-]

OK, in that case I'm confused, since I don't see any connection between the first and the second sentences...

Comment author: lukeprog 23 January 2012 10:59:08PM 2 points [-]

Let me try again:

Two-step process = (1) Extract preferences, (2) Extrapolate preferences. This may not work. This is one reason that this discovery about three valuation systems in the brain is so weak and preliminary for the purposes of CEV. I'm not sure it will turn out to be relevant to CEV at all.

Comment author: Vladimir_Nesov 23 January 2012 11:31:16PM *  5 points [-]

I see, so the two-step thing acts as a precondition. Is it right that you are thinking of descriptive idealization/analysis of the human brain as a path that might lead to a definition of "current" (extracted) preferences, which is then to be corrected by "extrapolation"? If so, that would clarify for me your motivation for hoping to get anything FAI-relevant out of neuroscience: the extrapolation step would correct the fatal flaws of the extraction step.

(I think the extrapolation step (in this context) is magic that can't work, and instead analysis of the human brain must extract/define the right decision problem "directly", that is formally/automatically, without losing information during the descriptive idealization performed by humans, which any object-level study of neuroscience requires.)

Comment author: lukeprog 24 January 2012 12:37:02AM 4 points [-]

Extraction + extrapolation is one possibility, though at this stage in the game it still looks incoherent to me. But sometimes things look incoherent before somebody smart comes along and makes them coherent and tractable.

Another possibility is that an FAI uploads some subset of humans and has them reason through their own preferences for a million subjective years and does something with their resulting judgments and preferences. This might also be basically incoherent.

Another possibility is that a single correct response to preferences falls out of game theory and decision theory, as Drescher attempts in Good and Real. This might also be incoherent.

Comment author: pjeby 23 January 2012 11:07:15PM 2 points [-]

I think you've also missed the possibility that all three "systems" might just be the observably inconsistent behavior of one system in different edge cases, or at least that the systems are far more entangled and far less independent than they seem.

(I think you may have also ignored the part where, to the extent that the model-based system has values, they are often more satisficing than maximizing.)

Comment author: FiftyTwo 23 January 2012 09:13:00PM 6 points [-]

I'm not sure I understand the difference between 2 and 3. The term "Pavlovian" is being applied to the third system, but 2 sounds more like the archetypal Pavlovian learned response (dog learns that bell results in food). Does 3 refer exclusively to pre-encoded pleasant/unpleasant responses rather than learned ones? Or is there maybe a distinction between a value and an action response that I'm missing?

Comment author: Swimmer963 23 January 2012 11:24:04PM 1 point [-]

It appears to me like 3 is only pre-encoded preferences, whereas 2 refers to preferences that are learned in an automatic, "reflex-like" way...which, yeah, sounds a lot like the Pavlovian learned response.

Comment author: Multiheaded 24 January 2012 05:35:37PM *  2 points [-]

  'Ha ha! Love and friendship were actually in the other two!'

This concern is not abstract and very personal for me. As I've said around here before, I often find myself exhibiting borderline-sociopathic thinking in many situations, but the arrangement of empathy and ethical inhibitions in my brain, though off-kilter in many ways*, drives me to take even abstract ethical problems (LW examples: Three Worlds Collide, dust specks, infanticide, recently Moldbug's proposal of abolishing civil rights for the greater good) very personally and generates all kinds of strong emotions about them - yet it has kept me from doing anything ugly so far.

(The most illegal thing I've done in my life during the moments when I 'let myself go' was some petty and outwardly irrational shoplifting in my teenage years; reflecting back upon that, I did it not solely to get an adrenaline rush but also to push my internal equilibrium into a place where this "superego" thing would receive an alarm and come back online.)

What if this internal safety net of mine is founded solely upon #2 and #3?

(* As I've mentioned in some personal anecdotes - and hell, I don't wish to drone on and on about this, just feeling it's relevant - this part of me had been either very weak or dormant until I watched Evangelion when I was 18. The weird, lingering cathartic sensation and the feeling of psychological change, which felt a little like growing up several years in a week, was the most interesting direct experience in my life so far. However, I've mostly been flinching from consciously trying to push myself towards the admirable ethics of interpersonal relations that I view as the director's key teaching. It's painful enough when it's happening without conscious effort on your part!)

Comment author: TheOtherDave 24 January 2012 06:05:18PM 2 points [-]

Do you have any particular reason for expecting it to be?

Or is this a more general "what if"? For example, if you contemplate moving to a foreign country, do you ask yourself what if your internal safety net is founded solely on living in the country you live in now?

Comment author: JoachimSchipper 25 January 2012 12:38:08PM *  2 points [-]

I'm not Multiheaded, but it feels-as-if the part of the brain that does math has no problem at all personally slaughtering a million people if it saves one million and ten (1); the ethical injunction against that, which is useful, feels-as-if it comes from "avoid the unpleasant (c.q. evil) thing". (Weak evidence based on introspection, obviously.)

(1) Killing a million people is really unpleasant, but saving ten people should easily overcome that even if I care more about myself than about others.

Comment author: Multiheaded 26 January 2012 10:57:02PM 0 points [-]

Roughly that; I've thought about it in plenty more detail, but everything beyond this summary feels vague and I'm too lazy currently to make it coherent enough to post.

Comment author: Multiheaded 24 January 2012 06:07:53PM *  0 points [-]

Do you have any particular reason for expecting it to be?

It feels like I do, but it'll take a bit of very thoughtful writing to explicate why. So maybe I'll explain it here later.

Comment author: GabrielDuquette 24 January 2012 06:28:33PM 1 point [-]

push my internal equilibrium into a place where this "superego" thing would receive an alarm and come back online

That sounds familiar. I've probably done the same in dozens of different ways over the course of my life, including shoplifting.

Comment author: AspiringKnitter 24 January 2012 08:33:31PM 1 point [-]

If I understand this correctly, then the model-based system and the model-free system sound like inside and outside views.

Comment author: Manfred 25 January 2012 03:40:52PM 1 point [-]

Although in this case the "outside view" can't learn from anybody else's mistakes, it always has to make them itself.

Comment author: lessdazed 25 January 2012 04:03:32AM 1 point [-]

I agree.

Whoever downvoted this should have said why they disagreed if they did.

Comment author: JoachimSchipper 25 January 2012 12:40:56PM 0 points [-]

My inside view already feels pretty probabilistic, actually. (I suspect LW-reading mathematicians are not a very good model of the average human, though.)

Comment author: [deleted] 05 August 2012 06:21:51AM *  1 point [-]

Suppose it turned out that humans violate the axioms of VNM rationality (and therefore don't act like they have utility functions) because there are three valuation systems in the brain that make conflicting valuations, and

A question, probably silly: Suppose you calculate what a person would do given every possible configuration of sensory inputs, and then construct a utility function that returns one if that thing is done and zero otherwise. Can't we then say that any deterministic action-taking thing acts according to some utility function?

Or, even more trivially, just let the utility be constant. Then any action maximizes utility.

Edit: If you're using utility functions to predict actions, then the constant utility function is like a maximum entropy prior, and the "every possible configuration" thing is like a hypothesis that simply lists all observations without positing some underlying pattern, so it would eventually get killed off by being more complicated than hypotheses that actually "compress" the evidence.
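For concreteness, a toy version of the construction I'm describing (the agent and its inputs below are arbitrary placeholders): record what the agent does on each input, then define U to be 1 exactly on those (input, action) pairs, so that an argmax over U reproduces the agent by construction.

    # Toy version of the post-hoc construction above (arbitrary agent and inputs).

    def some_agent(observation):
        # Any deterministic behaviour at all, however "irrational".
        return "left" if len(observation) % 2 == 0 else "right"

    def make_indicator_utility(agent, observations):
        """Build U(obs, action) that is 1 exactly on the actions the agent actually takes."""
        table = {obs: agent(obs) for obs in observations}
        return lambda obs, action: 1.0 if table[obs] == action else 0.0

    observations = ["aa", "abc", "x", "wxyz"]
    U = make_indicator_utility(some_agent, observations)

    # An argmax over U reproduces the agent exactly -- and predicts nothing new,
    # because U was read off from the very behaviour it "explains".
    for obs in observations:
        assert max(["left", "right"], key=lambda a: U(obs, a)) == some_agent(obs)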

Comment author: RichardKennaway 05 August 2012 11:39:25AM *  1 point [-]

A question, probably silly: Suppose you calculate what a person would do given every possible configuration of sensory inputs, and then construct a utility function that returns one if that thing is done and zero otherwise. Can't we then say that any deterministic action-taking thing acts according to some utility function?

No, although this idea pops up often enough that I have given it a name: the Texas Sharpshooter Utility Function.

There are two things glaringly wrong with it. Firstly, it is not a utility function in the sense of VNM (proof left as an exercise). Secondly, it does not describe how anything works -- it is purely post hoc (hence the name).

Comment author: gaffa 23 January 2012 10:41:57PM *  1 point [-]

As a first reaction (and without having read up on the details), I'm very skeptical. Assuming these three systems are actually in place, I don't see any convincing reason why any one of them should be trusted in isolation. Natural selection has only ever been able to work on their compound output, oblivious to the role played by each one individually and how they interact.

Maybe the "smart" system has been trained to assign some particular outcome a value of 5 utilons, whereas we would all agree that it's surely and under all circumstances worth more than 20, because as it happens throughout evolution one of the other "dumb" systems has always kicked in and provided the equivalent of at least 15 utilons. If you then extract the first system bare and naked, it might deliver some awful outputs.

Comment author: mfb 05 February 2012 06:14:40PM 1 point [-]

As I understand it, the first system should be able to predict the result of the other two - if the brain knows a bit about how brains work.

While I don't know if the brain really has three different systems, I think that the basic idea is true: The brain has the option to rely on instincts, on "it worked before", or on "let's make a pro/contra list" - this includes any combination of the concepts.

The "lower" systems evolved before the "higher" ones, therefore I would expect that they can work as a stand-alone system as well (and they do in some animals).

Comment author: endoself 24 January 2012 04:09:52AM 0 points [-]

I'm not familiar with the theory beyond what Luke has posted, but I think only one system is active at a time, so there is no summation occurring. However, we don't yet know what determines which system makes a particular decision or how these systems are implemented, so there definitely could be problems isolating them.

Comment author: Deanushka 06 February 2012 10:31:15PM 0 points [-]

Just some initial thoughts,

I do understand that these statements are broad generalisations of what really does occur, though the premise is that a successful choice would be made by weighting the options provided by the scenarios.

As with genetics and other systems, the beneficial-error scenario (as when a miskeyed note on a keyboard leads to a favourable variation of the sequence) seems excluded from these scenarios.

Improvisation based on self-introduced errors may also be core to these utilities being able to evolve reason.

Model-based system: Figure out what's going on, and what actions maximize returns, and do them.

Model-free system: Do the thingy that worked before again!

Pavlovian system: Avoid the unpleasant thing and go to the pleasant thing. Repeat as necessary.

Comment author: mfb 29 January 2012 05:37:31PM 0 points [-]

I think that you can keep up the utility function a bit longer if you add the costs of thinking to it - the required time and energy, and maybe an aversion to thinking about it. "I could compare these two items in the supermarket for 30 minutes and finally find out which product is better - or I could just ignore the new option and take the same thing as last time". It can be the perfectly rational option to just stick with something which worked before.

It is also rational to decide how much time you invest in deciding something (and if there is a lot of money involved, this is usually done). If the time for a decision is not enough to build and use a model, you fall back on more "primitive" methods. In fact, most everyday decisions have to be made like that. Each second, you have several options available, and no possibility of re-thinking all of them every time.

We need all 3 systems for our lives. The interesting thing is just to decide which system is useful for which decision and how much time it should get. Look at it from a higher perspective, and you can get a well-defined utility function for a brain which has access to these systems to evaluate things.

Comment author: Dmytry 26 January 2012 10:51:27PM *  0 points [-]

Okay, which system decides which way the rat should turn when the rat is navigating a maze? A cat doing actual path-finding across a complex landscape? (Which is surprisingly hard to do if you are coding a cat AI; path-finding is rather 'rational' in the sense that animals don't walk into walls and the like.) A human navigating a maze with a map to get food? A cat doing path-finding while avoiding a place where it had a negative experience ("conditioning")?

It seems to me that those 3 'systems', if there are such 3 systems, aren't interacting in the way that article speaks of.

Comment author: TheOtherDave 23 January 2012 11:40:02PM 0 points [-]

At a glance, it seems that upon reflection I might embrace an extrapolation of the model-based system's preferences as representing "my values," and I would reject the outputs of the model-free and Pavlovian systems as the outputs of dumb systems that evolved for their computational simplicity, and can be seen as ways of trying to approximate the full power of a model-based system responsible for goal-directed behavior.

At a glance, I might be more comfortable embracing an extrapolation of the combination of the model-based system's preferences and the Pavlovian system's preferences.

Admittedly, a first step in extrapolating the Pavlovian system's preferences might be to represent its various targets as goals in a model, thereby leaving the extrapolator with a single system to extrapolate, but given that 99% of the work takes place after this point I'm not sure how much I care. Much more important is to not lose track of that stuff accidentally.

Comment author: timtyler 25 January 2012 01:48:08AM *  -1 points [-]

Suppose it turned out that humans violate the axioms of VNM rationality (and therefore don't act like they have utility functions) because there are three valuation systems in the brain that make conflicting valuations, and all three systems contribute to choice.

Er, I don't think so. To quote from here:

Utility maximisation is a general framework which is powerful enough to model the actions of any computable agent. The actions of any computable agent - including humans - can be expressed using a utility function. This was spelled out by Dewey in a 2011 paper titled: "Learning What to Value" - in his section about "O-Maximisers".

Some argue that humans have no utility function. However, this makes little sense: all computable agents have utility functions. The human utility function may not be easy to write down - but that doesn't mean that it doesn't exist.

Comment author: JoachimSchipper 25 January 2012 12:45:34PM 1 point [-]

Why would this necessarily be true? Somewhere in mind-design-space is a mind (or AI/algorithm) that confidently asserts A > B, B > C and C > A. (I'm not sufficiently versed in the jargon to know whether this mind would be an "agent", though - most minds are not goal-seeking in any real sense of the word.)

Comment author: timtyler 25 January 2012 12:49:51PM *  0 points [-]

That mind would have some associated behaviour and that behaviour could be expressed by a utility function (assuming computability - which follows from the Church–Turing–Deutsch principle).

Navel gazing, rushing around in circles, burning money, whatever - all have corresponding utility functions.

Dewey explains why in more detail - if you are prepared to follow the previously-provided link from here.

Comment author: JoachimSchipper 25 January 2012 01:53:40PM 2 points [-]

I've taken a look at the paper. If "outcomes" are things like "chose A", "chose B" or "chose C", the above mind is simply not an O-maximizer: consider a world with observations "I can choose between A and B/B and C/C and A" (equally likely, independent of any past actions or observations) and actions "take the first offered option" or "take the second offered option" (played for one round, for simplicity, but the argument works fine with multiple rounds); there is no definition of U that yields the described behaviour. (I'm aware that the paper asserts that "any agents [sic] can be written in O-maximizer form", but note that the paper may simply be wrong. It's clearly an unfinished draft, and no argument or proof is given.)

If outcomes are things like "chose A given a choice between A and B", which is not clear to me from the paper, then my mind is indeed an O-maximizer (that is, there is a definition of U such that an O-maximizer produces the same outputs as my mind). However, as I understand it, you have also encoded any cognitive errors in the utility function: if a mind can be Dutch-booked into an undesirable state, the associated O-maximizer will have to act on a U function that values this undesirable state highly if it comes about as a result of being Dutch-booked. (Remember, the O-maximizer maximizes U and behaves like the original mind.) As an additional consideration, most decision/choice theory seems to assume a ranking of outcomes, not (path, outcome) pairs.

Comment author: timtyler 25 January 2012 03:30:01PM *  1 point [-]

I've taken a look at the paper. If "outcomes" are things like "chose A", "chose B" or "chose C", the above mind is simply not an O-maximizer: consider a world with observations "I can choose between A and B/B and C/C and A" (equally likely, independent of any past actions or observations) and actions "take the first offered option" or "take the second offered option" (played for one round, for simplicity, but the argument works fine with multiple rounds); there is no definition of U that yields the described behaviour.

What?!? You haven't clearly specified the behaviour of the machine. If you are invoking an uncomputable random number generator to produce an "equally likely" result then you have an uncomputable agent. However, there's no such thing as an uncomputable random number generator in the real world. So: how is this decision actually being made?

I'm aware that the paper asserts that "any agents [sic] can be written in O-maximizer form", but note that the paper may simply be wrong. It's clearly an unfinished draft, and no argument or proof is given.

It applies to any computable agent. That is any agent - assuming that the Church–Turing–Deutsch principle is true.

The argument given is pretty trivial. If you doubt the result, check it - and you should be able to see if it is correct or not fairly easily.

Comment author: JoachimSchipper 25 January 2012 04:57:55PM *  0 points [-]

The world is as follows: each observation x_i is one of "the mind can choose between A and B", "the mind can choose between B and C" or "the mind can choose between C and A" (conveniently encoded as 1, 2 and 3). Independently of any past observations (x_1 and the like) and actions, each of these three options is equally likely. This fully specifies a possible world, no?

The mind, then, is as follows: if the last observation is 1 ("A and B"), output "A"; if the last observation is 2 ("B and C"), output "B"; if the last observation is 3 ("C and A"), output "C". This fully specifies a possible (deterministic, computable) decision procedure, no? (1)

I argue that there is no assignment to U("A"), U("B") and U("C") that causes an O-maximizer to produce the same output as the algorithm above. Conversely, there are assignments to U("1A"), U("1B"), ..., U("3C") that cause the O-maximizer to output the same decisions as the above algorithm, but then we have encoded our decision algorithm into the U function used by the O-maximizer (which has its own issues, see my previous post.)

(1) Actually, the definition requires the mind to output something before receiving input. That is a technical detail that can be safely ignored; alternatively, just always output "A" before receiving input.
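For what it's worth, a brute-force check of that claim (a minimal sketch; the three value levels are arbitrary placeholders, and only their ordering matters):

    # No state-free utility over {A, B, C} lets a strict argmax reproduce the cycle
    # "A from {A,B}, B from {B,C}, C from {C,A}": that would need U(A) > U(B) > U(C) > U(A).
    from itertools import product

    target = {frozenset("AB"): "A", frozenset("BC"): "B", frozenset("CA"): "C"}

    def matches(u):
        for menu, wanted in target.items():
            options = sorted(menu)
            best = max(u[o] for o in options)
            winners = [o for o in options if u[o] == best]
            if winners != [wanted]:   # the utilities alone must single out the choice
                return False
        return True

    candidates = (dict(zip("ABC", vals)) for vals in product([0, 1, 2], repeat=3))
    print([u for u in candidates if matches(u)])  # prints [] -- no assignment works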

Comment author: timtyler 25 January 2012 06:13:30PM *  2 points [-]

I argue that there is no assignment to U("A"), U("B") and U("C") that causes an O-maximizer to produce the same output as the algorithm above.

...but the domain of a utility function surely includes sensory inputs and remembered past experiences (the state of the agent). You are trying to assign utilities to outputs.

If you try to do that, you can't even encode absolutely elementary preferences with a utility function - such as: I've just eaten a peanut butter sandwich, so I would prefer a jam one next.

If that is the only type of utility function you are considering, it is no surprise that you can't get the theory to work.

Comment author: Manfred 25 January 2012 03:39:54PM 0 points [-]

The point is about how humans make decisions, not about what decisions humans make.

Comment author: timtyler 25 January 2012 06:30:35PM *  0 points [-]

The point is about how humans make decisions, not about what decisions humans make.

Er, what are you talking about? Did you not understand what was wrong with Luke's sentence? Or what are you trying to say?

Comment author: Manfred 25 January 2012 07:39:29PM 4 points [-]

The way I know to assign a utility function to an arbitrary agent is to say "I assign what the agent does utility 1, and everything else utility less than one." Although this "just so" utility function is valid, it doesn't peek inside the skull - it's not useful as a model of humans.

What I meant by "how humans make decisions" is a causal model of human decision-making. The reason I wouldn't call all agents "utility maximizers" is because I want utility maximizers to have a certain causal structure - if you change the probability balance of two options and leave everything else equal, you want it to respond thus. As gwern recently reminded me by linking to that article on Causality, this sort of structure can be tested in experiments.

Comment author: timtyler 25 January 2012 08:40:54PM *  2 points [-]

Although this "just so" utility function is valid, it doesn't peek inside the skull - it's not useful as a model of humans.

It's a model of any computable agent. The point of a utility-based framework capable of modelling any agent is that it allows comparisons between agents of any type. Generality is sometimes a virtue. You can't easily compare the values of different creatures if you can't even model those values in the same framework.

The reason I wouldn't call all agents "utility maximizers" is because I want utility maximizers to have a certain causal structure - if you change the probability balance of two options and leave everything else equal, you want it to respond thus.

Well, you can define your terms however you like - if you explain what you are doing. "Utility" and "maximizer" are ordinary English words, though.

It seems to be impossible to act as though you don't have a utility function (as was originally claimed), though. "Utility function" is a perfectly general concept which can be used to model any agent. There may be slightly more concise methods of modelling some agents - that seems to be roughly the concept that you are looking for.

So: it would be possible to say that an agent acts in a manner such that utility maximisation is not the most parsimonious explanation of its behaviour.

Comment author: Manfred 26 January 2012 01:23:58AM 2 points [-]

Although this "just so" utility function is valid, it doesn't peek inside the skull - it's not useful as a model of humans.

It's a model of any computable agent.

Sorry, replace "model" with "emulation you can use to predict the emulated thing."

There may be slightly more concise methods of modelling some agents - that seems to be roughly the concept that you are looking for.

I'm talking about looking inside someone's head and finding the right algorithms running. Rather than "what utility function fits their actions," I think the point here is "what's in their skull?"

Comment author: timtyler 05 August 2012 12:30:12PM -1 points [-]

I'm talking about looking inside someone's head and finding the right algorithms running. Rather than "what utility function fits their actions," I think the point here is "what's in their skull?"

The point made by the O.P. was:

Suppose it turned out that humans violate the axioms of VNM rationality (and therefore don't act like they have utility functions)

It discussed actions - not brain states. My comments were made in that context.