I very much like this post, as it advances understanding of our human decision-making, but the conclusion is misleading. Expected utility maximization is a poor tool for describing human behavior, but the motivation for the idea of normative preference doesn't go anywhere. There can only be one way the world (future) actually is, so we must form a specific idea about what to make it, even if human psychology doesn't naturally contain or support such an idea.

[-]Wei Dai14y60

I wonder if Yvain is just making a descriptive, rather than normative, conclusion, i.e., that "preferences" is not a good way to model how humans actually behave.

(If the conclusion is meant to be descriptive, I would reply that once we have powerful tools for self modification, at least some humans will actually self modify into being expected utility maximizers, or whatever the correct decision theory is, so "preferences" will be a good way to model how (some) humans actually behave. And if mind-copying becomes possible and evolution by natural selection continues, there will be strong selection pressure away from reinforcement learning agents, because they do not do well under a mind-copying-enabled environment, compared to, say, UDT agents. Reinforcement learning agents only care about their own future rewards, but evolution favors agents that care about their copy-siblings equally.)

I wish Yvain had telegraphed his overall conclusions for the sequence ahead of time, because that would help to immediately clarify this and any other ambiguities that might arise. If it weren't for your comment, I probably wouldn't have replied to this post, due to the ambiguity. (I could ask for clarification, but that involves delaying the reward of knowing that I've made a substantive and relevant point. Or I could respond to both possible meanings, but I'd have to expend twice the effort for the same reward.)

[-]Vladimir_Nesov14y00

Perhaps Yvain is just making a descriptive, rather than normative, conclusion, i.e., that "preferences" is not a good way to model how humans actually behave.

I expect he indeed means it this way, but it sounds ambiguous, which is what I meant by "misleading".

[-]Wei Dai14y20

In that case, I suggest "ambiguous" or "potentially misleading", since it's not clear that anyone has actually been misled yet. (On the other hand, your own comment is "actually misleading" since it made me think that you thought that Yvain was making a normative point and by "misleading" you were indicating disagreement. :)

[-]timtyler14y40

This deals yet another blow to the concept of me having "preferences".

I think I probably missed the first blows. I find the concept of "preferences" unproblematical. The idea that the magnitude of a preference for eating popcorn is location-dependent seems perfectly ordinary to me. Perhaps work harder on seeking a sympathetic interpretation for "preference" talk.

[-]n00bodyknows7y10

I like that you used 'The Malazan Book of the Fallen' as an example of a truly great novel ;)

[-]Arkanj3l13y10

It's a hyperbolically discounted tradeoff that you make without realizing it, because the cost you're refusing to pay isn't commensurate enough with the payoff you're forgoing to be salient as an explicit tradeoff.

I can't parse this. Can someone explain it to me?

[-]MatthewBaker14y10

Since we know our preferences and utilities are formed from a combination of genetics and experiences, we start to realize that an Alien God had great influence on our genetic "wants" or preferences. Therefore, as we transform mentally using the master lifehack and better understand the way our mind works, we can distinguish which genetic preferences are useful to our sanity and utility, and which ones we should try to life-hack our way out of.

However, in a post-positive-singularity world where someone can manipulate their body and mind as they please, where do we draw the line for sentience? One of my overarching questions has always been: if we do successfully make it to a positive singularity, what differentiates life that we should let grow naturally from life that we shouldn't give the ability to choose? Just because we are one step ahead of our animal brothers on the evolutionary ladder, does that mean we can live our lives according to our every want while they spend time fighting in their natural habitat? Do we help them advance to sentience, and if so, where do we draw the line?

Harry has many conflicts with the line between sub-sentient and sentient life in HPMOR, but everyone, including Hermione, fails to understand his distress at the issue. The only reason my morality allows me to put this issue on the back burner is that right now all we can do is speculate about these issues for mental relaxation and pleasure. Harry had magic which he knew could set right what was once set wrong, and the topic nearly consumed him before he made the decision to let a rationalization stand for the sake of humanity. Once we have a positive singularity and we are done with "crunch time", we can decide what to do with sub-sentient life; hopefully, as we advance intellectually, the solution will become more apparent.

More from Scott Alexander

In Are Wireheads Happy? I discussed the difference between wanting something and liking something. More recently, Luke went deeper into some of the science in his post Not for the Sake of Pleasure Alone.

In the comments of the original post, cousin_it asked a good question: why implement a mind with two forms of motivation? What, exactly, are "wanting" and "liking" in mind design terms?

Tim Tyler and Furcas both gave interesting responses, but I think the problem has a clear answer from a reinforcement learning perspective (warning: formal research on the subject does not take this view and sticks to the "two different systems of different evolutionary design" theory). "Liking" is how positive reinforcement feels from the inside; "wanting" is how the motivation to do something feels from the inside. Things that are positively reinforced generally motivate you to do more of them, so liking and wanting often co-occur. With more knowledge of reinforcement, we can begin to explore why they might differ.

CONTEXT OF REINFORCEMENT

Reinforcement learning doesn't just connect single stimuli to responses. It connects stimuli in a context to responses. Munching popcorn at a movie might be pleasant; munching popcorn at a funeral will get you stern looks at best.

In fact, lots of people eat popcorn at a movie theater and almost nowhere else. Imagine them, walking into that movie theater and thinking "You know, I should have some popcorn now", maybe even having a strong desire for popcorn that overrides the diet they're on - and yet these same people could walk into, I don't know, a used car dealership and that urge would be completely gone.

These people have probably eaten popcorn at a movie theater before and liked it. Instead of generalizing to "eat popcorn", their brain learned the lesson "eat popcorn at movie theaters". Part of this no doubt has to do with the easy availability of popcorn there, but another part probably has to do with context-dependent reinforcement.

I like pizza. When I eat pizza, and get rewarded for eating pizza, it's usually after smelling the pizza first. The smell of pizza becomes a powerful stimulus for the behavior of eating pizza, and I want pizza much more after smelling it, even though how much I like pizza remains constant. I've never had pizza at breakfast, and in fact the context of breakfast is directly competing with my normal stimuli for eating pizza; therefore, no matter how much I like pizza, I have no desire to eat pizza for breakfast. If I did have pizza for breakfast, though, I'd probably like it.
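The context-dependence described above can be sketched as a toy learner that stores motivation per (context, action) pair instead of per action. This is only an illustration under stated assumptions: the class name, the learning rate, and the reward value are hypothetical, and nothing here is meant as a model of real neural reinforcement.

```python
from collections import defaultdict


class ContextualLearner:
    """Toy learner: 'wanting' is a learned value per (context, action) pair."""

    def __init__(self, learning_rate=0.5):
        # learning_rate is an arbitrary illustrative choice.
        self.learning_rate = learning_rate
        self.wanting = defaultdict(float)  # (context, action) -> motivation

    def reinforce(self, context, action, reward):
        """Move the stored value part of the way toward the observed reward."""
        key = (context, action)
        self.wanting[key] += self.learning_rate * (reward - self.wanting[key])

    def want(self, context, action):
        """Return learned motivation; unvisited contexts default to zero."""
        return self.wanting.get((context, action), 0.0)


# Assumption: "liking" is the same everywhere (popcorn tastes equally good).
LIKING_POPCORN = 1.0

learner = ContextualLearner()
for _ in range(10):
    # Popcorn has only ever been rewarded inside a movie theater.
    learner.reinforce("movie theater", "eat popcorn", LIKING_POPCORN)

# Wanting is close to 1 at the theater and zero at the dealership,
# even though liking is identical in both places.
print(learner.want("movie theater", "eat popcorn"))
print(learner.want("used car dealership", "eat popcorn"))
```

Because the value is keyed on the context as well as the action, the reward never generalizes to "eat popcorn" in the abstract, which is the lesson the brain in the popcorn example seems to have learned.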

INTERMITTENT REINFORCEMENT

If an activity is intermittently reinforced, with occasional rewards spread among more common neutral stimuli or even small punishments, it may be motivating but unpleasant.

Imagine a beginning golfer. He gets bogeys or double bogeys on each hole, and is constantly kicking himself, thinking that if only he'd used one club instead of the other, he might have gotten that one. After each game, he can't believe that after all his practice, he's still this bad. But every so often, he does get a par or a birdie, and thinks he's finally got the hang of things, right until he fails to repeat it on the next hole, or the hole after that.

This is a variable-ratio schedule, Skinner's most addictive form of delivering reinforcement. The golfer may keep playing, maybe because he constantly thinks he's on the verge of figuring out how to improve his game, but he might not like it. The same is true for gamblers, who think the next pull of the slot machine might be the jackpot (and who falsely believe they can discover a secret in the game that will change their luck); they don't like sitting around losing money, but they may stick with it so that they don't leave right before they reach the point where their luck changes.

SMALL-SCALE DISCOUNT RATES

Even if we like something, we may not want to do it because it involves pain at the second or sub-second level.

Eliezer discusses the choice between reading a mediocre book and a good book:

You may read a mediocre book for an hour, instead of a good book, because if you first spent a few minutes to search your library to obtain a better book, that would be an immediate cost - not that searching your library is all that unpleasant, but you'd have to pay an immediate activation cost to do that instead of taking the path of least resistance and grabbing the first thing in front of you. It's a hyperbolically discounted tradeoff that you make without realizing it, because the cost you're refusing to pay isn't commensurate enough with the payoff you're forgoing to be salient as an explicit tradeoff.

In this case, you like the good book, but you want to keep reading the mediocre book. If it's cheating to start our hypothetical subject off reading the mediocre book, consider the difference between a book of one-liner jokes and a really great novel. The book of one-liners you can open to a random page and start being immediately amused (reinforced). The great novel you've got to pick up, get into, develop sympathies for the characters, figure out what the heck lomillialor or a Tiste Andii is, and then a few pages in you're thinking "This is a pretty good book". You may realize you'll like the novel, yet the dread of those first few pages can leave you still wanting to read the joke book. And since hyperbolic discounting overcounts reward or punishment in the next few seconds, it may seem like a net punishment to make the change.
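To make the joke book versus novel tradeoff concrete, here is a minimal sketch using the standard one-parameter hyperbolic form, value = amount / (1 + k * delay). The discount rate k, the reward sizes, and the delays are all made-up numbers chosen for illustration, not measurements.

```python
def hyperbolic_value(amount, delay_seconds, k=0.1):
    """Present value of a reward (positive) or punishment (negative).

    k is a hypothetical discount rate per second, picked for illustration.
    """
    return amount / (1.0 + k * delay_seconds)


def present_value(events, k=0.1):
    """Sum the discounted values of (amount, delay_seconds) pairs."""
    return sum(hyperbolic_value(amount, delay, k) for amount, delay in events)


# Hypothetical payoffs: (amount, delay in seconds).
joke_book = [(1.0, 0.0)]  # small amusement, available immediately
great_novel = [
    (-2.0, 0.0),    # effort of getting into the book, paid right now
    (10.0, 600.0),  # large enjoyment, but only about ten minutes in
]

# Undiscounted totals stand in for "liking": the novel is far better.
print(sum(amount for amount, _ in joke_book))
print(sum(amount for amount, _ in great_novel))

# Discounted values stand in for "wanting": the novel looks like a net
# punishment, because the immediate cost is barely discounted while the
# delayed reward is shrunk by a factor of 61.
print(present_value(joke_book))
print(present_value(great_novel))
```

With these assumed numbers the ordering reverses: the novel wins on total reward but loses once everything is discounted from the present moment, which is the gap between liking and wanting that this section describes.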

SUMMARY

This deals yet another blow to the concept of me having "preferences". How much do I want popcorn? That depends very much on whether I'm at a movie theater or a used car dealership. If I browse Reddit for half an hour because it would be too much work to spend ten seconds traveling to the living room to pick up the book I'm really enjoying, do I "prefer" browsing to reading? Which has higher utility? If I hate every second I'm at the slot machines, but I keep at them anyway so I don't miss the jackpot, am I a gambling addict, or just a person who enjoys winning jackpots and is willing to do what it takes?

In cases like these, the language of preference and utility is not very useful. My anticipation of reward is constraining my behavior, and different factors are promoting different behaviors in an unstable way, but trying to extract "preferences" from the situation oversimplifies it.