I am more favorable to the idea of a neural net model in which medical advice can forge a weak connection between the "smoking" pattern and the "cancer" pattern through cognition alone, separate from reinforcement processes but allowing such processes to propagate down it. Not a whole lot of motivational force can travel down such a weak link, blocking it from being effective against a strong desire to keep smoking.
On the other hand, as the experience with smoking bans in recent years has shown, the threat of a modest fine (perhaps also with some shaming involved) is enough to induce smokers to refrain from smoking for long periods of time -- even before they've paid any fines. This is also true for smokers who otherwise swear that they are helpless addicts, unable to quit despite their best efforts. So assuming your hypothesis is true, a strong link between the "smoking" pattern and the "legal penalty"/"social opprobrium" patterns is established very easily. It's an interesting question why doctors' advice fails to have a similar effect, and how much this is due to rational thinking (or plausible rationalizations) involved in the model used in the internal reinforcement there.
As for heroin, obviously it evokes not just the image of a pleasurable high, but also the image of emaciated homeless junkies. The latter you can imagine in all its awfulness even if you've never experienced it, but you can't imagine anything like the feeling of pleasure just from a verbal description. So it makes sense that you might be tempted to become (and remain) a junkie once you've tasted heroin, but as long as you haven't, the negative reinforcement is much stronger.
The latter you can imagine in all its awfulness even if you've never experienced it, but you can't imagine anything like the feeling of pleasure just from a verbal description
This is also the answer to the larger question: we may suppose that a person who clearly visualizes just how bad the cancer would be will be more likely to quit smoking. This seems a fairly testable prediction, since it would imply a greater likelihood of someone quitting smoking if someone they know has suffered lung cancer as a result of their smoking. It would also imply that people instructed to vividly imagine various graphic details of themselves, diseased, every time they want a cigarette, would be more likely to successfully avoid smoking.
The difference between individuals would then quite clearly be attributable to differences in what the person was doing internally when they heard the doctor tell them to quit. Some people imagine things more vividly than others, some have better reference memories to imagine with, some have different prior probabilities for how likely they are to be sick (thus influencing their ability to "see themselves" in that state), etc.
In contrast, the pain of a fine or reproach is quite easy to imagine and is so closely associated with bans of anything that it isn't even necessary to intentionally imagine it in order for it to work as a reinforcer.
It also doesn't explain why we're so bad at being motivated by this sort of reinforcement:
The explanation is that people who are bad at it don't run around in their mental playground enough.
Phobias, for example, are actually pretty easy to fix. The person with a phobia may know that "baked beans are harmless", but that alone doesn't stop him from losing his shit when he sees a can of baked beans. You can show him how to imagine being comfortable playing with baked beans in a few minutes, and the phobia is gone.
for example, since I know that heroin is really really enjoyable, why can't I become addicted to heroin just by thinking about it?
If you really wanted to, you could. You don't though, so you won't let yourself go down that path.
Phobias [...] are actually pretty easy to fix.
[...] in a few minutes, and the phobia is gone.
I'm interested. I am afraid of a specific thing (which I don't mention because I have a weird and almost certainly baseless fear someone might use it against me), and though I don't encounter it often (and am not afraid I might encounter it when it's unlikely I will, so it's not a severe phobia) I'd rather not be so afraid. I can easily imagine happily playing with it, but it doesn't help in real life. What am I doing wrong?
If you really wanted to, you could.
I don't think that's true. Not sure how to test it.
I'm interested. I am afraid of a specific thing (which I don't mention because I have a weird and almost certainly baseless fear someone might use it against me), and though I don't encounter it often (and am not afraid I might encounter it when it's unlikely I will, so it's not a severe phobia) I'd rather not be so afraid. I can easily imagine happily playing with it, but it doesn't help in real life. What am I doing wrong?
Maybe your imagination isn't vivid enough in the right way, or maybe you're imagining it with the additional nonverbal thought "Yeah, but that's not real. I couldn't possibly actually play with it without being scared". Do you know how to imagine it and get scared? If you want, we could gchat or skype or something and see if we can fix this.
I don't think that's true. Not sure how to test it.
It was a bit of an overstatement. The actual experience of using heroin will be a stronger motivator than the imagined experience, but the bigger part is that he currently has negative associations with heroin and will act in ways to keep those negative associations. In theory, you could probably pay him enough money and promise him good enough rehab later that he'll want the addiction to get the reward, but it's not a very practical test.
the additional nonverbal thought "Yeah, but that's not real. I couldn't possibly actually play with it without being scared".
Possibly. When I imagine approaching and playing with it in a controlled setting, I imagine myself being nervous, but not scared. I expect desensitization therapy would work, but only if I actually did it, not merely imagined doing it.
Do you know how to imagine it and get scared?
I can imagine, or remember, being scared, but I'm not actually scared when I imagine. (Can people normally make themselves scared by imagining things?)
you could probably pay him enough money and promise him good enough rehab later that he'll want the addiction to get the reward
Yeah, want the addiction. That yields "Aw crap, I had to inject heroin today but I spent all day on Reddit instead. Should set a timer to remind me next time.", which isn't the behavior exhibited by someone currently addicted.
Possibly. When I imagine approaching and playing with it in a controlled setting, I imagine myself being nervous, but not scared. I expect desensitization therapy would work, but only if I actually did it, not merely imagined doing it.
Imagining doing it works for other people. If you can figure out how to imagine like them, it'll work for you too.
(Can people normally make themselves scared by imagining things?)
Yep. Some people get more scared than others. Last time I tried, I got my heart rate to rise 10 bpm, which is significant but much less than terrified.
Yeah, want the addiction. That yields "Aw crap, I had to inject heroin today but I spent all day on Reddit instead. Should set a timer to remind me next time.", which isn't the behavior exhibited by someone currently addicted.
No, that is not what I'm talking about. You don't get paid for injecting, you get paid for credibly convincing people with fMRIs that you crave it. If you have akrasia problems about shooting up heroin it might turn out like you say, but if you think about it in the way that gets your motivational systems going, you'll start to crave it.
(Can people normally make themselves scared by imagining things?)
That's pretty much how severe phobias happen, yeah.
And how come the overwhelming majority of patients don't quit smoking when their doctor tells them to do so, but people often do quit smoking after they've personally experienced the negative consequences (eg had their first heart attack)?
It seems like the obvious answer is "because the experience of abstract words from their doctor isn't vivid enough to trigger the reinforcement machinery, but the experience of having a heart attack is."
The researchers theorize that the structure of these shows often involved a child committing an immoral action, looking cool and strong while doing it, and then finally getting a comeuppance at the end of the show (think Harry Potter, where evil character Draco Malfoy is the coolest and most popular kid in Hogwarts and usually gets away with it, whereas supposedly sympathetic character Ron Weasley is at best a lovable loser who spends most of his time as the butt of Draco's jokes). The theory is that children are just not good enough at the whole feedback of consequences thing to realize that the bully's comeuppance in the end is supposed to be the inevitable result of their evil ways. All they see is someone being a bully and then being treated as obviously popular and high-status.
Alternative explanation: They didn't watch till the end? I find it hard to tell from the paper whether they only included cases where the children watched the whole way through.
Today: some more concepts from reinforcement learning and some discussion of their applicability to human behavior.
For example: most humans do things even when they seem unlikely to result in delicious sugar water. Is this a violation of behaviorist principles?
No. For one thing, yesterday's post included a description of secondary reinforcers, those reinforcers which are not hard-coded evolutionary goods like food and sex, but which nevertheless have a conditioned association with good things. Money is the classic case of a secondary reinforcer among humans. Little colored rectangles are not naturally reinforcing, but from a very young age most humans learn that they can be used to buy pleasant things, like candy or toys or friends. Behaviorist-inspired experiments on humans often use money as a reward, and have yet to run into many experimental subjects whom it fails to motivate.[1]
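To make "conditioned association" concrete, here is a minimal sketch (in Python, with invented names and constants, not anything from the post) of how a neutral token can acquire value simply because it reliably precedes a primary reward, in the style of a temporal-difference update:

    # Minimal sketch: a neutral "token" (think money) becomes reinforcing
    # because it reliably precedes a primary reward (think candy).
    # Names and constants are illustrative only.

    alpha = 0.1          # learning rate
    token_value = 0.0    # learned value of the secondary reinforcer
    candy_reward = 1.0   # primary reward that reliably follows the token

    for trial in range(200):
        # The token's value is nudged toward the reward that follows it.
        prediction_error = candy_reward - token_value
        token_value += alpha * prediction_error

    print(round(token_value, 2))  # approaches 1.0: the token is now reinforcing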
Speaking of friends, status may be a primary reinforcer specific to social animals. I don't know if being able to literally feel reinforcement going on is a real thing, but I maintain I can feel the rush of reward when someone gives me a compliment. If that's too unscientific for you, consider studies in which monkeys will "exchange" sugary juice for the opportunity to look at pictures of high status monkeys, but demand extra juice in exchange for looking at pictures of low status monkeys.
Although certain cynics might consider money and status an exhaustive list, we may also add moral, aesthetic, and value-based considerations. Evolutionary psychology explains why these might exist and Bandura called some of them "internal reinforcement".
But more complicated reinforcers alone are not sufficient to bridge the gap between lever-pushing pigeons and human behavior. Humans have an ability to select for or against behaviors without trying them. For example: most of us would avoid going up to Mr. T and giving him the finger. But most of us have not personally tried this behavior and observed the consequences.
Is this the result of pure reason? No; the rational part of our mind is the part telling us that Mr. T is probably sixty years old by now and far too deep in the media spotlight to want to risk a scandal and jail time by beating up a random stranger. So where exactly is the reluctance coming from?
GENERALIZATION
Roko wrote in his post Ugh Fields that "your brain propagates psychological pain back to the earliest reliable stimulus for the punishment". This deserves more investigation.
Suppose you did go into a bar one night, see Mr. T, give him the finger, and get beaten up. What behavior would you avoid in the future based on this experience? The event itself does not immediately provide enough information to distinguish among "don't go into bars", "don't go out at night", "don't interact with people who have facial hair", and the correct answer "don't offend scary-looking people". This information has to come from your pre-existing model of reality, your brain's evolved background assumptions, and some clever guesswork.
Let's get back to the hilariously unethical experiments. Little Albert was an eight-month-old child who briefly starred in an experiment by behaviorist John Watson. Watson showed him a fuzzy white rat. Albert seemed to like the rat well enough. Once it was confirmed that Albert liked the rat, Watson showed him the rat again, but this time also played a very loud and scary noise; he repeated this intervention until, as expected, Albert was terrified of the white rat.
But it wasn't just fuzzy white rats Albert didn't like. Further investigation determined that Albert was also afraid of brown rabbits (fuzzy animal) and Santa Claus (fuzzy white beard). With his incipient powers of categorization, he had learned to associate punishment with a broad category of things vaguely clustered around fuzzy white rats.
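One way to picture what Albert's brain (and yours, after the hypothetical Mr. T incident) is doing: a single bad event activates several features at once, and the blame gets spread across whatever was active, weighted by how plausible a culprit each feature already seemed. A toy sketch under that assumption, with features and priors invented purely for illustration:

    # Toy sketch of spreading blame from one bad event across co-occurring
    # features, weighted by prior plausibility. Features and priors invented.

    punishment = -1.0

    # Features active when you got beaten up, with prior plausibility
    # (from your existing model of the world) that each causes harm.
    features = {
        "was in a bar":               0.1,
        "was out at night":           0.1,
        "person had facial hair":     0.05,
        "offended scary-looking guy": 0.75,
    }

    total_prior = sum(features.values())
    blame = {f: punishment * p / total_prior for f, p in features.items()}

    for feature, value in sorted(blame.items(), key=lambda kv: kv[1]):
        print(f"{feature:30s} {value:+.2f}")
    # Most of the aversion attaches to "offended scary-looking guy"; a little
    # leaks onto bars, nighttime, and facial hair -- roughly what happened to
    # Little Albert with fuzzy white things.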
B.F. Skinner had an even more interesting experiment that showed what happened when feedback of consequences went wrong. He put pigeons in a box that gave them rewards randomly. The pigeons ended up developing what he called "superstitions"; if a reward arrived by coincidence when a pigeon was tilting its head in a certain direction, the pigeon would continue tilting its head in that direction in the hope of gaining more rewards; when the reward randomly arrived, the pigeon took this as "justification" of its head-tilting and head-tilted even more.[2]
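A small simulation makes the superstition mechanism easy to see: reward arrives at random, but each reward strengthens whatever the pigeon happened to be doing just beforehand, so one arbitrary behavior tends to snowball. The behaviors, rates, and update rule below are invented for illustration:

    import random

    # Sketch of Skinner's "superstitious" pigeons: random reward, credited
    # to whatever behavior preceded it. All numbers are invented.

    random.seed(0)
    weights = {"tilt head left": 1.0, "peck corner": 1.0, "turn in circle": 1.0}
    alpha = 0.5

    def pick(weights):
        # Choose a behavior with probability proportional to its weight.
        total = sum(weights.values())
        r = random.uniform(0, total)
        for behavior, w in weights.items():
            r -= w
            if r <= 0:
                return behavior
        return behavior

    for step in range(500):
        behavior = pick(weights)
        if random.random() < 0.1:          # reward is delivered at random...
            weights[behavior] += alpha     # ...but credited to the last behavior

    print(weights)  # one arbitrary behavior usually ends up far ahead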
This provides one piece of the puzzle in the Mr. T question. None of us have ever given Mr. T the finger before. But we may have offended scary-looking people and had bad things happen to us, which our brains correctly generalize to "don't offend scary-looking people".
SOCIAL LEARNING
Or maybe not. Maybe you've never offended a scary-looking person before. What then?
Social learning theory is often held up as an opponent of behaviorism, but it seems more like a natural extension of it. Humans and animals learn behaviors not just by being rewarded or punished themselves, but also by observing whether a behavior is rewarded or punished in others.
Even if we ourselves have never offended scary-looking people, we have seen other people do so, or heard stories about people doing so, or watched people do so on TV.
At this point I have to mention my favorite social learning story ever, which also illustrates the pitfalls of trying to feed back consequences to their proximal cause. There has been some hand-wringing lately about children's TV shows and whether they lead to developmental problems in children. A study by Ostrov and Gentile cited in NurtureShock found the expected correlation between violent TV shows and physical aggression, but also found an even stronger correlation between educational TV shows and so-called "relational aggression" - things like bullying, name-calling, and deliberate ostracism. The shows most strongly correlated with bad behavior were heart-warming educational programs intended to teach morality. Why?
The researchers theorize that the structure of these shows often involved a child committing an immoral action, looking cool and strong while doing it, and then finally getting a comeuppance at the end of the show (think Harry Potter, where evil character Draco Malfoy is the coolest and most popular kid in Hogwarts and usually gets away with it, whereas supposedly sympathetic character Ron Weasley is at best a lovable loser who spends most of his time as the butt of Draco's jokes). The theory is that children are just not good enough at the whole feedback of consequences thing to realize that the bully's comeuppance in the end is supposed to be the inevitable result of their evil ways. All they see is someone being a bully and then being treated as obviously popular and high-status.
Behavior is selection by consequences, and status is a strong reinforcer. If children see other children behaving as bullies and having high status, then all else being equal, they will be more likely to behave as bullies.
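In reinforcement-learning terms this is vicarious reinforcement: the observer updates the value of a behavior from someone else's outcome, just discounted relative to first-hand experience. A minimal sketch with made-up numbers, in which nine episodes of the bully looking high-status outweigh one final comeuppance:

    # Minimal sketch of vicarious reinforcement: the observer updates the
    # value of a behavior from someone else's observed outcome, discounted
    # relative to lived experience. All numbers are made up for illustration.

    alpha = 0.2               # learning rate
    vicarious_discount = 0.5  # observed outcomes count for less than lived ones
    value_of_bullying = 0.0

    # Episodes of a TV show: (observed behavior, observed status payoff).
    observed = [("bullying", +1.0)] * 9 + [("bullying", -1.0)]  # comeuppance last

    for behavior, payoff in observed:
        error = vicarious_discount * payoff - value_of_bullying
        value_of_bullying += alpha * error

    print(round(value_of_bullying, 2))
    # Positive: nine episodes of the bully looking high-status outweigh one
    # comeuppance at the end, so the observed behavior still looks rewarding.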
These two phenomena - feedback to categories and social learning - go part of the way to explaining the original question of how people have strong preferences for or against behaviors they've never tried before.
INTERNAL REINFORCEMENT
The phrase "internal reinforcement" would make good behaviorists cringe, seeing as it takes a perfectly good predictive model of behavior and tries to pin it on invisible mental phenomena.
But all reinforcement has to be at least a little internal; an animal wouldn't know that eating food was good and eating rocks was bad unless some internal structure knew to reinforce food-eating behavior but not rock-eating behavior. Some reinforcement seems even more internal than that; people may continue an activity solely because it makes them feel good about themselves.
This is not any more mysterious than eating behavior - the drive for food and the drive for status as measured in self-esteem are both perfectly legitimate biological drives, and it's not surprising that we have structures for noticing when we satisfy them and reinforcing the behavior involved - but it sure does sound less scientific.
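If it helps, the same point can be put as a tiny "internal critic": one function from outcomes to reward signals that treats calories and self-esteem as the same kind of currency. The drives and weights below are invented purely for illustration:

    # Sketch of an "internal critic": a single function from outcomes to a
    # reward signal, covering biological drives (food) and social/self-image
    # drives (self-esteem) alike. Drives and weights are invented.

    def internal_reward(outcome):
        drive_weights = {"calories": 0.01, "self_esteem": 1.0, "pain": -1.0}
        return sum(drive_weights.get(k, 0.0) * v for k, v in outcome.items())

    # Eating a sandwich and getting a compliment both come out positive;
    # eating a rock does not -- same machinery, different drives.
    print(internal_reward({"calories": 300}))           # 3.0
    print(internal_reward({"self_esteem": 2}))          # 2.0
    print(internal_reward({"calories": 0, "pain": 5}))  # -5.0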
STILL NOT GOOD ENOUGH
Much to the chagrin of behaviorists, all these mechanisms are still not sufficient to completely explain human behavior. Some cases - for example a patient who quits an enjoyable smoking habit because the doctor says it will cause cancer - may not fit any of these patterns. The patient may not previously have encountered any problems, personally or vicariously, with smoking or anything sufficiently similar to smoking to justify generalization, and positing internal reinforcement just moves the problem to another level.
Daniel Dennett speaks of creatures able to test candidate behaviors in an internal model of the world, so that their hypotheses can die in their stead. On this view, the patient could simply imagine the cancer, let the imagined consequence punish the smoking, and select against the behavior without ever experiencing the punishment first-hand.
There is some evidence for this sort of thing in certain cases: in experiments on fictive reinforcement, people who stayed out of a simulated rising stock market, thus breaking even when they could have won a lot of money, were found on MRI to have a reinforcement signal almost as if they were simulating the case in which they had entered the stock market and been reinforced for doing so.
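The quantity usually computed in such experiments is a fictive error: the difference between the payoff of the action you didn't take and the payoff you actually got. A minimal sketch of that calculation, with numbers invented rather than taken from the actual study:

    # Minimal sketch of a fictive error signal: the gap between the payoff of
    # the action you didn't take and the payoff you actually got. Numbers are
    # invented, not taken from the fictive-reinforcement experiments above.

    market_gain = 0.15        # the simulated market rose 15% this round
    invested_fraction = 0.0   # the subject stayed out entirely

    actual_reward = invested_fraction * market_gain    # 0.0
    best_possible = 1.0 * market_gain                  # if fully invested
    fictive_error = best_possible - actual_reward      # "what I missed"

    print(fictive_error)  # 0.15 -- a learning signal even though nothing was won or lost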
But overall this idea involves too much magic and doesn't correspond to the way we really make decisions, either as perceived intuitively or as detected by most experiments. It also doesn't explain why we're so bad at being motivated by this sort of reinforcement: for example, since I know that heroin is really really enjoyable, why can't I become addicted to heroin just by thinking about it? And how come the overwhelming majority of patients don't quit smoking when their doctor tells them to do so, but people often do quit smoking after they've personally experienced the negative consequences (eg had their first heart attack)?
I am more favorable to the idea of a neural net model in which medical advice can forge a weak connection between the "smoking" pattern and the "cancer" pattern through cognition alone, separate from reinforcement processes but allowing such processes to propagate down it. Not a whole lot of motivational force can travel down such a weak link, blocking it from being effective against a strong desire to keep smoking. But I've got to admit that's a wild guess.
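For whatever the guess is worth, the picture is roughly a very small connection weight between two patterns, so that only a fraction of the aversive signal leaks back to oppose the desire to smoke. A toy rendering of that picture, with weights invented for illustration:

    # Toy rendering of the "weak link" guess: advice creates a small weight
    # from the "smoking" pattern to the "cancer" pattern, so only a fraction
    # of the aversive signal propagates back. All weights are invented.

    cancer_aversion = -10.0          # how bad cancer feels when fully imagined
    advice_link = 0.05               # weak, cognition-built connection
    heart_attack_link = 0.9          # strong, experience-built connection
    desire_to_smoke = +1.0

    def net_motivation(link_strength):
        # Motivation to smoke = raw desire plus whatever aversion leaks back.
        return desire_to_smoke + link_strength * cancer_aversion

    print(net_motivation(advice_link))        # +0.5 -> keeps smoking
    print(net_motivation(heart_attack_link))  # -8.0 -> quits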
The important point, though, is that just as utility theory posits not just utility but expected utility, reinforcement learning posits not just reward but expected reward. Many processes by which we compute expected reward remain vague. Others have been explored in some detail. The next two posts will make up for the vagueness of this one by discussing some properties of the expected reward function.
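For concreteness, the simplest possible expected-reward estimate is just a running average nudged toward each observed reward; most of the later machinery is built on top of something like this. A minimal sketch with illustrative constants:

    import random

    # Simplest expected-reward estimate: a running average nudged toward each
    # observed reward. Constants are illustrative.

    random.seed(1)
    alpha = 0.1
    expected_reward = 0.0

    for t in range(1000):
        reward = random.gauss(2.0, 1.0)   # noisy reward with true mean 2.0
        expected_reward += alpha * (reward - expected_reward)

    print(round(expected_reward, 1))  # settles close to the true mean of 2.0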
FOOTNOTES:
1. Humans are not the only species that can become attracted to secondary reinforcers; monkeys have been successfully trained to use currency.
2. You can see the same effect at work in human athletes. If a certain behavior correlates with a winning streak, they will continue that behavior no matter how unlikely a causal link. But these athletes are curiosities precisely because people are so good at feeding back consequences to the correct stimulus.