The Power of Reinforcement

96 Post author: lukeprog 21 June 2012 01:42PM

Part of the sequence: The Science of Winning at Life

Also see: Basics of Animal Reinforcement, Basics of Human Reinforcement, Physical and Mental Behavior, Wanting vs. Liking Revisited, Approving reinforces low-effort behaviors, Applying Behavioral Psychology on Myself.

 

Story 1:

On Skype with Eliezer, I said: "Eliezer, you've been unusually pleasant these past three weeks. I'm really happy to see that, and moreover, it increases my probability that an Eliezer-led FAI research team will work. What caused this change, do you think?"

Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."

 

Story 2:

I once witnessed a worker who hated keeping a work log because it was only used "against" him. His supervisor would call to say "Why did you spend so much time on that?" or "Why isn't this done yet?" but never "I saw you handled X, great job!" Not surprisingly, he often "forgot" to fill out his work log.

Ever since I got everyone at the Singularity Institute to keep work logs, I've tried to avoid connections between "concerned" feedback and staff work logs, and instead take time to comment positively on things I see in those work logs.

 

Story 3:

Chatting with Eliezer, I said, "Eliezer, I get the sense that I've inadvertently caused you to be slightly averse to talking to me. Maybe because we disagree on so many things, or something?"

Eliezer's reply was: "No, it's much simpler. Our conversations usually run longer than our previously set deadline, so whenever I finish talking with you I feel drained and slightly cranky."

Now I finish our conversations on time.

 

Story 4:

A major Singularity Institute donor recently said to me: "By the way, I decided that every time I donate to the Singularity Institute, I'll set aside an additional 5% for myself to do fun things with, as a motivation to donate."


The power of reinforcement

It's amazing to me how consistently we fail to take advantage of the power of reinforcement.

Maybe it's because behaviorist techniques like reinforcement feel like they don't respect human agency enough. But if you aren't treating humans more like animals than most people are, then you're modeling humans poorly.

You are not an agenty homunculus "corrupted" by heuristics and biases. You just are heuristics and biases. And you respond to reinforcement, because most of your motivation systems still work like the motivation systems of other animals.

 

A quick reminder of what you learned in high school

  • A reinforcer is anything that, when it occurs in conjunction with an act, increases the probability that the act will occur again.
  • A positive reinforcer is something the subject wants, such as food, petting, or praise. Positive reinforcement occurs when a target behavior is followed by something the subject wants, and this increases the probability that the behavior will occur again.
  • A negative reinforcer is something the subject wants to avoid, such as a blow, a frown, or an unpleasant sound. Negative reinforcement occurs when a target behavior is followed by some relief from something the subject doesn't want, and this increases the probability that the behavior will happen again.
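
The definition of a reinforcer above can be made concrete with a toy simulation. This is only a sketch under stated assumptions: the update rule (nudging the probability of an act toward 1 each time the act is followed by a reinforcer), the names `reinforce` and `simulate`, and the `learning_rate` parameter are all hypothetical choices for illustration, not something taken from the studies cited in this post.

```python
import random


def reinforce(p, learning_rate=0.2):
    """Return the new probability of an act after it has been reinforced.

    Assumed toy rule: move p a fixed fraction of the way toward 1.
    """
    return p + learning_rate * (1.0 - p)


def simulate(trials=50, p_start=0.1, learning_rate=0.2, seed=0):
    """Simulate a subject whose act is reinforced every time it occurs.

    Returns the probability of the act before the first trial and after
    each trial.
    """
    rng = random.Random(seed)  # seeded so that runs are repeatable
    p = p_start
    history = [p]
    for _ in range(trials):
        acted = rng.random() < p  # did the subject perform the act?
        if acted:
            # The reinforcer is delivered only in conjunction with the act.
            p = reinforce(p, learning_rate)
        history.append(p)
    return history


if __name__ == "__main__":
    history = simulate()
    print(f"start: {history[0]:.2f}, end: {history[-1]:.2f}")
```

In this sketch the probability never decreases, because nothing models punishment or extinction; it only shows the "increases the probability that the act will occur again" half of the definition.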

 

What works

  1. Small reinforcers are fine, as long as there is a strong correlation between the behavior and the reinforcer (Schneider 1973; Todorov et al. 1984). All else equal, a large reinforcer is more effective than a small one (Christopher 1988; Ludvig et al. 2007; Wolfe 1936), but the more you increase the reinforcer magnitude, the less benefit you get from the increase (Frisch & Dickinson 1990).
  2. The reinforcer should immediately follow the target behavior (Escobar & Bruner 2007; Schlinger & Blakely 1994; Schneider 1990). Pryor (2007) notes that when the reward is food, small bits (like M&Ms) are best because they can be consumed instantly instead of being consumed over an extended period of time.
  3. Any feature of a behavior can be strengthened (e.g., its intensity, frequency, rate, duration, persistence, its shape or form), so long as a reinforcer can be made contingent on that particular feature (Neuringer 2002).
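
Points 1 and 2 can be put into a toy formula to make their shape easier to see. The sketch below is an assumption-laden illustration, not a model from the cited papers: the logarithmic curve for reinforcer size, the hyperbolic discount for delay, and the names `reinforcer_value` and `delay_sensitivity` are all hypothetical. It only encodes the qualitative claims: bigger is better, each increase helps less, and delay weakens the effect.

```python
import math


def reinforcer_value(magnitude, delay_seconds, delay_sensitivity=1.0):
    """Toy score for how strongly one reinforcer strengthens a behavior.

    magnitude: size of the reinforcer in arbitrary units (assumed scale).
    delay_seconds: time between the behavior and the reinforcer.
    delay_sensitivity: assumed constant controlling how fast delay hurts.
    """
    if magnitude < 0 or delay_seconds < 0:
        raise ValueError("magnitude and delay_seconds must be non-negative")
    # Concave in magnitude: each extra unit adds less than the previous one.
    size_effect = math.log1p(magnitude)
    # Equals 1 with no delay and shrinks toward 0 as the delay grows.
    delay_discount = 1.0 / (1.0 + delay_sensitivity * delay_seconds)
    return size_effect * delay_discount


if __name__ == "__main__":
    for magnitude, delay in [(1, 0), (2, 0), (1, 5), (10, 5)]:
        score = reinforcer_value(magnitude, delay)
        print(f"magnitude={magnitude}, delay={delay}s -> {score:.3f}")
```

Under these assumed curves, a small reinforcer delivered immediately can outscore a much larger one delivered a few seconds late, which is the practical upshot of the M&M advice.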

 

Example applications

  • If you want someone to call you, then when they do call, don't nag them about how they never call you. Instead, be engaging and positive.
  • When trying to maintain order in a class, ignore unruly behavior and praise good behavior (Madsen et al. 1968; McNamara 1987).
  • Reward originality to encourage creativity (Pryor et al. 1969; Chambers et al. 1977; Eisenberger & Armeli 1997; Eisenberger & Rhoades 2001).
  • If you want students to understand the material, don't get excited when they guess the teacher's password but instead when they demonstrate a technical understanding.
  • To help someone improve at dance or sport, ignore poor performance but reward good performance immediately, for example by shouting "Good!" (Buzas & Ayllon 1981). The reason to ignore poor performance is that if you say "No, you're doing it wrong!" you are inadvertently punishing the effort. A better response to a mistake would be to reinforce the effort: "Good effort! You're almost there! Try once more."
  • Reward honesty to help people be more honest with you (Lanza et al. 1982).
  • Reward opinion-expressing to get people to express their opinions more often (Verplanck 1955).
  • You may even be able to reinforce-away annoying involuntary behaviors, such as twitches (Laurenti-Lions et al. 1985) or vomiting (Wolf et al. 1965).
  • Want a young infant to learn to speak more quickly? Reinforce their attempts at vocalization (Ramey & Finkelstein 1978).
  • More training should occur via video games like DragonBox, because computer programs can easily provide instant reinforcement many times a minute for very specific behaviors (Fletcher-Flinn & Gravatt 1995).

For additional examples and studies, see The Power of Reinforcement (2004), Don't Shoot the Dog (2006), and Learning and Behavior (2008).

 

I close with Story 5, from Amy Sutherland:

For a book I was writing about a school for exotic animal trainers, I started commuting from Maine to California, where I spent my days watching students do the seemingly impossible: teaching hyenas to pirouette on command, cougars to offer their paws for a nail clipping, and baboons to skateboard.

I listened, rapt, as professional trainers explained how they taught dolphins to flip and elephants to paint. Eventually it hit me that the same techniques might work on that stubborn but lovable species, the American husband.

The central lesson I learned from exotic animal trainers is that I should reward behavior I like and ignore behavior I don't. After all, you don't get a sea lion to balance a ball on the end of its nose by nagging. The same goes for the American husband.

Back in Maine, I began thanking Scott if he threw one dirty shirt into the hamper. If he threw in two, I'd kiss him. Meanwhile, I would step over any soiled clothes on the floor without one sharp word, though I did sometimes kick them under the bed. But as he basked in my appreciation, the piles became smaller.

I was using what trainers call "approximations," rewarding the small steps toward learning a whole new behavior...

Once I started thinking this way, I couldn't stop. At the school in California, I'd be scribbling notes on how to walk an emu or have a wolf accept you as a pack member, but I'd be thinking, "I can't wait to try this on Scott."

...After two years of exotic animal training, my marriage is far smoother, my husband much easier to love.

 

Next post: Rational Romantic Relationships Part 1

Previous post: The Good News of Situationist Psychology

 

 

My thanks to Erica Edelman for doing much of the research for this post.

Comments (467)

Comment author: drethelin 21 June 2012 03:41:33PM 0 points [-]
Comment author: faul_sname 24 June 2012 01:45:58AM 4 points [-]

LWers do many cultish things, but I think it's safe to say that's not one of them.

Comment author: Desrtopa 24 June 2012 01:58:33AM 2 points [-]

LWers do many cultish things

How many?

Comment author: faul_sname 24 June 2012 04:44:27AM 5 points [-]

At least 3:

Specifically: foster a distrust of what outsiders say, quote a lot of stuff by a self-appointed charismatic leader, and emphasize a single solution (rationality) for a large number of problems.

Notable also are the large number of cultish things LWers don't do, such as aggressive recruiting (or really, any recruiting at all).

Comment author: Desrtopa 24 June 2012 05:22:14AM 4 points [-]

quotes a lot of stuff by a self-appointed charismatic leader

I wouldn't exactly call Eliezer a self appointed leader. The community basically accreted around him. If he disavowed being the leader, I think we'd say he was being dishonest or fooling himself.

Not that this is a distinction from cults, the same would probably be true of most of them, I just think it's not quite accurate as a characterization.

Oh, also I think most cult leaders probably have more charisma off the internet.

Comment author: faul_sname 24 June 2012 05:31:52AM 4 points [-]

Oh, probably. I hear Luke has more real-life charisma... Though he kind of kills the "fosters a distrust of outside sources" with the amount he cites outside sources.

Comment author: wedrifid 24 June 2012 06:19:57AM *  9 points [-]

Oh, probably. I hear Luke has more real-life charisma... Though he kind of kills the "fosters a distrust of outside sources" with the amount he cites outside sources.

Quite a lot of charisma, but nothing near the level a cult leader would need to pull off a personality cult. (Although he could probably make up for this if he really wanted to by spending a few weeks reading up research on cult formation then applying it systematically as a 'how to' guide.)

Comment author: Swimmer963 24 June 2012 08:41:09AM 3 points [-]

Quite a lot of charisma, but nothing near the level a cult leader would need to pull off a personality cult. (Although he could probably make up for this if he really wanted to by spending a few weeks reading up research on cult formation then applying it systematically as a 'how to' guide.)

I would like to see Lukeprog post an article on that topic. It would be fascinating.

Comment author: wedrifid 24 June 2012 09:05:49AM 5 points [-]

I would like to see Lukeprog post an article on that topic. It would be fascinating.

Fascinating but suboptimal signalling.

Comment author: wedrifid 22 June 2012 11:59:31AM *  15 points [-]

http://en.wikipedia.org/wiki/Love_bombing

If this genuinely looks like love bombing then it could be an indication that you need more affection in your life to recalibrate the base rate.

Comment author: sketerpot 22 June 2012 12:47:01AM *  4 points [-]

You realize that almost all people express appreciation or displeasure routinely, right? It's a normal and reasonable part of human interaction, and it's a skill that someone can try to improve without needing to feel too conflicted. Love bombing is far more extreme than anything that this post even touched on. So, while we're linking to things, here's one:

http://lesswrong.com/lw/md/cultish_countercultishness/

Comment author: Viliam_Bur 22 June 2012 10:33:55AM 5 points [-]

Love bombing is just a tool -- its morality depends on how it is used. In a typical situation it is used to ruin the person's natural resistance towards groups that exploit them; that is obviously evil.

A different thing would be to use love bombing with the person's explicit consent, as a reinforcement for things the person values, and for nothing else. Preferably for a limited time specified in advance. It could be a great tool to overcome akrasia.

Comment author: MarkusRamikin 22 June 2012 11:01:31AM 4 points [-]

love bombing with the person's explicit consent

That sounds even more creepy. I like it.

Comment author: roland 22 June 2012 10:58:11PM *  -2 points [-]

Edit: relevant quotes from the post:

When trying to maintain order in a class, ignore unruly behavior and praise good behavior (Madsen et al. 1968; McNamara 1987).

To help someone improve at dance or sport, ignore poor performance but reward good performance immediately, for example by shouting "Good!" (Buzas & Ayllon 1981). The reason to ignore poor performance is that if you say "No, you're doing it wrong!" you are inadvertently punishing the effort. A better response to a mistake would be to reinforce the effort: "Good effort! You're almost there! Try once more."

Reward opinion-expressing to get people to express their opinions more often

Now that we all know this, shouldn't we abolish downvotes? From my personal experience the emotional impact of a downvote is extremely frustrating and not helpful at all. The message I get from a downvote is "You are wrong!" or "What you said doesn't agree with the group consensus so we will punish you for it!". I don't see this as constructive in any sense.

Comment author: RichardKennaway 23 June 2012 07:48:41AM *  1 point [-]

The message I get from a downvote is "You are wrong!" or "What you said doesn't agree with the group consensus so we will punish you for it!".

The message I get from a downvote is "Someone did not like this." Obviously, that person is wrong. :-)

ETA: -2! Two people did not like this! I die. My brain turns into maggots which burst from my skull and multiply until they devour the world. All die. O the embarrassment.

Comment author: Jonathan_Graehl 22 June 2012 11:17:27PM 0 points [-]

I think downvotes are generally useful to other readers (though it's odd that the parent suggestion has one as I type), but I agree that people should be protected from the discouraging effect of an early, single downvote. So, why not postpone displaying the negative score to the user for long enough for possible upvotes to counter? (I don't volunteer to implement this).

Comment author: TheOtherDave 23 June 2012 01:43:11AM 4 points [-]

Be aware that some people upvote comments "back to zero" that they wouldn't otherwise upvote. (Some other people consider this bad practice.)

Comment author: TimS 22 June 2012 11:51:46PM 6 points [-]

The fact that reinforcement can be very effective in changing frequency of behavior doesn't say that punishment should never be used to change the frequency of behavior.

Reinforcement is useful for increasing frequency of behavior. When decreased frequency of behavior is desired, punishment is the type of intervention to use. (For applied behavior analysis, those are the definitions of reinforcement and punishment).

Comment author: Jonathan_Graehl 22 June 2012 11:56:21PM 2 points [-]

Sure. Although I wasn't clear about this, I had in mind the common case of a non-punishing downvoter who simply disagrees with the comment (or wants to see less of its ilk) without saying why. In case punishment is the desired effect, you're right - immediate is better.

Comment author: CharlieSheen 21 June 2012 02:36:32PM 2 points [-]

We have enough happy death spirals here.

Comment author: Eliezer_Yudkowsky 21 June 2012 03:02:03PM -1 points [-]

Whatever it is that rationalists are supposed to use instead of death spirals, we don't have enough of it until everything is funded. GO TEAM HAPPINESS!

Comment author: CharlieSheen 21 June 2012 03:14:02PM 5 points [-]

No.

Comment author: Strange7 21 June 2012 04:06:16PM 2 points [-]

How long has it been since you had a post that stabilized at net negative votes?

Comment author: gwern 21 June 2012 04:11:36PM 15 points [-]

'My Little SIAI: Positive Reinforcement is Magic'?

Comment author: wedrifid 21 June 2012 03:22:19PM 2 points [-]

We have enough happy death spirals here.

Who is happy about what?

Comment author: CharlieSheen 21 June 2012 03:28:22PM -2 points [-]

Let sleeping mind killers lie.

Comment author: wedrifid 21 June 2012 03:35:46PM *  7 points [-]

Your unsubstantiated assertion is rejected. There is nothing that fits that label here. There are things that people like to say that everyone else is in a happy death spiral about but they are too powerfully skeptical to be one of the gullible crowd. This is useless cheap signalling that is a net detriment.

-3 M&Ms for all instances of vague self-reinforcing negativity.

Comment author: [deleted] 21 June 2012 03:53:32PM 1 point [-]

Does he have to vomit the M&M's back up?

I really hope that's not the procedure.

Comment author: CharlieSheen 21 June 2012 03:40:38PM *  4 points [-]

Very well I'll be explicit, I simply wanted to avoid a flame war. Most obvious example:

  • Relationship advice.

Now give me my M&Ms back.

Comment author: wedrifid 21 June 2012 03:54:04PM 2 points [-]

Very well I'll be explicit, I simply wanted to avoid a flame war. Most obvious example:

Relationship advice.

That isn't a Happy Death Spiral. It is a disgraceful mindkiller, sure. But it isn't remotely happy, isn't encouraged by universal reward and absence of criticism. It certainly isn't treated with or caused by the kind of positive feedback Luke's post advocates.

Now give me my M&Ms back.

You can have one back - but being fundamentally confused about what it is you are trying to criticize is only a weak mitigating factor.

Comment author: CharlieSheen 21 June 2012 03:59:08PM *  1 point [-]

That isn't a Happy Death Spiral. It is a disgraceful mindkiller, sure. But it isn't remotely happy, isn't encouraged by universal reward and absence of criticism. It certainly isn't treated with or caused by the kind of positive feedback Luke's post advocates.

Do you remember the online dating profile optimization thread? LessWrong went in Vladimir_M's words "healing crystal equivalent". That thread was a happy death spiral.

Also if you recall the critics in the relationship threads are getting tired and frustrated and just aren't showing up any more, someone even wrote out a full comment to that effect! Evaporative cooling dude. Sure we haven't had a relationship thread since Luke's part I., but it's only a matter of time before someone brings it up and the critics won't be there any more.

I only bother because I'm a Charlie Sheen.

Comment author: [deleted] 21 June 2012 04:12:59PM 0 points [-]

Maybe it's because behaviorist techniques like reinforcement feel like they don't respect human agency enough. But if you aren't treating humans more like animals than most people are, then you're modeling humans poorly.

But treating human beings, especially adults, like animals is characteristically unethical. Applying some system of reinforcement where someone has asked you to effectively treat their behavior is innocuous enough, as is of course treating yourself.

But generally manipulating the behavior of other people by means other than convincing them that they should behave in a certain way seems to me to be almost definitional of a dark art. If that's not controversial, then I think this article should be qualified appropriately: never do this to other people without their explicit consent.

Comment author: TheOtherDave 21 June 2012 05:00:27PM 8 points [-]

But treating human beings, especially adults, like animals is characteristically unethical.

This statement without context is clearly incorrect; there are all sorts of behaviors we can ethically execute with respect to both humans and other animals. I understand that what you and the OP both mean to connote is particular behaviors which we restrict in typical contexts only to non-human animals, but if you're going to label them as unethical when applied to humans it helps to specify what behaviors and context those are.

manipulating the behavior of other people by means other than convincing them that they should behave in a certain way seems to me to be almost definitional of a dark art.

That's a little more specific, but not too much, as I'm not really sure what you mean by "convincing" here.

That is, if at time T1 I don't exhibit behavior B and don't assert that I should exhibit B, and you perform some act A at T2 after which I exhibit B and assert that I should exhibit B, is A an act of convincing me (and therefore OK on your account) or not (and therefore unethical on your account)? How might I test that?

never do this to other people without their explicit consent

This, on the other hand, is clear. Thank you.
I disagree with it strongly.

Comment author: Vaniver 21 June 2012 06:20:18PM 9 points [-]

But treating human beings, especially adults, like animals is characteristically unethical.

It seems to me like the flow is in the reverse direction: many unethical manipulations involve treating adults like animals. But people who skillfully use positive reinforcement are both more pleasant to be around and more effective- which seems like something ethical systems should point you towards, not away from.

Comment author: adamtpack 23 June 2012 02:13:10AM 1 point [-]

.... And here begins the debate.

What do we do? What do we think about this piece of freaking powerful magic-science?

I vote we keep it a secret. Some secrets are too dangerous and powerful to be shared.

Comment author: beoShaffer 23 June 2012 02:30:49AM 4 points [-]

I think the cat is out of the bag on this one.

Comment author: [deleted] 21 June 2012 06:28:39PM 2 points [-]

That's a fair point: I may have been treating a conditional like a bi-conditional. I think my sense of the matter is this: if a friend told me that he spent a lot of our time together thinking through ways to positively reinforce some of my behaviors, even to my benefit, I would become very suspicious of him. I would feel that I'd been treated as a child or a dog. His behavior would seem to me to be manipulative and dishonest, and I think I would feel this way even if I agreed that the results of his actions were on the whole good and good for me.

Do you think this sort of reaction on my part would be misguided? Or am I on to something?

Comment author: [deleted] 21 June 2012 07:36:11PM 3 points [-]

I think it's misguided personally. You're already being manipulated this way by your environment whether or not you realize it.

Comment author: Gastogh 21 June 2012 10:26:57AM 1 point [-]

On Skype with Eliezer, I said: "Eliezer, you've been unusually pleasant these past three weeks. I'm really happy to see that, and moreover, it increases my probability that an Eliezer-led FAI research team will work. What caused this change, do you think?"

Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."

Made me smile. Thanks for sharing.

Comment author: Viliam_Bur 21 June 2012 10:59:22AM 7 points [-]

Hopefully now that the experiment is over, they will return to the original schedule of giving M&Ms for new HPMoR chapters. Seriously, people are suffering here. :D

Comment author: hvass 21 June 2012 04:22:17AM 1 point [-]

Thanks, Luke! I've always enjoyed this sequence. (It's funny that I was tempted to include a note that I would've been happier if you contributed to the sequence more often, but let's stick with the praise for now. :-)

Comment author: philh 21 June 2012 10:12:05AM 0 points [-]

I think next time I go shopping, I'll buy a pack of M&Ms, and take one whenever I make a git commit.

Comment author: hrishimittal 22 June 2012 07:58:22AM 1 point [-]

What expert timing, Luke! Just two days ago, I came across the fascinating practice of clicker training for horses - http://www.theclickercenter.com, while reading Kathy Sierra's old blog - http://headrush.typepad.com/creating_passionate_users/2006/03/clicker_trained.html.

My only problem is that I need to train my own behaviour rather than someone else's. I'm going to try to use these techniques on myself, although I'm not sure if that's supposed to work.

Comment author: [deleted] 21 June 2012 01:13:19AM 2 points [-]

Excellent article. I wonder if reinforcement could be used to speed up rationality training? I would love to see a study done on that.

Comment author: wedrifid 21 June 2012 04:50:22PM *  7 points [-]

The central lesson I learned from exotic animal trainers is that I should reward behavior I like and ignore behavior I don't. After all, you don't get a sea lion to balance a ball on the end of its nose by nagging. The same goes for the American husband.

Back in Maine, I began thanking Scott if he threw one dirty shirt into the hamper. If he threw in two, I'd kiss him. Meanwhile, I would step over any soiled clothes on the floor without one sharp word, though I did sometimes kick them under the bed. But as he basked in my appreciation, the piles became smaller.

My wife, if pulling that kind of stunt, would quickly find that her affections were shunned and her thanks were met with clear contempt (after she was asked politely not to do that the first time). It is almost certainly not in her interests to produce a pavlovian association between her affections and attempts to control me against my wishes. My aversion to hostile takeover of internal motivations is much stronger than my desire for the affections of any particular individual.

This would be entirely different if I had made a prior agreement regarding shirts and hampers. Making it motivationally easier and more enjoyable to do things I am willing to do is to be encouraged.

Comment author: Swimmer963 22 June 2012 08:16:02PM 1 point [-]

What would you see as the difference between a) the story described, and b) a wife who kisses her husband because it makes her happy when he does helpful, nice things, of which putting laundry in the hamper is one, and her automatic response to this surge of happiness is "thank you, you're an amazing man!" [kiss]? The latter includes most of the same actions on the part of the wife, and probably occurs in a lot of healthy relationships.

My aversion to hostile takeover of internal motivations is much stronger than my desire for the affections of any particular individual.

Are there some internal motivations that you are less protective of than others? For example, if someone tried to condition me to be less averse to harming people, I would have a pretty big reaction, because that particular internal motivation is sacrosanct to me. But preferences for levels of tidiness...meh. I barely consider that an internal motivation, and definitely not a facet of who I am...it's just a habit, and I don't really care about changing it in either direction.

Is the difference with you that you consider all of your motivations to be a sacrosanct part of who you are? Or just that you place a higher value on your autonomy, and being the one 100% entirely responsible for all of your decisions?

Comment author: TheOtherDave 22 June 2012 09:19:22PM 7 points [-]

It may be worth sharing, anecdotally, that years ago my husband expressed annoyance with me over the fact that I only ever rubbed his back while he was doing dishes, and it made him feel much like how wedrifid describes.

This utterly bewildered me, so I agreed to pay attention to the behavior and see what was going on. Pretty quickly it became clear to me that this was absolutely true, for reasons I wasn't entirely clear on myself, although my working theory was it was the only time that I'd regularly walk past him while he was hunched over in that particular posture, which apparently served as a "give me a backrub" signal for me, for whatever reason.

My response to this was to start giving him random backrubs at other times, which solved the problem.

My point being that (a) being annoyed by this sort of behavior is not at all unique to wedrifid, and (b) whether the behavior pattern is intentional doesn't necessarily matter very much. (I don't mean to suggest that it doesn't matter to wedrifid; actually, they have made it somewhat clear that it's part of what they're objecting to.)

Comment author: wedrifid 23 June 2012 02:45:41AM 1 point [-]

Pretty quickly it became clear to me that this was absolutely true, for reasons I wasn't entirely clear on myself,

Well, the whole thing where he is standing up against the sink with his back to you but his hands were busy and he couldn't turn around (to engage in other forms of affection) seems like the obvious guess.

Comment author: Swimmer963 22 June 2012 09:47:16PM 3 points [-]

The main lesson I'm taking from your anecdote is "people are complicated, everyone is complicated in a different way, and for almost any action or behaviour X, there will be a person somewhere who finds it awful." It's hard to guess at the relative numbers without doing a poll, but I'm guessing there's a range of people who wouldn't care if their significant other used physical affection as a reward (or who would even like it, because "yay, more total physical affection!"), and there's a range of people who would find it mildly to extremely unpleasant.

Comment author: TheOtherDave 22 June 2012 09:51:28PM 2 points [-]

I'm guessing there's a range of people who wouldn't care if their significant other used physical affection as a reward (or who would even like it, because "yay, more total physical affection!"), and there's a range of people who would find it mildly to extremely unpleasant.

Yup, that's consistent with my experience.

Comment author: pjeby 21 June 2012 05:27:30PM 6 points [-]

My wife, if pulling that kind of stunt, would quickly find that her affections were shunned and her thanks were met with clear contempt

Seriously? You'd shun your wife because she said thank you? i.e.

I began thanking Scott if he threw one dirty shirt into the hamper

Comment author: wedrifid 21 June 2012 06:21:11PM 7 points [-]

Seriously? You'd shun your wife because she said thank you?

(No, I said I would shun kisses delivered under those circumstances. No cutting and pasting of my keywords for the sake of hyperbole thanks.)

If people use their affection in a way that is obviously intended to systematically manipulate me to do things that I do not, in fact, wish to do then yes, of course those instances of affection I will shun. While I know some people are more tolerant to that kind of blatant disrespect I would expect you to at least be able to comprehend the subset of people that will not.

I'm afraid that all women who want kisses to serve the role of doggy treats within our relationship are out of luck. I have yet to experience a problem with having that policy. My model of myself predicts that rewarding hostile-to-my-interests-reward-training with increased compliance or acceptance would leave me with relationships that were far less satisfying and in particular far less enjoyment of displays of affection.

Comment author: pjeby 21 June 2012 09:04:15PM 5 points [-]

If people use their affection in a way that is obviously intended to systematically manipulate me to do things that I do not, in fact, wish to do then yes, of course those instances of affection I will shun.

Since positive reinforcement can only be applied after you already do a thing, then presumably, you at least wished to do it once. So, how is providing you with a bonus to something you've already done, manipulating you to do something you don't "wish to do"?

Comment author: wedrifid 22 June 2012 03:04:27AM 2 points [-]

Caveat: I don't know why the husband in question doesn't just put his damn clothes in the hamper. Doesn't the idea of having soiled clothes lying around repulse him anyway? Especially when sharing the space with another. I mean... ewww. But now back to assuming the target behavioral territory is not already granted by the obvious Schelling point or prior arrangement.

So, how is providing you with a bonus to something you've already done, manipulating you to do something you don't "wish to do"?

It seems you wish to unilaterally accept rewarding behavior as positive. I don't. I have no trouble detecting when rewards are being used as "approximations" towards a behavioral landscape that I clearly don't want or, especially, have previously declared that I would not accept. I am also able to predict - by reference to past experience and knowledge of my own preferences - that encouraging that reward pattern gives undesired outcomes. As Vaniver mentioned, an important skill to develop is the ability to detect the difference between desired and undesired manipulations.

As a somewhat separate issue, excessive use of physical affection (kisses, hugs, sex) as a "reward" for good behavior changes the experience of those activities - and not in a good way.

Comment author: pjeby 22 June 2012 08:34:10PM 3 points [-]

Hm. You quoted a question I asked, and then proceeded to not answer it in any way. The question was:

How is providing you with a bonus to something you've already done, manipulating you to do something you don't "wish to do"?

Instead of answering that question, you supplied various generalizations whose referents in physical reality I can't ascertain. Please give an example of a situation where somebody being, say, happy that you did something, means that they are manipulating you to do something you don't "wish to do" (your previous words).

Comment author: TheOtherDave 22 June 2012 09:13:15PM 4 points [-]

Well, I'm not wedrifid, but OK.

Suppose there's a crisis at work, and in response to that crisis I step in and solve a problem.
Suppose, as part of solving that problem, I take some steps (X) that I don't enjoy doing and don't wish to do again.
Suppose my boss notices that I did X and was effective at it and decides that she wants me to do X more regularly, and being familiar with the uses of positive reinforcement decides to hand me a large bonus at our next status meeting. Further, she praises me to the skies in public for having done X, and does so in a way that communicates the (entirely accurate) message that my continuing to receive such praise is contingent on my continuing to do X.

I assert that, in this scenario, my boss is applying positive reinforcement techniques with the goal of increasing my likelihood of doing X, by providing me with a bonus to something I've already done, where X is something I don't wish to do.

Do you agree?

As to whether, in so doing, she's manipulating me... (shrug) I've already had that discussion once too often this week. If our only remaining point of disagreement about that scenario is whether the word "manipulating" properly applies to it, I'm happy to leave that point unresolved.

Comment author: pjeby 23 June 2012 01:25:41AM 0 points [-]

I assert that, in this scenario, my boss is applying positive reinforcement techniques with the goal of increasing my likelihood of doing X, by providing me with a bonus to something I've already done, where X is something I don't wish to do.

So? Are you saying this is a bad thing? That's what I'm asking wedrifid. Are you offended by said boss doing this?

Ironically, in your scenario, your boss is actually elevating your status: trying to please you in order to obtain a consent that in principle could be had by simply ordering you to do more X. So I don't think it's analogous to the situation that upsets wedrifid here.

Comment author: TheOtherDave 23 June 2012 01:37:05AM 4 points [-]

So?

So, you asked for "an example of a situation where somebody being, say, happy that you did something, means that they are manipulating you to do something you don't "wish to do"," and I gave you one.

Apparently, you also wanted an example where the person isn't also elevating my status in the process, isn't trying to please me, and isn't trying to get me to agree to something that they could order me to do. I didn't realize that, sorry.

No, I can't think of any coherent examples where someone tries to use positive reinforcement to alter my behavior by doing something that doesn't please me.

Tapping out now.

Comment author: wedrifid 23 June 2012 02:32:50AM 1 point [-]

Tapping out now.

As am I. I refer any interested observers to the previous comments by myself, TheOtherDave, Vaniver and others, as well as the details of the originally quoted example, including the emphasis on successive approximation. I expect that everyone who wishes to understand will from existing comments and that further engagement would be both futile and constitute a reward of an interaction style which is undesirable.

Comment author: NancyLebovitz 24 June 2012 03:54:53AM 1 point [-]

It depends on why TheOtherDave doesn't like doing whatever. If it's something that he could get to like or at least tolerate by being more familiar with it, no biggie.

If it's just aggravating and he doesn't get used to it, but it doesn't come up often enough to make him miserable, then it's one of those things which is apt to happen in jobs.

If it's something that takes so many additional hours that he's running himself ragged, then reinforcing him for doing it would be bad for him in the long run.

Comment author: handoflixue 22 June 2012 07:45:02PM 4 points [-]

excessive use of physical affection (kisses, hugs, sex) as a "reward" for good behavior changes the experience of those activities - and not in a good way.

Could you elaborate on that? I'm entirely okay with physical affection being used as a "reward", as long as it's also clear that the person genuinely wants affection with me, and initiates it "just because" too (actually I'd probably be entirely okay with a strictly reward-based system of affection, as long as it was explicit...)

I have no trouble detecting when rewards are being used as "approximations" towards a behavioral landscape that I clearly don't want

You seem to be assuming, in the example, that the husband doesn't WANT to be modified to put away his laundry. Is that correct?

If so, is it correct that your objection is "you're manipulating me into a state I don't desire" rather than simply "you're manipulating me"? Given that you PERSONALLY find soiled clothes disgusting, would you PERSONALLY appreciate reinforcement that helped you overcome such a habit?

Comment author: wedrifid 23 June 2012 03:39:59AM 2 points [-]

You seem to be assuming, in the example, that the husband doesn't WANT to be modified to put away his laundry. Is that correct?

Yes.

If so, is it correct that your objection is "you're manipulating me into a state I don't desire" rather than simply "you're manipulating me"? Given that you PERSONALLY find soiled clothes disgusting, would you PERSONALLY appreciate reinforcement that helped you overcome such a habit?

Yes.

Comment author: Vaniver 21 June 2012 06:26:39PM 10 points [-]

So, I have to ask: do you in fact have a wife?

Comment author: handoflixue 22 June 2012 07:41:55PM 6 points [-]

The phrases "of course" and "blatant disrespect" imply a shared frame of reference that doesn't seem to be in evidence. While it might be considered rude to you, it's pretty much human nature. The phrase "thank you" is, as near as I can tell, pretty much entirely meant as a positive reinforcer.

So, having established that we have different frames of reference, can you go into WHAT behaviors bother you? Is it the use of specific actions as reinforcers ("thank you" is okay but kissing is not?) or is it just the deliberate (as opposed to socialized and subconscious) application of these techniques? Or something else that I'm missing?

Comment author: TimS 21 June 2012 06:34:20PM 3 points [-]

The question is not whether positive reinforcement is effective in changing your behavior. The question is whether kisses are positive reinforcement in particular contexts.

Suppose your spouse says, "Please pick up my prescription from the store" and you don't want to, but you do it anyway. When you get back, spouse says "Thanks for dealing with that." Do you really think continued experiences like that won't increase the frequency of the behavior "Run an errand even when I don't want to"?

Comment author: [deleted] 21 June 2012 06:40:03PM 1 point [-]

Do you really think continued experiences like that won't increase the frequency of the behavior "Run an errand even when I don't want to"?

I think it depends a lot on her intention. If she says 'thank you' for the purposes of positive reinforcement, I mean if she thinks about her 'thank you's' that way, then I think she's being manipulative.

If she says 'thank you' to say what those words mean, namely, that she's grateful, then even if this does have the effect of positive reinforcement there's nothing wrong about her behavior.

Comment author: TheOtherDave 21 June 2012 06:57:05PM 14 points [-]

I find the idea of endorsing manipulative behavior if and only if I remain unaware of the fact that it's manipulative behavior deeply troubling.

It strikes me as similar to saying that hurting people is OK as long as I don't know I'm hurting them. No, it isn't. If hurting people is not OK, then it follows that I ought not hurt people, and learning to recognize when I'm hurting people is part of that, and I ought to learn to recognize it. The behavior doesn't suddenly become "not OK" the moment I learn to recognize it... it never was OK, and now I know it and can improve.

Conversely, if hurting people is OK, then it's OK whether I know I'm doing it or not.

The same goes for manipulating people. Whether I know I'm doing it or not isn't the determiner of whether I'm doing good or ill.

To my mind, the determiner of whether I'm doing good or ill is whether, when I'm done doing it, we're all better off or worse off.

Comment author: [deleted] 21 June 2012 06:58:51PM *  2 points [-]

I find the idea of endorsing manipulative behavior if and only if I remain unaware of the fact that it's manipulative behavior deeply troubling.

If you don't know you're manipulating someone, you're not manipulating someone. Manipulation is an intentional behavior, like lying, or congratulating, or taking a vow. Knowing what you're doing is part of doing it.

Comment author: TheOtherDave 21 June 2012 07:06:19PM 9 points [-]

Yeah, I pretty much disagree with this statement completely.

Comment author: [deleted] 21 June 2012 07:32:23PM 1 point [-]

That's... incredible to me. Do you disagree that there is such a category (i.e. actions you have to know you're doing in order to be doing them at all), or that manipulation falls under it?

Comment author: TheOtherDave 21 June 2012 07:45:05PM 2 points [-]

I disagree that manipulation falls under it.

Comment author: TimS 21 June 2012 07:40:31PM 1 point [-]

This exchange may be helpful to understand TheOtherDave's point.

Comment author: TimS 21 June 2012 07:12:06PM 3 points [-]

I agree with your point, but I think that "manipulate" needs to be tabooed. If we define manipulate as "acts that tend to change the behavior of others" then I agree with your implicit point that it is impossible to interact with others without changing their behaviors, and there is nothing wrong with thinking about how I would like someone else to behave when considering how I interact with them.

That said, there are connotations of manipulate as the word is ordinarily used that are not captured by the way you (and I) are using the word.

Comment author: TheOtherDave 21 June 2012 07:19:32PM 2 points [-]

Sure. I'm perfectly happy to drop the word altogether and instead talk about changing the behavior of others.

Comment author: Gabriel 22 June 2012 03:41:16PM 2 points [-]

I find the idea of endorsing manipulative behavior if and only if I remain unaware of the fact that it's manipulative behavior deeply troubling.

Awareness of side effects isn't equivalent to intentionality. You can thank someone to express genuine feelings of gratitude. If you wouldn't do that in a counterfactual world in which the gratitude was absent, then I wouldn't call that behavior intentionally manipulative regardless of whether you know about positive reinforcement.

Comment author: TheOtherDave 22 June 2012 04:05:08PM *  6 points [-]

If you wouldn't do that in a counterfactual world in which the gratitude was absent, then I wouldn't call that behavior intentionally manipulative regardless of whether you know about positive reinforcement.

Suppose I am not in the habit of expressing gratitude when people do nice things for me. Never mind why... maybe I was raised wrong. For whatever reason, I'm not in that habit. I feel gratitude, certainly, I just don't express it.

Then one Monday, I learn that expressing gratitude to people for doing nice things for me will increase the odds that they will do it again. Suppose I want people to do nice things for me, and I therefore conclude that I ought to express gratitude when people do nice things for me, in order to get them to do it more, and I therefore start expressing gratitude when people do nice things for me, whether I feel gratitude or not.

Then on Wednesday, I learn that this only works when I genuinely do feel gratitude... when I express gratitude I don't actually feel, I get bad results. (Again, it doesn't matter why. Maybe I'm a lousy liar.) So I stop expressing gratitude when people do nice things for me when I don't feel gratitude, but I continue doing so when I do, since that still gets me stuff I want.

If I've understood you correctly, you would call me intentionally manipulative on Tuesday, but not on Thursday. I'm happy to restrict the term "intentionally manipulative" to Tuesday behavior and not Thursday behavior, if that makes communication easier, though I don't use those words that way myself.

Regardless of what words we use, presumably we agree that on both Tuesday and Thursday, I am doing something with the intention of causing changes in other people's behavior, and am doing so without their awareness or consent. Yes?

Do you endorse this on Tuesday?
Do you endorse this on Thursday?

For my own part, I find the idea of endorsing that behavior on Thursday but not on Tuesday deeply troubling, for many of the reasons I listed before.

Comment author: [deleted] 21 June 2012 06:15:21PM 12 points [-]

Some people react quite viscerally to the awareness that another party is trying intentionally to steer their behavior in any way. It seems to just be a massive squick button for some (indeed, I notice that most randomly-selected people who are made aware of explicit attempts to condition behavior react with discomfort at minimum); for others, there seems to be a correlation with triggers gained from abusive interactions earlier in life; a few I knew who reacted strongly showed strong indications of sociopathy and seemed to instinctively feel violated if someone else successfully, or even just obviously, tried to affect their behavior in a deliberate manner toward some end (a normal part of cognition and social interaction for them directed at others).

Comment author: Viliam_Bur 22 June 2012 11:06:49AM *  5 points [-]

I do accept this kind of reinforcement from my significant other, assuming that:

  • it is for a goal I agree with (extrapolated volition)
  • I am free to say "stop doing this" if I don't feel like being reinforced; and my wish is respected
  • I do get the same signs of affection in other situations too.

Actually I consider it very useful, and for me it would be a waste not to use this kind of cheap "external willpower". YMMV.

Comment author: wedrifid 22 June 2012 11:24:58AM 1 point [-]

I do accept this kind of reinforcement from my significant other, assuming that:

Note that I consider the reinforcement you are describing to be entirely different in kind (not "this kind"). The boundaries around the kind I accept are approximately the same as yours:

  • it is for a goal I agree with (extrapolated volition)
  • I am free to say "stop doing this" if I don't feel like being reinforced; and my wish is respected
  • I do get the same signs of affection in other situations too.

I go by what my intuition tells me but when formalizing those intuitions something similar is generated.

Actually I consider it very useful, and for me it would be a waste not to use this kind of cheap "external willpower".

I make a point of rewarding desired reinforcement (while attempting 'extinction' on less desirable influence tactics like nagging or punishment.)

Comment author: [deleted] 24 June 2012 06:31:43PM *  0 points [-]

The boundaries around the kind I accept are approximately the same as yours:

it is for a goal I agree with (extrapolated volition)

I supposed the reason why the husband in the story didn't put his clothes in the hamper was that he was too lazy to do that, not that he (terminally) valued that the clothes stayed outside the hamper.

Comment author: wedrifid 25 June 2012 06:49:54AM 0 points [-]

I supposed the reason why the husband in the story didn't put his clothes in the hamper was that he was too lazy to do that, not that he (terminally) valued that the clothes stayed outside the hamper.

Having a terminal value for clothes outside the hamper isn't the point. It is whether given the negotiated relationship boundaries and typical behaviors as they currently are the person being modified would prefer "status quo except I do <influenced behavior> more" over "status quo".

"Too lazy" can be left out of such considerations. That doesn't distinguish between akrasia and considered intent not to do the thing (for whatever reason). For most part judgements like "too lazy" are just another method of attempting influence - usually a method that is inferior to reinforcement.

Comment author: TheOtherDave 25 June 2012 01:58:40PM 1 point [-]

For most part judgements like "too lazy" are just another method of attempting influence - usually a method that is inferior to reinforcement.

Well, making judgments like "too lazy" can also provide valuable social cover for other kinds of reinforcement (or punishment), within communities where deliberately altering the behavior of others is seen as unacceptable unless I can frame it as being for their benefit.

More generally, motivated speculation about other people's best interests (including but not limited to positing that they possess unexpressed "terminal values" that happen to align better with what I seem to want than with what they seem to want) can be a very useful way to ignore people's stated preferences without feeling (or being seen by third parties as) indebted to them.

Comment author: shminux 21 June 2012 10:06:59PM *  3 points [-]

Lessons learned:

  • continue to mentally /ignore people and posts I don't care for on IRC and online forums

  • never comment on bad posts or explain my downvote on LW

  • be more generous with upvoting good contributions and give a short praise when warranted.

Comment author: Vaniver 22 June 2012 05:32:19AM 7 points [-]

never comment on bad posts or explain my downvote on LW

This is not quite justified; this is a post on how to use positive reinforcement, not how to use punishment.

Comment author: shminux 22 June 2012 06:02:39AM 1 point [-]

When a dolphin does something wrong, the trainer doesn't respond in any way.

(from the link)

Comment author: Vaniver 22 June 2012 06:30:12AM 3 points [-]

Dolphins are more difficult to punish usefully than humans; for one, they're less likely to understand English.

Moving to object-level advice: I agree that not responding to bad comments or posts is generally a good idea. I think that responding to downvote explanation requests is a good idea about half of the time. Unsolicited downvote explanation is typically done to sway bystander opinion as well as inform the poster, and so deserves its own treatment.

Comment author: tgb 23 June 2012 01:44:32PM 5 points [-]

The difference between explaining bad posts and punishing misbehaving dolphins is that the explaining is done for the purpose of the other readers, not just as a punishment.

Comment author: dbaupp 22 June 2012 06:26:41AM 3 points [-]

never comment on bad posts or explain my downvote on LW

I think this should be "never downvote".

Comment author: Nornagest 22 June 2012 10:58:03PM *  2 points [-]

I think this should be "never downvote".

Seems to me that a downvote would associate negative valence with both the act of posting on LW and with whatever their specific mistake is, with the latter being stronger. So no vote and a comment with a mixture of praise and criticism is probably the stronger play if you're looking to improve someone's writing or fix some technical mistake while keeping them as a contributor, but a downvote is still effective if all you care about is seeing fewer posts of that kind.

Comment author: wedrifid 22 June 2012 07:54:39AM -1 points [-]

never comment on bad posts or explain my downvote on LW

I think this should be "never downvote".

That would be true if the point was actually about implementing the reinforcement ideal rather than using it to validate a premeditated ideal.

Comment author: MBlume 21 June 2012 03:00:14AM 35 points [-]

Good post! Thank you for writing it Luke =)

Comment author: arundelo 21 June 2012 03:11:04AM 6 points [-]

I see what you did there!

Comment author: [deleted] 21 June 2012 08:33:42AM 0 points [-]

(I didn't until EY pointed that out.)

Comment author: FiftyTwo 26 June 2012 01:17:11AM 0 points [-]

Good on you for admitting error.

Comment author: CommanderShepard 21 June 2012 03:25:13PM *  5 points [-]

"god this is even more phygish than just that quote about eliezer getting fed mnms"

Comment author: John_Maxwell_IV 22 June 2012 01:34:58AM 2 points [-]

That strikes me as goofy, not phygish.

Comment author: Dorikka 22 June 2012 03:21:57AM 0 points [-]

I agree, so much that I think I might be missing something.

Comment author: Eliezer_Yudkowsky 21 June 2012 03:57:36AM 29 points [-]

Thanks for reinforcing Luke! And it's great that you applied the theory so quickly!

Comment author: JGWeissman 21 June 2012 04:06:19AM 20 points [-]

Yay recursive reinforcement!

Comment author: Eliezer_Yudkowsky 21 June 2012 04:19:53AM 8 points [-]

Why, thanks! It's helpful to hear you say that!

Comment author: Dorikka 21 June 2012 04:51:16AM -1 points [-]

Moar recursion! Keep it up! :D

Comment author: Will_Newsome 21 June 2012 04:56:58AM 28 points [-]

No. Unreflective happy death spirals get people killed. Shame on all of you for being bad people.

Comment author: RomeoStevens 21 June 2012 05:49:25AM 11 points [-]

I'm glad you mentioned this.

Comment author: Will_Newsome 21 June 2012 05:59:39AM *  9 points [-]

Don't be glad. If you need reinforcement, be relieved. Gladness tends to cause unreflective happy death spirals. Shame on you for being glad.

Presumably the emotion you actually felt was relief, and "glad" was merely used as an inaccurate/misleading synonym? In which case, shame on you for using inaccurate/misleading synonyms.

(I'm totally at least a quarter serious, maybe half.)

Comment author: JGWeissman 21 June 2012 06:06:54AM 12 points [-]

Thank you for wanting us to not have unreflective happy death spirals. I will have to repeat the behavior that caused you to express such caring.

Comment author: Will_Newsome 21 June 2012 06:17:24AM *  3 points [-]

I don't want you to not have unreflective happy death spirals, I'm just horrified at the potential consequences of not going out of my way to prevent you from having unreflective happy death spirals. Shame on you for imprecision and/or implicitly accusing me of hypocrisy.

Comment author: Viliam_Bur 21 June 2012 09:31:47AM 21 points [-]

I guess now it's the right time to say big thanks to everyone who didn't contribute to this thread!

Comment author: CharlieSheen 21 June 2012 02:35:41PM 24 points [-]

I think I'm going to be ill if this continues.

Comment author: [deleted] 21 June 2012 03:11:37PM 37 points [-]

"Eventually it hit me that the same techniques might work on that stubborn but lovable species, the American wife." "Back in Maine, I began thanking Amy if she threw one dirty shirt into the hamper. If she threw in two, I'd kiss her." "...After two years of exotic animal training, my marriage is far smoother, my wife much easier to love."

Comment author: ciphergoth 23 June 2012 07:29:11AM -2 points [-]

Given the many asymmetries between men and women, it seems at least plausible to me that the above would be much more problematic than the original.

Comment author: RichardKennaway 23 June 2012 07:46:30AM -2 points [-]

Sounds like standard PUA to me.

Comment author: wedrifid 23 June 2012 03:31:33PM 5 points [-]

Sounds like standard PUA to me.

Really? Exactly which PUA recommends thanking women more as a way to pick up women? That seems out of character.

There is a relation, I suppose, in as much as both are about a male influencing a female subject and both rely on principles of human or mammalian psychology. They differ in goal and (so) differ in the specific kinds of tactics.

Comment author: RichardKennaway 24 June 2012 09:19:32AM -1 points [-]

I was thinking at a higher level of abstraction. Moulding the woman's behaviour by psychological manipulation, indeed a form of "exotic animal training". This is standard doctrine in the PUA blogosphere -- see also pjeby's reply. PUA, btw, is not about picking up women.

"Psychological endocytosis" might be a better metaphor than "animal training" at the more extreme end of things.

Comment author: NancyLebovitz 25 June 2012 01:34:07PM 0 points [-]

"Psychological endocytosis"-- I don't understand the metaphor.

Comment author: RichardKennaway 25 June 2012 02:23:08PM *  -2 points [-]

Endocytosis is the process by which a cell engulfs a food particle, by extending itself around it and pulling it into its interior. Metaphorically, I am suggesting a process whereby one person similarly extends their own reality around another, undermining the other's perceptions and replacing them with their own. For example, that is what "negging" is about. It is intended to convey the message, at least in the imagination of those advocating it (fictionally imagined here), that the man's beliefs are reality and the woman's are merely pretty lies that deserve to die.

Comment author: NancyLebovitz 25 June 2012 03:21:10PM 4 points [-]

I recommend Clarisse Thorne's Confessions of a Pickup Artist Chaser, a substantial overview of the PUA communities.

PUA covers a wide range from decent behavior to just plain vile. Depending on who's talking, negging can be light-hearted teasing between people who know it's a game or a deliberate effort to keep the target off-balance and dependent on the targeter's good opinion.

It can also be an effort at light-hearted teasing which goes wrong because some PUAs just assume that beautiful women aren't nervous about how they're perceived.

Endocytosis is an interesting metaphor, and it would cover everything from total environment abusiveness (prisons, cults, some dysfunctional families) to efforts to keep one's voice whispering in the back of a subject's mind. (Anyone have the quote about Saruman handy?)

Comment author: RichardKennaway 25 June 2012 06:47:30PM 0 points [-]

(Anyone have the quote about Saruman handy?)

"Suddenly another voice spoke, low and melodious, its very sound an enchantment. Those who listened unwarily to that voice could seldom report the words that they heard; and if they did, they wondered, for little power remained in them. Mostly they remembered only that it was a delight to hear the voice speaking, all that it said seemed wise and reasonable, and desire awoke in them by swift agreement to seem wise themselves. When others spoke they seemed harsh and uncouth by contrast; and if they gainsaid the voice, anger was kindled in the hearts of those under the spell. For some the spell lasted only while the voice spoke to them, and when it spoke to another they smiled, as men do who see through a juggler's trick while others gape at it. For many the sound of the voice alone was enough to hold them enthralled; but for those whom it conquered the spell endured when they were far away, and ever they heard that soft voice whispering and urging them. But none were unmoved; none rejected its pleas and its commands without an effort of mind and will..."

From The Two Towers, the chapter "The Voice of Saruman". The passage, btw, seems to have become a favorite of the American Right to use of Obama.

Comment author: steven0461 25 June 2012 07:47:10PM 2 points [-]

It is intended to convey the message, at least in the imagination of those advocating it,

Your link appears to point to the imagination of a critic, not the imagination of an advocate.

Comment author: RichardKennaway 26 June 2012 08:01:41AM -1 points [-]

It's the imagination of a critic imagining an advocate. I'll try and reword the link to make that clearer.

Comment author: wedrifid 24 June 2012 09:56:44AM *  0 points [-]

I was thinking at a higher level of abstraction.

Or rather, a lower standard of epistemic accuracy.

PUA skills pertain to influence by males over female behavior using methods that include operant conditioning (including reinforcement). It does not follow that all instances of influence by a male over a female using operant conditioning are standard PUA methodology. In fact this example is significantly different to the kind of application we see in standard PUA. This is unsurprising - after all, we got the example in question when Konkvistador took a wife-influencing-her-husband example and substituted roles.

see also pjeby's reply

I prefer the grandparent:

There is a relation, I suppose, in as much as both are about a male influencing a female subject and both rely on principles of human or mammalian psychology. They differ in goal and (so) differ in the specific kinds of tactics.

Comment author: NancyLebovitz 25 June 2012 03:25:22PM 4 points [-]

Your reaction to the idea of kisses to encourage a man to pick up his clothes reminds me of the way a number of women (including me) react to the idea of PUA. It's going ballistic about a hypothetical boundary violation and it's more fun in LW, where one is apparently outnumbered by people who don't see the boundary violation at all. (The boundary violation is hypothetical because the person may not have experienced it.)

Comment author: wedrifid 26 June 2012 05:13:24AM 0 points [-]

It's going ballistic about

Applying that label is both grossly inaccurate and unwelcome.

I noted that certain instances of 'influence by reward' I wouldn't accept and would respond by asking her politely to stop and then escalating as necessary to ensure that the undesired rewarding was not itself rewarded. A couple of users seemed to find the notion that someone else doesn't unconditionally accept all reinforcement offensive.

Comment author: NancyLebovitz 26 June 2012 05:53:52AM 4 points [-]

I'd say that describing small amounts of M&Ms as a significant health threat is a sign of using arguments as soldiers.

On the other hand, you've got better access to your internal experience than I do.

Comment author: wedrifid 26 June 2012 08:36:34AM 2 points [-]

I'd say that describing small amounts of M&Ms as a significant health threat is a sign of using arguments as soldiers.

This is utterly bizarre. Even allowing that you completely missed the obvious meaning of "the most significant risks are the health and dental considerations and they are so insignificant that I'm making a joke about them" my words still can't be taken to mean "there is a significant health threat to small amounts of M&Ms". Not only that but the tangent being answered, something about the relative "risk" of kisses vs M&Ms isn't something I have a position on so I have no idea which side to send 'soldiers' to. Neither of those things are at all 'risky'. It pretty much comes down to "rotten teeth and diabetes vs spreading infectious mononucleosis and herpes simplex" - both at insignificant probabilities and I don't care either way.

On the other hand, you've got better access to your internal experience than I do.

Access to internal experience isn't required to dismiss your accusations. Non-motivated reading of my actual words is.

If I was going to "go ballistic" about anything it would be the active misrepresentation of my words and actions by yourself and pjeby. Not only have you been allowed to get away with slander without sanction you have been actually rewarded for it. I am disgusted.

Comment author: NancyLebovitz 26 June 2012 01:46:54PM 3 points [-]

Sorry for not getting that you intended to make a joke-- I've found that, even in real life and more so online, hyperbolic humor and reduction to absurdity are risky strategies. People are apt to not get the context, or to not agree on what's absurd.

I hadn't gotten around to asking why I was getting upvotes on my previous comments in this thread. It's possible that people agreed with my take on what you said, but it's also possible that they mostly found the prospect of a quarrel entertaining. (They presumably agreed with me to some extent, or we'd both be getting upvotes.)

Part of my reason for saying "ballistic" is that I don't think most people would consider a policy of kisses for putting clothes in the hamper to be such a serious infringement that if it isn't stopped after one request, it's a good reason for divorce.

My aversion to hostile takeover of internal motivations is much stronger than my desire for the affections of any particular individual.

I admit I missed this sentence on previous readings, and it's probably at the center of your objections. I do think "hostile" is extreme, but maybe I'm missing something.

I think there's a middle range between benign efforts at improvement and hostility-- the range where the person is fairly indifferent to the attempted behavior change. I'm guessing that it's the lack of respect for conscious choice by the person being reinforced which causes you to frame it as hostile.

Comment author: wedrifid 27 June 2012 02:30:03PM *  1 point [-]

Part of my reason for saying "ballistic" is that I don't think most people would consider a policy of kisses for putting clothes in the hamper to be such a serious infringement that if it isn't stopped after one request, it's a good reason for divorce.

That position sounds bizarre; I don't think it exists outside of pjeby's straw man. I believe my stated response was to shun the kisses.

As it happens I've never even had to escalate to the "ask politely" level. A smirk, a knowing look and a "Really?" avoided the conflict while keeping the interaction at the level of play, while still communicating the presence of a boundary.

I think there's a middle range between benign efforts at improvement and hostility-- the range where the person is fairly indifferent to the attempted behavior change. I'm guessing that it's the lack of respect for conscious choice by the person being reinforced which causes you to frame it as hostile.

Yes.

Comment author: TheOtherDave 26 June 2012 02:21:58PM 2 points [-]

even in real life and more so online, hyperbolic humor and reduction to absurdity are risky strategies. People are apt to not get the context, or to not agree on what's absurd.

This is true.

I've also found, especially online, that characterizing the emotional states of my interlocutors for them is a risky strategy. On those rare occasions where the other person's emotional state really is important, I find I do better to explicitly ask for confirmation of my perception about it, rather than implying or referring to it as an observed fact.

Comment author: pjeby 24 June 2012 12:43:19AM 11 points [-]

Really? Exactly which PUA recommends thanking women more as a way to pick up women? That seems out of character.

Quite a few PUA schools advise ignoring behavior you don't like, and rewarding behavior you do like, as well as ensuring that you aren't inadvertently sending out a lot of positive reinforcement just because someone is attractive.

True, "thank you" is not generally a recommended form of reinforcement; non-verbal reinforcements like smiles, nods, touch, laughter, looking interested, turning towards the person, etc. are more generally recommended. Occasionally, a certain old story is cited: the one about the professor whose class conditioned him to stop pacing back and forth by looking interested only when he was in the middle of the room.

Comment author: TheOtherDave 23 June 2012 04:22:02PM 2 points [-]

both rely on principles of human or mammalian psychology

Operant conditioning works pretty much the same way on some non-mammals as well.

Comment author: wedrifid 23 June 2012 04:37:21PM *  1 point [-]

Operant conditioning works pretty much the same way on some non-mammals as well.

Yes, it's the PUA tactics that are in general more mammal specific (at least).

Comment author: wedrifid 23 June 2012 03:32:48PM 9 points [-]

Given the many asymmetries between men and women, it seems at least plausible to me that the above would be much more problematic than the original.

It also seems plausible that the reverse is true. Or neither.

Comment author: TheOtherDave 23 June 2012 04:19:47PM 1 point [-]

Or, most likely of all, that it depends on the relative salience at any given moment of the large set of factors that "problematic" aggregates.

Comment author: lukeprog 22 June 2012 01:31:22AM *  8 points [-]

Have some tact, man. My post was fine, but you... you are a god damned sexist.

Comment author: handoflixue 22 June 2012 06:57:41PM 11 points [-]

This actually bothers me less than the original, simply because the stereotype of "properly raised wife having to train her lower-status husband to act appropriately" is a VERY common social meme, whereas "husband training wife" is something I generally only see in the context of physical abuse (which, given the lack of violence, this obviously isn't).

Is there a cultural meme I'm missing here that makes THIS version the more offensive one? o.o

Comment author: Raemon 22 June 2012 07:14:16PM *  22 points [-]

"Woman Training Man" is generally presented as funny with no negative ramifications. "Husband training wife" is presented in the context of either physical abuse, emotional abuse, or as part of a widespread societal trend of women being "domesticated" which is now generally considered distasteful. If this had been phrased "husband training wife", it wouldn't pattern match to "funny, harmless joke", it'd pattern-match to either abuse or societal oppression. (The abuse angle wouldn't necessarily be accurate, but for many people it would come to mind before the "mirror-image-of-the-woman-training-man" concept did).

So whether or not it actually makes sense, the example would produce negative affect in many people.

Comment author: TheOtherDave 22 June 2012 07:13:50PM 2 points [-]

No, it sounds like you're aware of the relevant cultural meme.

Comment author: Viliam_Bur 22 June 2012 08:11:03PM *  5 points [-]

"wife training lower-status husband" is a cultural meme

"man abusing woman" is a very strong meme, and "man <something> woman" pattern-matches it

Comment author: private_messaging 23 June 2012 12:30:38PM *  -1 points [-]

Man abusing woman is not only a very strong "meme", but also a common occurrence, due to the biological detail that male mammals are generally (a) larger, (b) more aggressive, and (c) likely more selfish by nature (due to their different reproductive role). edit: all I am saying is that there is a biologically justified prior here that most people use, backed by a body of utterly indisputable evidence across many species of mammals. Except subpar evidence-evaluators, of course, who do not process the prior and are also subject to the Dunning-Kruger effect about it.

Comment author: [deleted] 25 June 2012 06:58:49PM 1 point [-]

Why the hell was that downvoted? I guess it was supposed to be a descriptive statement but people misunderstood it as a normative one.

Comment author: private_messaging 26 June 2012 08:49:54AM *  -1 points [-]

At least 2 people seem to think you guess wrong.

edit: as for how I interpret reactions to such statements, I already have an explanation from e.g. gaming forums, where we have a very similar white privileged male nerd demographic. We don't do downvoting there because enabling downvotes lets the white privileged male nerd majority enforce their worldviews and discourage any dissent, which we cannot afford because we make games for everyone, not just the white privileged male nerd majority. Though it's up to -1 here.

Comment author: zslastman 18 July 2013 01:29:54PM -1 points [-]

The edit is worthy of a downvote, the original part an upvote.

Comment author: TheOtherDave 22 June 2012 08:57:30PM 1 point [-]

I agree with all of those statements, and am left with the sense that you were trying to convey an additional message that I didn't quite get.

Comment author: Viliam_Bur 23 June 2012 10:08:47AM *  4 points [-]

Just an observation of sexism in our society. We are hypersensitive about anything negative that happens to women (it is a great opportunity for signalling moral superiority above people who are not outraged), while misfortunes of low-status males are just funny (signalling care about them is low-status).

How exactly does this happen? How exactly does the paradox arise that this unequal reaction is perceived as fair, while complaining about it can be so easily labeled as sexist?

There is an obvious evolutionary explanation (low-status males are expendable, there is no advantage for high-status males or any-status females to care about them), but how does the algorithm feel from inside? First, there is a rationalization that problems of low-status males are either not real, or could (and should) be easily avoided by them, so if they don't avoid the situations, they obviously deserve the consequences. (Unless they are members of some minority, in which case it is OK to express moral outrage about the oppression of the given minority.) Second, we are hyper-sensitized by feminism about everything related to women, because even the smallest joke means that you are a supporter of patriarchy and rape culture, which makes you an accomplice in every abuse and murder and whatever. There are no innocent jokes about women. Saying "thank you" to your wife for doing something nice for you is just a first step on a slippery slope of evil male behavior. (And no, there is no female privilege, and if you have a misunderstood word, go read feminism 101 until you accept it.)

There. Sorry for the mindkilling, I don't know how to write it better without spending too big a part of a weekend online.

EDIT: related video

Comment author: [deleted] 25 June 2012 06:53:03PM *  12 points [-]

And no, there is no female privilege, and if you have a misunderstood word, go read feminism 101 until you accept it.

I seem to recall having seen at least one introduction to feminism which did acknowledge that there are forms of female privilege (e.g. children usually end up with the mother after divorces), even though far fewer than forms of male privilege (their list was about an order of magnitude shorter). (This made me find that introduction much more credible, as otherwise it would have failed Policy Debates Should Not Appear One-Sided.)

Comment author: Viliam_Bur 25 June 2012 07:29:57PM 2 points [-]

I would have more respect for such introduction, too, for pretty much the same reasons.

Comment author: TheOtherDave 23 June 2012 02:46:18PM 1 point [-]

OK. Thanks for being explicit.

Comment author: Kaj_Sotala 26 June 2012 11:53:24AM 17 points [-]

It's probably worth noting that the original article, which lukeprog quoted, ended with this:

PROFESSIONALS talk of animals that understand training so well they eventually use it back on the trainer. My animal did the same. When the training techniques worked so beautifully, I couldn't resist telling my husband what I was up to. He wasn't offended, just amused. As I explained the techniques and terminology, he soaked it up. Far more than I realized.

Last fall, firmly in middle age, I learned that I needed braces. They were not only humiliating, but also excruciating. For weeks my gums, teeth, jaw and sinuses throbbed. I complained frequently and loudly. Scott assured me that I would become used to all the metal in my mouth. I did not.

One morning, as I launched into yet another tirade about how uncomfortable I was, Scott just looked at me blankly. He didn't say a word or acknowledge my rant in any way, not even with a nod.

I quickly ran out of steam and started to walk away. Then I realized what was happening, and I turned and asked, "Are you giving me an L. R. S.?" Silence. "You are, aren't you?"

He finally smiled, but his L. R. S. has already done the trick. He'd begun to train me, the American wife.

Comment author: tgb 21 June 2012 02:42:27AM 21 points [-]

I like this article because it is reasonably short, but very clear and highly actionable.

Comment author: sketerpot 21 June 2012 11:21:58PM 13 points [-]

This compliment is particularly effective because it's specific, verifiable, and true. I've never been very good at accepting vague compliments -- I tend to get embarrassed and self-conscious -- but more specific compliments are really nice.

Comment author: XFrequentist 25 June 2012 05:42:34PM 1 point [-]

This explanation of why the complimentary comment on the article was effective is itself effective, because it gives specific reasons why the compliment is unlikely to evoke the embarrassment sometimes associated with more vague compliments.

Comment author: Will_Newsome 21 June 2012 04:51:33AM 16 points [-]

Eagerly awaiting "The Power of Punishment".

Comment author: wedrifid 21 June 2012 05:10:45AM 8 points [-]

Eagerly awaiting "The Power of Punishment".

Particularly good for demonstrating to observers that you have more status and power than the person you are punishing.

Comment author: Will_Newsome 21 June 2012 05:14:54AM 1 point [-]

(demonstrating to observers / demonstrating to self / demonstrating to punished; status / power / resources / justification / need / etc; person / cognitive subsystem / institution / problem representation / etc)

Comment author: Viliam_Bur 21 June 2012 10:04:17AM *  6 points [-]

meh. downvoted.

(just joking)

Comment author: JulianMorrison 21 June 2012 09:25:43AM 12 points [-]

Anecdotally, punishment seems to be a good guilt-releaser, while guilt is dysthymic. Punishment may be effective at snapping someone out of a blue funk and getting them to be responsive to rewards. Guilty people reject rewards. (The above may work better if you are kinked that way.)

Comment author: RichardKennaway 26 June 2012 09:19:56AM 1 point [-]

I'm curious about the anecdotes. I feel like I'm reading travellers' tales of the weird customs of a distant tribe.

Comment author: JulianMorrison 26 June 2012 08:27:24PM -1 points [-]

How about I direct you to this blog for a gentle introduction?

Comment author: arundelo 26 June 2012 09:51:01PM 1 point [-]

It's guessable from context, but an NSFW tag would probably be good here.

Comment author: Swimmer963 21 June 2012 01:19:27AM *  11 points [-]

To help someone improve at dance or sport, ignore poor performance but reward good performance immediately, for example by shouting "Good!" (Buzas & Ayllon 1981) The reason you should ignore poor performance is that if you say "No, you're doing it wrong!" you are inadvertently punishing the effort. A better response to a mistake would be to reinforce the effort: "Good effort! You're almost there! Try once more."

I got a demonstration of how true this is yesterday when, during my taekwondo class, I was paired up with one of the senior black belt students, who has some but not a lot of experience teaching. He was supposed to be fixing up my poomsae (same thing as a kata in karate) and each time he watched me do it, I would finish and he would immediately launch into a description of what I was doing wrong. His feedback was pretty useful–specific, with demonstrations of exactly what to change in order to do it right–but without any prelude of "yay, good job!" or even "okay, the punches were way better that time...now let's work on the stances", I found myself getting really discouraged. Reminding myself that I wasn't actually doing worse than usual, that he just had a different teaching style, helped a little... But my subconscious brain still decided to feel resentful and unenthusiastic, no matter how counterproductive that might be towards my actual goal of improving my poomsae.

As a swimming instructor, I do make sure to dole out a LOT of praise, but I'm wondering if I should push it even further...

Comment author: TheOtherDave 21 June 2012 01:36:18AM 6 points [-]

"Don't Shoot the Dog" remains my favorite book for these sorts of anecdotes, as well as some of the theory and a lot of the practice. I recommend it.

Comment author: mapnoterritory 21 June 2012 07:30:51PM *  9 points [-]

Daniel Kahneman in Thinking, Fast and Slow:

I had stumbled onto a significant fact of the human condition: the feedback to which life exposes us is perverse. Because we tend to be nice to other people when they please us and nasty when they do not, we are statistically punished for being nice and rewarded for being nasty.

The reason for that lies in regression to the mean during training (his example is flight instructors in the Israeli Air Force):

I pointed out to the instructors that what they saw on the board coincided with what we had heard about the performance of aerobatic maneuvers on successive attempts: poor performance was typically followed by improvement and good performance by deterioration, without any help from either praise or punishment.
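The regression effect described in that quote can be checked with a small simulation. This is a minimal sketch under stated assumptions, not Kahneman's data: each simulated cadet has a fixed skill, every attempt adds independent noise, and no praise or punishment is applied at all. The function name `simulate_flights` and the thresholds of 1.0 and -1.0 are made up for illustration.

```python
import random


def simulate_flights(n_cadets=10000, seed=0):
    """Each cadet flies twice; skill is fixed, each attempt adds fresh noise."""
    rng = random.Random(seed)
    after_good, after_bad = [], []
    for _ in range(n_cadets):
        skill = rng.gauss(0, 1)
        first = skill + rng.gauss(0, 1)
        second = skill + rng.gauss(0, 1)
        change = second - first
        if first > 1.0:        # an instructor would have praised this attempt
            after_good.append(change)
        elif first < -1.0:     # an instructor would have criticized this one
            after_bad.append(change)
    # Average change in score following good and bad first attempts.
    return (sum(after_good) / len(after_good),
            sum(after_bad) / len(after_bad))


change_after_praise, change_after_criticism = simulate_flights()
print(f"average change after a good first attempt: {change_after_praise:+.2f}")
print(f"average change after a bad first attempt:  {change_after_criticism:+.2f}")
```

Under these assumptions the first number comes out negative and the second positive, even though the feedback had no causal effect, which is the pattern the instructors attributed to praise and punishment.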

Since positive reinforcement is so counterintuitive: don't forget to reward yourself for rewarding somebody for good behaviour! :)

Comment author: Eugine_Nier 22 June 2012 04:44:33AM 0 points [-]

I had stumbled onto a significant fact of the human condition: the feedback to which life exposes us is perverse. Because we tend to be nice to other people when they please us and nasty when they do not, we are statistically punished for being nice and rewarded for being nasty.

So you (or at least Kahneman) implicitly admit that punishment is effective at changing behavior.

Comment author: mapnoterritory 22 June 2012 06:48:19AM *  1 point [-]

Yes, I think so and apparently so does Kahneman. I don't think this is particularly controversial. Kahneman does say that positive reinforcement is more efficient (both in animals and humans).

Comment author: Vaniver 22 June 2012 05:33:55AM *  1 point [-]

Everyone who's looked at the data thinks that punishment can change behavior. The question is whether punishment makes the changes you want- and people dramatically overestimate the usefulness of punishment and dramatically underestimate the usefulness of positive reinforcement.

Comment author: Eugine_Nier 23 June 2012 06:18:53AM 1 point [-]

Depends, the current "everyone is special, everyone deserves an A for trying" culture almost certainly overvalues positive reinforcement.

Comment author: pnrjulius 05 July 2012 01:26:03AM 0 points [-]

Everyone getting an A isn't reinforcement. Reinforcement has to be conditional on something. If you give everyone who writes a long paper an A, that's reinforcing writing long papers. If you give everyone who writes a well-written paper an A, that's reinforcing well-written papers (and probably more what you want to do).

But if you just give everyone an A, that may be positive, but it simply isn't reinforcement.

Comment author: Vaniver 23 June 2012 03:26:39PM 1 point [-]

I see a difference between 'niceness' and 'positive reinforcement'. The "everyone deserves an A for trying" approach is 'nice' but it generally isn't skillful positive reinforcement; I think a major problem with it is underestimating how much it rewards behaviors that look like trying but aren't trying.

There's also a basic value question- if you're trying to build self-esteem, it's not clear that an "A for trying" approach overvalues positive reinforcement, though if you're trying to build understanding, it clearly would be a misapplication of positive reinforcement.

Comment author: Viliam_Bur 22 June 2012 10:23:10AM *  4 points [-]

The question is whether punishment makes the changes you want

Also it depends on the definition of what you "want" -- for example if you punish someone for bad behavior, what exactly is your goal?

  • to help them improve their behavior?
  • to signal to other people that you care?
  • to have higher status than the punished person?

All three goals are pleasant, though only the first one is officially desirable. The punishment works in all directions. Perhaps this is the reason why behavior change by punishment is more popular than it deserves; and why people rationalize its usefulness even when the first goal visibly fails.

Comment author: Vaniver 22 June 2012 04:09:11PM 1 point [-]

Agreed. Hopefully, instructors care most about the first- but in general human interaction, the others can easily rise to prominence.

Comment author: faul_sname 22 June 2012 01:47:15AM 9 points [-]

Speaking of regression to the mean, that seems to be one topic that wasn't really covered in the sequences that really should have been.

Comment author: coffeespoons 22 June 2012 10:32:33AM 11 points [-]

I read this post last night. I was in the office late, not because I had a great deal to do, but because I was procrastinating. After reading it, I asked my friend to give me a quick call in half an hour to say congratulations if I'd finished all the work. It took me 10 minutes to finish! :)

Comment author: Jonathan_Graehl 22 June 2012 11:18:15PM 8 points [-]

But that's probably more of a public commitment effect.

Comment author: Normal_Anomaly 23 June 2012 12:36:57AM 6 points [-]

True. But I bet if coffeespoons makes this a routine thing, they'll eventually find themselves enjoying work more.

Comment author: Vladimir_Golovin 21 June 2012 09:44:46AM 11 points [-]

  1. Nice post, SIAI! Have a $5 donation!

  2. I tried a similar reinforcement technique on myself but it didn't stick because I couldn't find a reliable trigger condition for dispensing the reward.

  3. Does this mean that we should stop punishing ourselves for procrastination?

Comment author: Kaj_Sotala 21 June 2012 11:14:47AM *  14 points [-]

Does this mean that we should stop punishing ourselves for procrastination?

My personal experience strongly suggests that "stop punishing yourself for X" helps avoid X, for most if not all X. For instance, becoming a vegetarian was much easier when I didn't try to go cold turkey, but rather was fine with the fact that I would succumb to the lure of eating meat every now and then. When I did, I felt a little guilty, but then shrugged and thought that I'd try better the next time. I still fall victim to that temptation occasionally, but it's much more rare now than it used to be.

This might have something to do with the fact that if you punish yourself for trying and failing, you stop wanting to try in the first place, as it becomes associated with the negative emotions. Also, accepting and being okay with the occasional failure makes you treat it as a genuine choice where you have agency, not something that you're forced to do against your will.

See also It's okay to be (at least a little) irrational.

Comment author: JGWeissman 21 June 2012 01:58:56AM 11 points [-]

On Skype with Eliezer, I said: "Eliezer, you've been unusually pleasant these past three weeks. I'm really happy to see that, and moreover, it increases my probability that an Eliezer-led FAI research team will work. What caused this change, do you think?"

Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."

If I recall my high school psychology class correctly, you can get a stronger and more persistent effect by secretly rolling a die and noting the number; when Eliezer has said that many nice things, give him an M&M, then roll the die again for a new target number of nice things.
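The die-rolling procedure is what textbooks call a variable-ratio schedule, and it is simple enough to write down. The following is a minimal sketch, assuming a fair six-sided die; the class name `DieSchedule` and its method names are hypothetical, chosen only for this illustration.

```python
import random


class DieSchedule:
    """Give a reward after a randomly drawn number of target behaviours."""

    def __init__(self, sides=6, rng=None):
        self.sides = sides
        self.rng = rng or random.Random()
        self.remaining = self._roll()   # secret target, unknown to the subject

    def _roll(self):
        return self.rng.randint(1, self.sides)

    def observe_nice_remark(self):
        """Record one target behaviour; return True if a reward is due now."""
        self.remaining -= 1
        if self.remaining == 0:
            self.remaining = self._roll()   # roll again for a new target
            return True
        return False


schedule = DieSchedule(rng=random.Random(42))
rewards = [schedule.observe_nice_remark() for _ in range(600)]
print(sum(rewards), "M&Ms handed out for 600 nice remarks")
```

With a six-sided die the gap between rewards is always between one and six remarks, so the subject cannot predict which remark will pay off.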

Comment author: ciphergoth 21 June 2012 06:18:23AM 5 points [-]

When the threshold is "something nice", there's going to be randomness in the reinforcement anyway.

Comment author: Rain 21 June 2012 03:13:15AM *  12 points [-]

That's why I tried to stay positive when talking about the new SI website. Especially with technical changes like that, the (vocal) negative response can be overwhelming.

Comment author: johnlawrenceaspden 21 June 2012 03:45:25PM 20 points [-]

Thank you Luke for this beautifully written post.

A while ago I saw a kindly waitress give my friend's two year old daughter a small cookie in a restaurant. Various emotions flickered across her tiny face, and then she made a decision, accompanied by a small smile.

She broke the cookie into three pieces and gave them to her brothers. Completely unprompted.

I couldn't believe my eyes. I asked my friend, who is a lecturer in experimental psychology, whether altruism was normal amongst very young siblings.

He looked a bit smug and said "Well we put a lot of reinforcement into that."

I hadn't really thought about what that meant until now. Your clear writing has made it obvious.

As a result of your post, I think I'm going to try deliberately modifying some of my own behaviours this way, and maybe try the techniques on some friends. (The first time, by the way, that I've changed my behaviour as a result of reading less wrong, rather than just treating it as philosophical crack.)

For friends it seems that sincere praise / avoiding criticism would be good, but what would you recommend as rewards to self? I'm pretty sure that nicotine and pizza slices would work for me, but I'm also sure that those aren't things I want to do more of.

Comment author: TheOtherDave 22 June 2012 12:33:49PM 8 points [-]

Don't underestimate the power of praise as a self-reward. It feels really goofy to explicitly praise myself -- especially to do it out loud -- but that doesn't mean it doesn't work.

IME, the biggest problem with self-reward, whatever the mechanism, is that it requires quite a lot of discipline to differentially reward the thing I want to reinforce at all consistently.

The only time I ever really maintained that discipline for any length of time was when I was recovering from brain damage, when continued focus on self-improvement was the single most important thing in my life for about 18 months. In my real life, I just don't care that much. YMMV.

Recruiting allies to reward me works better for me.

Comment author: Viliam_Bur 22 June 2012 10:16:30AM *  8 points [-]

For friends it seems that sincere praise / avoiding criticism would be good, but what would you recommend as rewards to self? I'm pretty sure that nicotine and pizza slices would work for me, but I'm also sure that those aren't things I want to do more of.

M&Ms, one piece at a time -- they are small enough. (It would probably be good if you stop eating them in all other circumstances, but that is not a big sacrifice.)

Or try a symbolic reward. For example, put two glass boxes on your table, put 100 stones in the first one, and every time you want to reward yourself, move one stone from the first box to the second one, and congratulate yourself on progress. When all stones are in the second box, give yourself a big reward (pizza or whatever), swap the boxes, and start again. (This way the reward is still linked to pizza, but it is less pizza. And you see your progress all the time.)
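For anyone who would rather track the stones in software than on a table, the two-box scheme can be sketched as a small counter. This is only an illustration of the bookkeeping; the class name `StoneBoxes` and its attributes are hypothetical.

```python
class StoneBoxes:
    """Two boxes of stones; moving the last stone earns the big reward."""

    def __init__(self, stones=100):
        self.first = stones     # stones still waiting to be moved
        self.second = 0         # stones already earned
        self.big_rewards = 0    # how many times the big reward was earned

    def reward(self):
        """Move one stone; return True when the big reward is earned."""
        self.first -= 1
        self.second += 1
        if self.first == 0:
            # Swap the boxes and start again.
            self.first, self.second = self.second, 0
            self.big_rewards += 1
            return True
        return False


boxes = StoneBoxes(stones=100)
for _ in range(250):
    boxes.reward()
print(boxes.big_rewards, "big rewards so far;", boxes.second, "stones moved this round")
```

The small reward is the act of moving a stone, so the big reward is only consumed once per hundred small ones.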

Comment author: lukeprog 21 June 2012 03:35:23AM 23 points [-]

Reason #228 I'm crazy and irrational: Without conscious attention to the reinforcement process, my behaviors are selected for reinforcement almost at random. The process selecting behaviors for reinforcement has tons of steps in it like "Did I happen to glance in the direction of the bag of M&Ms right now?" instead of "Is the thing I'm doing now something I want to reinforce?"

Comment author: Caspian 18 July 2013 01:04:50PM 1 point [-]

I just read Don't Shoot The Dog, and one of the interesting bits was that it seemed like getting trained the way it described was fun for the animals, like a good game. Also as the skill was learnt the task difficulty level was raised so it wasn't too easy. And the rewards seemed somewhat symbolic - a clicker, and being fed with food that wasn't officially restricted outside the training sessions.

Thinking about applying it to myself, having the reward not be too important outside the game/practise means I'm not likely to want to bypass the game to get the reward directly. Having the system be fun means it's improving my quality of life in that way in addition to any behaviour change.

I haven't done much about ramping up the challenge. How does one make doing the dishes more challenging?

But I did make sure to make the rewards quicker/more frequent by rewarding subtasks.

Comment author: tsakinis 28 September 2012 12:55:14AM 1 point [-]

Wow, thanks for this great article; it was the final piece of information that tipped me over towards getting my shit together. Within 10 minutes after reading it and browsing the comments, I was on my bicycle going to buy small treats I like, that I now give myself for every achieved small goal (~2-10 min of work).

I now wonder though if maybe I should give myself another reinforcer when starting to work with a new goal, otherwise maybe I will only strive for finishing as fast as possible, but starting with a new small goal won't be that much reinforced? Maybe this is my mind trying to get more candy though, so I would be thankful for outside perspective.

Comment author: lukeprog 27 January 2013 05:22:26AM 0 points [-]

Have you been trying this? Any luck?

Comment author: tsakinis 06 March 2013 09:57:22AM 1 point [-]

It worked with similar effectiveness as other techniques I implement - that means only until I have done enough to feel good about myself (2-5 productive days)...

Comment author: [deleted] 05 July 2012 05:47:22PM 4 points [-]

So, reinforcement with M&Ms doesn't translate into an addiction for extrinsic rewards and the reduction of intrinsic motivation?

I'm missing something here, I know.

Comment author: [deleted] 26 June 2012 07:10:24PM *  12 points [-]

The lead article conflates two processes: habits and incentives. The very term "reinforcement" dates back to before the distinction was well-understood. Only in the last decade has it been known that habit operates from a neurology distinct from incentives. (The habit mechanism is in a much older part of the brain.) Only the first story, Yudkowsky and the jellybeans, deals clearly with reinforcement of habit. The others are probably primarily adjustment of incentives.

In using habit and incentive, different rules apply. Incentives require that the subject discern the contingency. The processes Skinner studied as "reinforcement" are mostly about incentives. You adjust schedules of reinforcement to alter the organism's expectancies. For incentive effects, consistent reinforcement is not usually best, as the results are subject to extinction soon after the organism stops getting the reward.

Habits, on the other hand, are blind. The organism doesn't need to see any contingency. Yudkowsky continued to be nice even after he no longer received the jellybeans. To form habits, as opposed to incentive structures, consistency is key.

In short, as a general rule, you want consistency to reward habits and considerable randomness to create lasting incentives.

But the difference extends also to the ethical questions raised. Altering others' incentives for our own benefit is part of ordinary human interaction. If his colleagues surreptitiously timed the offer of jellybeans to Yudkowsky when he acted nice, this is something else; the ethical reason is that Yudkowsky need not recognize what he's being rewarded for to be affected by the jellybeans.

Both habit and incentive are "powerful." But they're powerful for different reasons, in different ways; and to apply them effectively and ethically requires different procedures.

Comment author: Pablo_Stafforini 19 October 2012 12:21:03AM 1 point [-]

Can anyone here point me to the relevant scholarly literature discussing the differences between habits and incentives? I tried Google and Google Scholar but failed to find any paper or survey article that explicitly contrasts these two processes.

Comment author: mwengler 03 July 2012 09:52:14PM 4 points [-]

How do you tell which things you want to reinforce are habits (and should therefore be reinforced consistently) and which things are incentives?

Comment author: bbleeker 19 February 2013 11:22:12AM 1 point [-]

I'd think a habit is something that just goes on as long as nothing happens to disrupt it. You no longer need to reinforce it.

Comment author: EphemeralNight 21 June 2012 09:43:29PM 4 points [-]

The reason you should ignore poor performance is that if you say "No, you're doing it wrong!" you are inadvertently punishing the effort. A better response to a mistake would be to reinforce the effort: "Good effort! You're almost there! Try once more."

I am probably unusual in this regard, but I think I would find both approaches equally aggravating. If someone points out that I've made a mistake, anything other than a concise detailing of exactly how what I did differs from what I was supposed to do, is just going to irritate me. Also, my brain tends to interpret being ignored as a signal that I'm doing things correctly.

Comment author: pnrjulius 05 July 2012 01:28:59AM -1 points [-]

I've always found that recommendations of what to do are much more useful than any kind of praise, reward, punishment, or criticism.

On the other hand, if everyone told you how to do everything, you might never learn the very important skill of teaching yourself to do things.

Comment author: Swimmer963 21 June 2012 09:56:04PM 2 points [-]

If someone points out that I've made a mistake, anything other than a concise detailing of exactly how what I did differs from what I was supposed to do, is just going to irritate me.

Is this because of the "damn it, I know I made a mistake, you telling me I did doesn't help!" effect? I get that too... A good thought experiment is that if I was making a type of mistake that I couldn't automatically tell I was making on my own, I would prefer it to be pointed out, even if not in a concise detailed fashion–the idea of not knowing that I'm making a mistake is kind of scary. What would your reaction be in that situation?

Comment author: EphemeralNight 21 June 2012 10:23:58PM *  2 points [-]

Is this because of the "damn it, I know I made a mistake, you telling me I did doesn't help!" effect?

No, I react the same way whether I was previously aware of my mistake or not. I only experience that effect when I'm told to do something I am already doing.

A good thought experiment is that if I was making a type of mistake that I couldn't automatically tell I was making on my own, I would prefer it to be pointed out, even if not in a concise detailed fashion–the idea of not knowing that I'm making a mistake is kind of scary. What would your reaction be in that situation?

Pragmatically, we as humans, just barely over the threshold into sapient intelligence, make mistakes we're not aware of constantly. If we didn't, we wouldn't need a superintelligence to fix the world; we'd have already done it ourselves. So finding the concept scary seems kind of pointless. (Sort of like being hydrophobic about the water in one's own body.) However, I would, of course, rather be aware of my mistakes than not.

But none of this is really on the topic, which was that the listed reinforcements don't seem even remotely applicable to humans in a universal way.

Comment author: Swimmer963 22 June 2012 02:16:26AM 2 points [-]

So finding the concept scary seems kind of pointless. However, I would, of course, rather be aware of my mistakes than not.

My actions have impacts on others. In general, I prefer to help other people or at least not harm them–however, I may harm someone by mistake, and I really don't want this to happen. If I make a mistake once and I realize it–fine, hopefully no harm done, I won't do it again. If I make a mistake and I don't know about it, well, maybe no harm done that time in particular, but I'm likely to keep making this mistake over and over, and possibly the first time I'll find out is when there is harm done. I think that justifies finding it scary.