Bit of a tangent, but if you ever run across someone for whom this doesn't seem to work, check the hypothesis that they don't parse praise as a positive reinforcer. I don't know how common this is, but I actually have to make a conscious effort to keep it from acting as a mild punishment in most cases when it's applied to me. (Ditto M&Ms in the given context, I expect. Attention Bad.)
You are correct that there are many kinds of reinforcers, and it's important to make sure that the one you choose to use is something the receiver will desire.
"In other studies, animals and people given a choice between performing a task for either of two reinforcers often show strong preferences (Parsons & Reid, 1990; Simmons, 1924). Identifying preferred reinforcers can improve the effectiveness of a reinforcement procedure in applied settings (Mace et al., 1997)."
-Learning and Behavior, p149
Furthermore at least one person I know (er, myself) picks up on any sort of test-like or game-like or we're-judging-you-so-you-better-not-screw-up-like context and starts acting in extremely confusing/uninformative/atypical/misleading ways so as not to be seen as the kind of person who is easily manipulable (there are probably other motivations involved too). Any incentive structure I'm put under thus has to somehow take this into account, even e.g. the LessWrong karma system. Explicitly manipulative socially mediated praise/M&Ms would strike my brain as outright evil and would stand some chance of being inverted entirely. That said I don't get the impression this sort of defense mechanism is very common.
Most people want to be sincerely praised. Someone who reads this post and applies it poorly is going to be saying praise while their body language says something else entirely. Or acting out of character for themselves, leading the reinforcee to suspect that the praise is insincere. Or they may go around praising seemingly everything, causing the reinforcee to interpret the praise as meaningless noise.
There are lots of ways for using praise as reinforcement to go wrong, and if someone is in one of those environments for long enough they will end up being conditioned to interpret praise as neutral or negative.
I suspect it is common enough that when you observe that praising someone doesn't reinforce their behavior or makes them uncomfortable, you should consider that they might have an unusual aversion to praise.
And also, that you might just be really bad at it. ;-)
This was my problem for quite a while: believing that I ought to praise people, while alieving that there wasn't anything to praise and that they didn't deserve it, due to all their obvious imperfections.
This, as you can imagine, produced sub-optimal results. ;-)
No. Unreflective happy death spirals get people killed. Shame on all of you for being bad people.
Don't be glad. If you need reinforcement, be relieved. Gladness tends to cause unreflective happy death spirals. Shame on you for being glad.
Presumably the emotion you actually felt was relief, and "glad" was merely used as an inaccurate/misleading synonym? In which case, shame on you for using inaccurate/misleading synonyms.
(I'm totally at least a quarter serious, maybe half.)
Thank you for wanting us to not have unreflective happy death spirals. I will have to repeat the behavior that caused you to express such caring.
I guess now it's the right time to say big thanks to everyone who didn't contribute to this thread!
"Eventually it hit me that the same techniques might work on that stubborn but lovable species, the American wife." "Back in Maine, I began thanking Amy if she threw one dirty shirt into the hamper. If she threw in two, I'd kiss her." "...After two years of exotic animal training, my marriage is far smoother, my wife much easier to love."
It's probably worth noting that the original article, which lukeprog quoted, ended with this:
PROFESSIONALS talk of animals that understand training so well they eventually use it back on the trainer. My animal did the same. When the training techniques worked so beautifully, I couldn't resist telling my husband what I was up to. He wasn't offended, just amused. As I explained the techniques and terminology, he soaked it up. Far more than I realized.
Last fall, firmly in middle age, I learned that I needed braces. They were not only humiliating, but also excruciating. For weeks my gums, teeth, jaw and sinuses throbbed. I complained frequently and loudly. Scott assured me that I would become used to all the metal in my mouth. I did not.
One morning, as I launched into yet another tirade about how uncomfortable I was, Scott just looked at me blankly. He didn't say a word or acknowledge my rant in any way, not even with a nod.
I quickly ran out of steam and started to walk away. Then I realized what was happening, and I turned and asked, "Are you giving me an L. R. S.?" Silence. "You are, aren't you?"
He finally smiled, but his L. R. S. has already done the trick. He'd begun to train me, the American wife.
This actually bothers me less than the original, simply because the stereotype of "properly raised wife having to train her lower-status husband to act appropriately" is a VERY common social meme, whereas "husband training wife" is something I generally only see in the context of physical abuse (which, given the lack of violence, this obviously isn't).
Is there a cultural meme I'm missing here that makes THIS version the more offensive one? o.o
"Woman Training Man" is generally presented as funny with no negative ramifications. "Husband training wife" is presented in the context of either physical abuse, emotional abuse, or as part of a widespread societal trend of women being "domesticated" which is now generally considered distasteful. If this had been phrased "husband training wife", it wouldn't pattern match to "funny, harmless joke", it'd pattern-match to either abuse or societal oppression. (The abuse angle wouldn't necessarily be accurate, but for many people it would come to mind before the "mirror-image-of-the-woman-training-man" concept did).
So whether it actually makes sense, the example would produce negative affect in many people.
And no, there is no female privilege, and if you have a misunderstood word, go read feminism 101 until you accept it.
I seem to recall having seen at least one introduction to feminism which did acknowledge that there are forms of female privilege (e.g. children usually end up with the mother after divorces), even though far fewer than forms of male privilege (their list was about an order of magnitude shorter). (This made me find that introduction much more credible, as otherwise it would have failed Policy Debates Should Not Appear One-Sided.)
What you're missing is that many people will respond to the gender-swapped version differently, and Konk is calling attention to that fact.
Thank you Luke for this beautifully written post.
A while ago I saw a kindly waitress give my friend's two year old daughter a small cookie in a restaurant. Various emotions flickered across her tiny face, and then she made a decision, accompanied by a small smile.
She broke the cookie into three pieces and gave them to her brothers. Completely unprompted.
I couldn't believe my eyes. I asked my friend, who is a lecturer in experimental psychology, whether altruism was normal amongst very young siblings.
He looked a bit smug and said "Well we put a lot of reinforcement into that."
I hadn't really thought about what that meant until now. Your clear writing has made it obvious.
As a result of your post, I think I'm going to try deliberately modifying some of my own behaviours this way, and maybe try the techniques on some friends. (The first time, by the way, that I've changed my behaviour as a result of reading less wrong, rather than just treating it as philosophical crack.)
For friends it seems that sincere praise / avoiding criticism would be good, but what would you recommend as rewards to self? I'm pretty sure that nicotine and pizza slices would work for me, but I'm also sure that those aren't things I want to do more of.
M&Ms, one piece at a time -- they are small enough. (It would probably be good if you stopped eating them in all other circumstances, but that is not a big sacrifice.)
Or try a symbolic reward. For example, put two glass boxes on your table and put 100 stones in the first one. Every time you want to reward yourself, move one stone from the first box to the second and congratulate yourself on your progress. When all the stones are in the second box, give yourself a big reward (pizza or whatever), swap the boxes, and start again. (This way the reward is still linked to pizza, but it is less pizza. And you see your progress all the time.)
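The two-box scheme above is easy to track in software as well. What follows is only a minimal sketch; the class name `StoneBoxes` and its methods are made up for illustration, and the 100-stone default is simply the number suggested in the comment.

```python
class StoneBoxes:
    """Hypothetical tracker for the two-box stone scheme described above."""

    def __init__(self, stones=100):
        self.first = stones      # stones still waiting to be earned
        self.second = 0          # stones already earned
        self.big_rewards = 0     # how many times the first box was emptied

    def reward(self):
        """Move one stone across; return True when a big reward is due."""
        self.first -= 1
        self.second += 1
        if self.first == 0:
            self.big_rewards += 1
            # "Swap the boxes": the full box becomes the new first box.
            self.first, self.second = self.second, 0
            return True
        return False
```

Calling `reward()` after each small success returns `True` only on the move that empties the first box, which is the cue for the pizza.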
This compliment is particularly effective because it's specific, verifiable, and true. I've never been very good at accepting vague compliments -- I tend to get embarrassed and self-conscious -- but more specific compliments are really nice.
Reason #228 I'm crazy and irrational: Without conscious attention to the reinforcement process, my behaviors are selected for reinforcement almost at random. The process selecting behaviors for reinforcement has tons of steps in it like "Did I happen to glance in the direction of the bag of M&Ms right now?" instead of "Is the thing I'm doing now something I want to reinforce?"
(nods) For my own part, it's frequently worse than random... when I don't attend to what I'm doing, I frequently berate or otherwise punish myself for attempts to achieve a target that fall short of that target, and I'm more likely to do that the more I value achieving the target. Which is a great way to extinguish the behaviors I value.
To help someone improve at dance or sport, ignore poor performance but reward good performance immediately, for example by shouting "Good!" (Buzas & Ayllon 1981) The reason you should ignore poor performance is that if you say "No, you're doing it wrong!" you are inadvertently punishing the effort. A better response to a mistake would be to reinforce the effort: "Good effort! You're almost there! Try once more."
I've noticed in pilates classes with one specific teacher you get positive feedback in one specific situation - when you're having trouble, and have just barely managed something basic. This leads to the association that whenever you get positive comments you know you're doing badly.
Anecdotally, punishment seems to be a good guilt-releaser, while guilt is dysthymic. Punishment may be effective at snapping someone out of a blue funk and getting them to be responsive to rewards. Guilty people reject rewards. (The above may work better if you are kinked that way.)
Eagerly awaiting "The Power of Punishment".
Particularly good for demonstrating to observers that you have more status and power than the person you are punishing.
The lead article conflates two processes: habits and incentives. The very term "reinforcement" dates back to before the distinction was well-understood. Only in the last decade has it been known that habit operates from a neurology distinct from incentives. (The habit mechanism is in a much older part of the brain.) Only the first story, Yudkowsky and the jellybeans, deals clearly with reinforcement of habit. The others are probably primarily adjustment of incentives.
In using habit and incentive, different rules apply. Incentives require that the subject discern the contingency. The processes Skinner studied as "reinforcement" are mostly about incentives. You adjust schedules of reinforcement to alter the organism's expectancies. For incentive effects, consistent reinforcement is not usually best, as the results are subject to extinction soon after the organism stops getting the reward.
Habits, on the other hand, are blind. The organism doesn't need to see any contingency. Yudkowsky continued to be nice even after he no longer received the jellybeans. To form habits, as opposed to incentive structures, consistency is key.
In short, as a general rule, you want consi...
That's why I tried to stay positive when talking about the new SI website. Especially with technical changes like that, the (vocal) negative response can be overwhelming.
Yup. When reading through the comments about the new website, I could feel my effort being punished.
Perhaps you could have somebody read them for you and summarize them in a non-critical way, thus creating a reinforcement shield.
Alternately, you could adapt what internet marketing "personalities" do, and promote doing: practice celebrating criticism. One marketer (I forget which one) described making a practice of throwing his hands in the air and shouting "Woo!" when he received a criticism via email.
(Background: "personality" marketers promote by writing emotionally charged material that's intended to divide their audience into people who either love or hate them. Thus, the presence of hate mail is evidence that their strategy is working. They will then often publicize the hate mail, in order to stir up the emotions of the people on the opposite side of the debate. Talk radio hosts, bloggers, political commentators, etc. also use these strategies, even if they're not always considered "marketers" in a traditional sense. Whether you consider this "dark arts" is largely a political question, since the LW sequences use these tactics also. Whether he knows it or not, Eliezer is a personality marketer in this sense, it's just that he's not as efficiently monetizing the results. ;-) )
I sent an email to Nickolai or Kamil asking them to fix X.
Great work Nickolai or Kamil, if either of you read lesswrong at all. The website is a much needed improvement! ;)
To help someone improve at dance or sport, ignore poor performance but reward good performance immediately, for example by shouting "Good!" (Buzas & Ayllon 1981) The reason you should ignore poor performance is that if you say "No, you're doing it wrong!" you are inadvertently punishing the effort. A better response to a mistake would be to reinforce the effort: "Good effort! You're almost there! Try once more."
I got a demonstration of how true this is yesterday when, during my taekwondo class, I was paired up with one of the senior black belt students, who has some but not a lot of experience teaching. He was supposed to be fixing up my poomsae (same thing as a kata in karate) and each time he watched me do it, I would finish and he would immediately launch into a description of what I was doing wrong. His feedback was pretty useful–specific, with demonstrations of exactly what to change in order to do it right–but without any prelude of "yay, good job!" or even "okay, the punches were way better that time...now let's work on the stances", I found myself getting really discouraged. Reminding myself that I wasn't actually doing wors...
I'm not sure a lot of praise is a good idea since that would lower its effectiveness as a reinforcer.
Well, a lot of non-specific praise would water down the value of non-specific praise as a reinforcer, but taking the time to pick out more specific elements that are good/improving would probably reduce discouragement.
I think one of the things I forget most as an instructor is how easy it is to get discouraged, especially when you're being taught by someone who seems to be able to do all of it effortlessly. There's also the element of "I already know I'm doing it wrong! I just can't get my body to listen to my brain!" Instructors who don't acknowledge this and give praise for trying or noticing that I'm doing it wrong are a major source of discouragement for any new physical skill I try to learn.
I read this post last night. I was in the office late, not because I had a great deal to do, but because I was procrastinating. After reading it, I asked my friend to give me a quick call to say congratulations in half an hour if I'd finished all the work. It took me 10 minutes to finish! :)
But if you aren't treating humans more like animals than most people are, then you're modeling humans poorly.
Thanks for pointing out this particular low-hanging fruit.
Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."
I wonder if they had just (re-)watched this Big Bang Theory episode.
you don't get a sea lion to balance a ball on the end of its nose by nagging
Hmm, I better keep this in mind at all times when dealing with my family.
Nice post SIAI! Have a $5 donation!
I tried a similar reinforcement technique on myself but it didn't stick because I couldn't find a reliable trigger condition for dispensing the reward.
Does this mean that we should stop punishing ourselves for procrastination?
My personal experience strongly suggests that "stop punishing yourself for X" helps avoid X, for most if not all X. For instance, becoming a vegetarian was much easier when I didn't try to go cold turkey, but rather was fine with the fact that I would succumb to the lure of eating meat every now and then. When I did, I felt a little guilty, but then shrugged and thought that I'd try better the next time. I still fall victim to that temptation occasionally, but it's much more rare now than it used to be.
This might have something to do with the fact that if you punish yourself for trying and failing, you stop wanting to try in the first place, as it becomes associated with the negative emotions. Also, accepting and being okay with the occasional failure makes you treat it as a genuine choice where you have agency, not something that you're forced to do against your will.
On Skype with Eliezer, I said: "Eliezer, you've been unusually pleasant these past three weeks. I'm really happy to see that, and moreover, it increases my probability that an Eliezer-led FAI research team will work. What caused this change, do you think?"
Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."
If I recall my high school psychology class correctly, you can get a stronger and more persistent effect by secretly rolling a die and noting the number; when Eliezer has said that many nice things, give him an M&M and roll the die again for a new target number of nice things.
That's true and false. Intermittent reinforcement gets a more robust effect than continual reinforcement, yes, but randomly intermittent reinforcement isn't as effective as setting the reward threshold higher as the behavior becomes more common... e.g., rewarding only the 10% nicest things.
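For anyone who wants to try the two schedules from this exchange, here is a rough sketch in Python. It is an illustration under stated assumptions, not a tested behavioral protocol: the class names, the six-sided die, the 50-item history window and the 10-item warm-up are all hypothetical choices, and "niceness" is assumed to arrive as a number you score yourself.

```python
import random
from collections import deque


class DiceSchedule:
    """Reward after a randomly rolled number of responses (the die idea)."""

    def __init__(self, sides=6, rng=None):
        self.sides = sides
        self.rng = rng if rng is not None else random.Random()
        self.remaining = self._roll()

    def _roll(self):
        return self.rng.randint(1, self.sides)

    def record(self):
        """Count one nice remark; return True if it earns an M&M."""
        self.remaining -= 1
        if self.remaining == 0:
            self.remaining = self._roll()  # secretly roll a new target
            return True
        return False


class TopFractionSchedule:
    """Reward only responses in the top fraction of recent ones."""

    def __init__(self, fraction=0.10, window=50, warmup=10):
        self.fraction = fraction
        self.warmup = warmup                 # responses to observe before rewarding
        self.history = deque(maxlen=window)  # recent scores only

    def record(self, score):
        """Score one response; return True if it beats the moving threshold."""
        n = len(self.history)
        at_least_as_good = sum(1 for s in self.history if s >= score)
        self.history.append(score)
        if n < self.warmup:
            return False
        return at_least_as_good < self.fraction * n
```

Because `TopFractionSchedule` compares each response with recent history, the bar rises automatically as the behavior improves, which is the property the reply above argues for.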
I want to design a reinforcement schedule in one of our apps. Can anyone link me to some specific guidelines on how to optimise this?
(Reinforce exactly what % of successes (30%? 26%? 8%?)? Reinforce performances in the top 10% of past performances (or the top 12%, or the top 8%?)? How does time factor (if the user hasn't used the app for a week, should I push a reinforcer forward?)?)
Daniel Kahneman in Thinking, Fast and Slow:
I had stumbled onto a significant fact of the human condition: the feedback to which life exposes us is perverse. Because we tend to be nice to other people when they please us and nasty when they do not, we are statistically punished for being nice and rewarded for being nasty.
The reason for that lies in regression to the mean during training (his example is flight instructors in the Israeli Air Force):
I pointed out to the instructors that what they saw on the board coincided with what we had heard about the performance of aerobatic maneuvers on successive attempts: poor performance was typically followed by improvement and good performance by deterioration, without any help from either praise or punishment.
Since positive reinforcement is so counterintuitive: don't forget to reward yourself for rewarding somebody for good behaviour! :)
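The regression-to-the-mean effect in the Kahneman quote can be reproduced with a small simulation in which feedback plays no role at all. This is a sketch under made-up assumptions: each pilot has a fixed skill, every attempt adds independent Gaussian noise, and the function name `simulate` and all parameters are hypothetical.

```python
import random


def simulate(pilots=10000, seed=0):
    """Return the mean change in score for the worst and best first attempts."""
    rng = random.Random(seed)
    skill = [rng.gauss(0, 1) for _ in range(pilots)]
    # Two attempts per pilot: same skill, fresh luck, no praise or punishment.
    first = [s + rng.gauss(0, 1) for s in skill]
    second = [s + rng.gauss(0, 1) for s in skill]

    order = sorted(range(pilots), key=lambda i: first[i])
    tenth = pilots // 10
    worst, best = order[:tenth], order[-tenth:]

    def mean_change(group):
        return sum(second[i] - first[i] for i in group) / len(group)

    return mean_change(worst), mean_change(best)


worst_change, best_change = simulate()
print(f"worst tenth changed by {worst_change:+.2f}")
print(f"best tenth changed by {best_change:+.2f}")
```

In this model the worst first attempts are followed by improvement and the best by deterioration, so an instructor who shouts after bad attempts and praises after good ones would see exactly the misleading pattern Kahneman describes.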
Speaking of regression to the mean, that seems to be one topic that wasn't really covered in the sequences that really should have been.
"Don't Shoot the Dog" remains my favorite book for these sorts of anecdotes, as well as some of the theory and a lot of the practice. I recommend it.
The central lesson I learned from exotic animal trainers is that I should reward behavior I like and ignore behavior I don't. After all, you don't get a sea lion to balance a ball on the end of its nose by nagging. The same goes for the American husband.
Back in Maine, I began thanking Scott if he threw one dirty shirt into the hamper. If he threw in two, I'd kiss him. Meanwhile, I would step over any soiled clothes on the floor without one sharp word, though I did sometimes kick them under the bed. But as he basked in my appreciation, the piles became smaller.
My wife, if pulling that kind of stunt, would quickly find that her affections were shunned and her thanks were met with clear contempt (after she was asked politely not to do that the first time). It is almost certainly not in her interests to produce a pavlovian association between her affections and attempts to control me against my wishes. My aversion to hostile takeover of internal motivations is much stronger than my desire for the affections of any particular individual.
This would be entirely different if I had made a prior agreement regarding shirts and hampers. Making it motivationally easier and more enjoyable to do things I am willing to do is to be encouraged.
Some people react quite viscerally to the awareness that another party is trying intentionally to steer their behavior in any way. It seems to just be a massive squick button for some (indeed, I notice that most randomly-selected people who are made aware of explicit attempts to condition behavior react with discomfort at minimum); for others, there seems to be a correlation with triggers gained from abusive interactions earlier in life; a few I knew who reacted strongly showed strong indications of sociopathy and seemed to instinctively feel violated if someone else successfully, or even just obviously, tried to affect their behavior in a deliberate manner toward some end (a normal part of cognition and social interaction for them directed at others).
I find the idea of endorsing manipulative behavior if and only if I remain unaware of the fact that it's manipulative behavior deeply troubling.
It strikes me as similar to saying that hurting people is OK as long as I don't know I'm hurting them. No, it isn't. If hurting people is not OK, then it follows that I ought not hurt people, and learning to recognize when I'm hurting people is part of that, and I ought to learn to recognize it. The behavior doesn't suddenly become "not OK" the moment I learn to recognize it... it never was OK, and now I know it and can improve.
Conversely, if hurting people is OK, then it's OK whether I know I'm doing it or not.
The same goes for manipulating people. Whether I know I'm doing it or not isn't the determiner of whether I'm doing good or ill.
To my mind, the determiner of whether I'm doing good or ill is whether, when I'm done doing it, we're all better off or worse off.
It may be worth sharing, anecdotally, that years ago my husband expressed annoyance with me over the fact that I only ever rubbed his back while he was doing dishes, and it made him feel much like how wedrifid describes.
This utterly bewildered me, so I agreed to pay attention to the behavior and see what was going on. Pretty quickly it became clear to me that this was absolutely true, for reasons I wasn't entirely clear on myself, although my working theory was it was the only time that I'd regularly walk past him while he was hunched over in that particular posture, which apparently served as a "give me a backrub" signal for me, for whatever reason.
My response to this was to start giving him random backrubs at other times, which solved the problem.
My point being that (a) being annoyed by this sort of behavior is not at all unique to wedrifid, and (b) whether the behavior pattern is intentional doesn't necessarily matter very much. (I don't mean to suggest that it doesn't matter to wedrifid; actually, they have made it somewhat clear that it's part of what they're objecting to.)
The reason you should ignore poor performance is that if you say "No, you're doing it wrong!" you are inadvertently punishing the effort. A better response to a mistake would be to reinforce the effort: "Good effort! You're almost there! Try once more."
I am probably unusual in this regard, but I think I would find both approaches equally aggravating. If someone points out that I've made a mistake, anything other than a concise detailing of exactly how what I did differs from what I was supposed to do is just going to irritate me. Also, my brain tends to interpret being ignored as a signal that I'm doing it correctly.
So, reinforcement with M&Ms doesn't translate into an addiction to extrinsic rewards and a reduction of intrinsic motivation?
I'm missing something here, I know.
What expert timing, Luke! Just two days ago, I came across the fascinating practice of clicker training for horses - http://www.theclickercenter.com, while reading Kathy Sierra's old blog - http://headrush.typepad.com/creating_passionate_users/2006/03/clicker_trained.html.
My only problem is that I need to train my own behaviour rather than someone else's. I'm going to try to use these techniques on myself, although I'm not sure if that's supposed to work.
Attacking your opponent's intelligence is just that, regardless of the terminology you dress it up in. That your opinion is the rational one, and that those who disagree with it are less rational than you, is obviously the position of anyone who makes an argument. Steering the conversation in that direction adds nothing.
On Skype with Eliezer, I said: "Eliezer, you've been unusually pleasant these past three weeks. I'm really happy to see that, and moreover, it increases my probability that an Eliezer-led FAI research team will work. What caused this change, do you think?"
Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."
Made me smile. Thanks for sharing.
Too infrequent. They need to start by giving him an M&M every time he thinks about writing more HPMoR.
Thanks, Luke! I've always enjoyed this sequence. (It's funny that I was tempted to include a note that I would've been happier if you contributed to the sequence more often, but let's stick with the praise for now. :-)
Wow, thanks for this great article that was the final piece of information that tipped me over towards getting my shit together. Within 10 minutes after reading it and browsing the comments, I was on my bicycle going to buy small treats I like, that I now give myself for every achieved small goal (~2-10 min of work).
I now wonder though if maybe I should give myself another reinforcer when starting to work with a new goal, otherwise maybe I will only strive for finishing as fast as possible, but starting with a new small goal won't be that much reinforced? Maybe this is my mind trying to get more candy though, so I would be thankful for outside perspective.
Lessons learned:
continue to mentally /ignore people and posts I don't care for on IRC and online forums
never comment on bad posts or explain my downvote on LW
be more generous with upvoting good contributions and give a short praise when warranted.
never comment on bad posts or explain my downvote on LW
This is not quite justified; this is a post on how to use positive reinforcement, not how to use punishment.
This seems to contradict the very powerful effect of learning from failure and corrective feedback. See http://www.wired.com/wiredscience/2011/10/why-do-some-people-learn-faster-2/ for an accessible overview.
I'd conjecture this works better when someone can already perform the desired behavior and wants to form a habit, whereas learning from failure comes in when new information needs to be stored and reorganized.
I think next time I go shopping, I'll buy a pack of M&Ms, and take one whenever I make a git commit.
Excellent article. I wonder if reinforcement could be used to speed up rationality training? I would love to see a study done on that.
I just read Don't Shoot The Dog, and one of the interesting bits was that it seemed like getting trained the way it described was fun for the animals, like a good game. Also as the skill was learnt the task difficulty level was raised so it wasn't too easy. And the rewards seemed somewhat symbolic - a clicker, and being fed with food that wasn't officially restricted outside the training sessions.
Thinking about applying it to myself, having the reward not be too important outside the game/practise means I'm not likely to want to bypass the game to get the ...
Does this still work if I reinforce myself? Every time I read 5 lesswrong articles in a day, I give myself a reward. Or every time I have a cigarette, I kick a brick wall with no shoes on. If I was consistent with this for a long time, would it work?
related: http://gettingstronger.org/2012/01/hormesis-and-the-limbic-brain/
"Reprogramming the amygdala. This is the indirect way to re-program the hypothalamus, by altering the amygdaloid reward circuitry that feeds it. There are a number of approaches to achieving this, some of which I’ve outlined in previous articles, but all of them fall generally under the umbrella of classical or Pavlovian conditioning. There are a few basic strategies:
Extinction. An addictive response becomes weaker and eventually dies out when you stop responding to a triggerin
... This post may have the highest upvotes per comment I've ever seen. Anyone got access to the database want to confirm that?
Note that there are many circumstances when it is right to criticise. For instance, group brainstorming exercises are more productive if the participants criticise each other's ideas.
If this genuinely looks like love bombing then it could be an indication that you need more affection in your life to recalibrate the base rate.
Oh, probably. I hear Luke has more real-life charisma... Though he kind of kills the "fosters a distrust of outside sources" with the amount he cites outside sources.
Quite a lot of charisma, but nothing near the level a cult leader would need to pull off a personality cult. (Although he could probably make up for this if he really wanted to by spending a few weeks reading up on research on cult formation, then applying it systematically as a 'how to' guide.)
Maybe it's because behaviorist techniques like reinforcement feel like they don't respect human agency enough. But if you aren't treating humans more like animals than most people are, then you're modeling humans poorly.
But treating human beings, especially adults, like animals is characteristically unethical. Applying some system of reinforcement where someone has asked you to effectively treat their behavior is innocuous enough, as is of course treating yourself.
But generally manipulating the behavior of other people by means other than convincing th...
But treating human beings, especially adults, like animals is characteristically unethical.
It seems to me like the flow is in the reverse direction: many unethical manipulations involve treating adults like animals. But people who skillfully use positive reinforcement are both more pleasant to be around and more effective- which seems like something ethical systems should point you towards, not away from.
I agree with you that your autonomy is threatened by the manipulations of others. But threats only sometimes turn into harm- distinguishing between manipulations you agree with and disagree with is a valuable skill.
Indeed, there's a general point that needs to be made about human interaction, and another about status, but first a recommendation: try to view as many of your actions as manipulations as possible. This will help separate out the things that, on reflection, you want to do and the things that, on reflection, you don't want to do. For example:
if a friend told me that he spent a lot of our time together thinking through ways to positively reinforce some of my behaviors, even to my benefit, I would become very suspicious of him. I would feel that I'd been treated as a child or a dog. His behavior would seem to me to be manipulative and dishonest,
Emphasis mine. The reaction, calling his behavior manipulative and dishonest, feels like it punishes manipulation, which you might want to do to protect your autonomy. But it actually punishes honesty, because the trigger was your friend telling you! Now, if your friend wants to change you, they'll need to try to do it subtly...
distinguishing between manipulations you agree with and disagree with is a valuable skill.
This, with extra emphasis!
But treating human beings, especially adults, like animals is characteristically unethical.
This statement without context is clearly incorrect; there are all sorts of behaviors we can ethically execute with respect to both humans and other animals. I understand that what you and the OP both mean to connote is particular behaviors which we restrict in typical contexts only to non-human animals, but if you're going to label them as unethical when applied to humans it helps to specify what behaviors and context those are.
manipulating the behavior of other people by means other than convincing them that they should behave in a certain way seems to me to be almost definitional of a dark art.
That's a little more specific, but not too much, as I'm not really sure what you mean by "convincing" here.
That is, if at time T1 I don't exhibit behavior B and don't assert that I should exhibit B, and you perform some act A at T2 after which I exhibit B and assert that I should exhibit B, is A an act of convincing me (and therefore OK on your account) or not (and therefore unethical on your account)? How might I test that?
never do this to other people without their explicit consent
This, on the other hand, is clear. Thank you.
I disagree with it strongly.
This article implicitly positively reinforces positive reinforcement and negatively reinforces negative reinforcement. But there are situations in which negative reinforcement should be positively reinforced, e.g. if this article is in fact correct to negatively reinforce negative reinforcement. The article thus implicitly contradicts itself.
Yes, in the should-world we could've all learned to avoid putting our hands on hot stovetops simply by getting an M&M for every hour we managed to avoid putting our hands on hot stovetops. In the real world, learn...
Part of the sequence: The Science of Winning at Life
Also see: Basics of Animal Reinforcement, Basics of Human Reinforcement, Physical and Mental Behavior, Wanting vs. Liking Revisited, Approving reinforces low-effort behaviors, Applying Behavioral Psychology on Myself.
Story 1:
On Skype with Eliezer, I said: "Eliezer, you've been unusually pleasant these past three weeks. I'm really happy to see that, and moreover, it increases my probability that an Eliezer-led FAI research team will work. What caused this change, do you think?"
Eliezer replied: "Well, three weeks ago I was working with Anna and Alicorn, and every time I said something nice they fed me an M&M."
Story 2:
I once witnessed a worker who hated keeping a work log because it was only used "against" him. His supervisor would call to say "Why did you spend so much time on that?" or "Why isn't this done yet?" but never "I saw you handled X, great job!" Not surprisingly, he often "forgot" to fill out his work log.
Ever since I got everyone at the Singularity Institute to keep work logs, I've tried to avoid connections between "concerned" feedback and staff work logs, and instead take time to comment positively on things I see in those work logs.
Story 3:
Chatting with Eliezer, I said, "Eliezer, I get the sense that I've inadvertently caused you to be slightly averse to talking to me. Maybe because we disagree on so many things, or something?"
Eliezer's reply was: "No, it's much simpler. Our conversations usually run longer than our previously set deadline, so whenever I finish talking with you I feel drained and slightly cranky."
Now I finish our conversations on time.
Story 4:
A major Singularity Institute donor recently said to me: "By the way, I decided that every time I donate to the Singularity Institute, I'll set aside an additional 5% for myself to do fun things with, as a motivation to donate."
The power of reinforcement
It's amazing to me how consistently we fail to take advantage of the power of reinforcement.
Maybe it's because behaviorist techniques like reinforcement feel like they don't respect human agency enough. But if you aren't treating humans more like animals than most people are, then you're modeling humans poorly.
You are not an agenty homunculus "corrupted" by heuristics and biases. You just are heuristics and biases. And you respond to reinforcement, because most of your motivation systems still work like the motivation systems of other animals.
A quick reminder of what you learned in high school
What works
Example applications
For additional examples and studies, see The Power of Reinforcement (2004), Don't Shoot the Dog (2006), and Learning and Behavior (2008).
I close with Story 5, from Amy Sutherland:
Next post: Rational Romantic Relationships Part 1
Previous post: The Good News of Situationist Psychology
My thanks to Erica Edelman for doing much of the research for this post.