Physical and Mental Behavior

48 Post author: Yvain 10 July 2011 08:20PM

B.F. Skinner called thoughts "mental behavior". He believed they could be rewarded and punished just like physical behavior, and that they increased or declined in frequency accordingly.

Sadly, psychology has not yet advanced to the point where we can give people electric shocks for thinking things, so the sort of rewards and punishments that reinforce thoughts must be purely internal reinforcement. A thought or intention that causes good feelings gets reinforced and prospers; one that causes bad feelings gets punished and dies out.

(Roko has already discussed this in Ugh Fields; so much as thinking about an unpleasant task is unpleasant; therefore most people do not think about unpleasant tasks and end up delaying them or avoiding them completely. If you haven't already read that post, it does a very good job of making reinforcement of thoughts make sense.)

A while back, D_Malik published a great big List Of Things One Could Do To Become Awesome.  As David_Gerard replied, the list was itself a small feat of awesome. I expect a couple of people started on some of the more awesome-sounding entries, then gave up after a few minutes and never thought about it again. Why?

When I was younger, I used to come up with plans to become awesome in some unlikely way. Maybe I'd hear someone speaking Swahili, and I would think "I should learn Swahili," and then I would segue into daydreams of being with a group of friends, and someone would ask if any of us spoke any foreign languages, and I would say I was fluent in Swahili, and they would all react with shock and tell me I must be lying, and then a Kenyan person would wander by, and I'd have a conversation with them in Swahili, and they'd say that I was the first American they'd ever met who was really fluent in Swahili, and then all my friends would be awed and decide I was the best person ever, and...

...and the point is that the thought of learning Swahili is pleasant, in the same easy-to-visualize but useless way that an extra bedroom for Grandma is pleasant. And the intention to learn Swahili is also pleasant, because it will lead to all those pleasant things.  And so, by reinforcement of mental behavior, I continue thinking about and intending to learn Swahili.

Now consider the behavior of studying Swahili. I've never done so, but I imagine it involves a lot of long nights hunched over books of Swahili grammar. Since I am not one of the lucky people who enjoy learning languages for their own sake, this will be an unpleasant task. And rewards will be few and far between: outside my fantasies, my friends don't just get together and ask what languages we know while random Kenyans are walking by.

In fact, it's even worse than this, because I don't exactly make the decision to study Swahili in aggregate, but only in the form of whether to study Swahili each time I get the chance. If I have the opportunity to study Swahili for an hour, this provides no clear reward - an hour's studying or not isn't going to make much difference to whether I can impress my friends by chatting with a Kenyan - but it will still be unpleasant to spend an hour going over boring Swahili grammar. And time discounting makes me value my hour today much more than I value some hypothetical opportunity to impress people months down the line; Ainslie shows quite clearly that postponing my study until later will always look like the better choice.
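To make the discounting argument concrete, here is a minimal sketch in Python. It assumes the hyperbolic form value = amount / (1 + k * delay) that is commonly associated with Ainslie's work; the discount rate `k`, the cost of one study hour, the size of the payoff, and the 90-day lag before the payoff are all made-up numbers chosen for illustration, not figures from any study.

```python
def discounted(amount, delay_days, k=0.5):
    """Present value of something `delay_days` away (hyperbolic form, assumed)."""
    return amount / (1.0 + k * delay_days)


def plan_value(start_in_days, cost=1.0, payoff=3.0, payoff_lag_days=90, k=0.5):
    """How a plan to study `start_in_days` from now feels today.

    The unpleasant hour arrives at `start_in_days`; the chance to impress
    friends arrives `payoff_lag_days` after that. All defaults are
    hypothetical illustration values.
    """
    felt_payoff = discounted(payoff, start_in_days + payoff_lag_days, k)
    felt_cost = discounted(cost, start_in_days, k)
    return felt_payoff - felt_cost


if __name__ == "__main__":
    for days in (0, 1, 7, 60):
        print(f"start in {days:2d} days: felt value {plan_value(days):+.3f}")
```

Under these assumptions, studying right now feels like a net loss, studying tomorrow feels like a smaller loss, and studying a couple of months from now feels mildly worthwhile. When that day arrives it becomes "now" again, and the same calculation says to postpone.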

So the behavior of actually learning Swahili is thankless and unpleasant and very likely doesn't happen at all.

Thinking about studying Swahili is positively reinforced; actually studying Swahili is negatively reinforced. The natural and obvious result is that I intend to study Swahili, but don't.

The problem is that for some reason, some crazy people expect the reinforcement of thoughts to correspond to the reinforcement of the object of those thoughts. Maybe it's that old idea of "preference": I have a preference for studying Swahili, so I should satisfy that preference, right? But there's nothing in my brain automatically connecting this node over here called "intend to study Swahili" to this node over here called "study Swahili"; any association between them has to be learned the hard way.

We can describe this hard way in terms of reinforcement learning: after intending to learn Swahili but not doing so, I feel stupid. This unpleasant feeling propagates back to its cause, the behavior of intending to learn Swahili, and negatively reinforces it. Later, when I start thinking it might be neat to learn Mongolian on a whim, this generalizes to behavior that has previously been negatively reinforced, so I avoid it (in anthropomorphic terms, I "expect" to fail at learning Mongolian and to feel stupid later, so I avoid doing so).
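The separate bookkeeping for thoughts and actions can be sketched as a toy model. This is only an illustration under loud assumptions: the update rule (nudge a behavior's strength toward the reward it just produced) is a generic textbook-style rule, and the names `update`, `simulate`, and `regret`, along with every number, are hypothetical and not taken from Skinner or from this post.

```python
def update(strength, reward, rate=0.2):
    """Move a behavior's strength a fraction `rate` toward the reward it produced."""
    return strength + rate * (reward - strength)


def simulate(rounds=30, daydream_reward=1.0, study_reward=-1.0, regret=0.0):
    """Track two behaviors separately: intending to study, and studying.

    `regret` is the bad feeling that follows an intention that was never
    acted on; it is charged to the intention, not to the action.
    All values are made-up illustration numbers.
    """
    intend = 0.0
    study = 0.0
    for _ in range(rounds):
        # The daydream pays off immediately, minus any regret that has
        # learned to follow the intention.
        intend = update(intend, daydream_reward - regret)
        # The hour of grammar is simply unpleasant.
        study = update(study, study_reward)
    return intend, study


if __name__ == "__main__":
    print("no regret:    intend=%.2f study=%.2f" % simulate())
    print("with regret:  intend=%.2f study=%.2f" % simulate(regret=1.5))
```

With no regret, the intention grows strong while the action is suppressed, which is the "intend but don't" pattern. Adding regret pushes the intention's strength below zero without touching the action at all, which is also why overdoing it discourages even forming intentions.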

I didn't learn this the first time, and I doubt most other people do either. And it's a tough problem to call, because if you overdo the negative reinforcement, then you never try to do anything difficult ever again.

In any case, the lesson is that thoughts and intentions get reinforced separately from actions, and although you can eventually learn to connect intentions to actions, you should never take the connection for granted.

Comments (22)

Comment author: lessdazed 12 July 2011 11:43:47AM 15 points

Sadly, psychology has not yet advanced to the point where we can give people electric shocks for thinking things

So long as one could convince someone psychology (or magic) has advanced to the point where we can give people electric shocks for thinking things, one could possibly trick people into monitoring their thoughts themselves.

"So, let's test this out. Practice thinking of a jar of pennies and we'll see if it triggers the shock." "OK." bzzzzzzz. "Ow, that hurt!" "Brace yourself next time. It helps. Try now. Brace, but don't think of the pennies." "OK." "Now think of the pennies." bzzzzzzzz "@#$%! You're right, bracing helped a lot" "Good. A few more tests to make sure it works, then you go wear it around all day."

The subject then braces every time he thinks of the pennies, which the monitor detects.

Neither deception nor electric shocks are really necessary, of course. People very often change what mental associations they have without them.

Comment author: Unnamed 13 July 2011 02:20:20AM 5 points

You've just reinvented the bogus pipeline. New and improved (now with electric shocks!).

Comment author: Eliezer_Yudkowsky 12 July 2011 10:01:07PM 7 points

Upvoted for evil brilliance.

Comment author: Clippy 12 July 2011 10:09:20PM 4 points

I thought evil was bad?

Comment author: XFrequentist 12 July 2011 10:18:45PM * 3 points

In the sense that you expect evil to decrease the number of paperclips in the universe, or in some parochial human definition of the word "bad"?

Comment author: Clippy 12 July 2011 10:39:02PM 5 points

The second -- my confusion was about the human's appraisal of evil as meriting upvoting.

Comment author: NancyLebovitz 12 July 2011 06:55:46AM 8 points

Effort shock: the unpleasant discovery of how hard it is to accomplish something.

Comment author: Kaj_Sotala 11 July 2011 03:10:01PM 4 points

A link to PJ Eby's Instant Motivation technique seems relevant here, as it's about transferring the imagined pleasure of having completed a task to the motivation to actually complete a task.

Comment author: pjeby 11 July 2011 04:02:23PM 3 points

I was actually thinking about posting this myself, as the technique appears to not fit into the behaviorism model presented here. That is, it would seem that in the technique, what is being reinforced is thinking about a clean desk... which should not then lead to actual desk cleaning.

Comment author: jimmy 11 July 2011 07:00:21PM 2 points

And thinking about food makes people want food more, it doesn't satisfy them. Somehow attaching "wants" not "likes".

If I had to make a couple of guesses at the differences, the first would be that Yvain knows that learning Swahili is way more work than it's worth for that amount of pleasure, and that the reality check keeps him from getting too motivated to do it. Another factor could be that he doesn't go learn Swahili right then like you would for eating or cleaning your desk.

A more speculative guess would be that he's imagining it without somehow inhibiting the signal that the goal is satisfied. Some weak predictions might be things like "I have to remind myself that I didn't actually learn Swahili". It seems like this is the angle he's going for, without mention of the possibility of inhibiting that response in other cases.

Comment author: MatthewBaker 11 July 2011 09:19:08PM 0 points

I always enjoy a good guess on a subject to help me understand it better.

Comment author: CronoDAS 10 July 2011 11:10:17PM 1 point

This is why I try my best not to think about my long-term future.

Comment author: orthonormal 11 July 2011 03:19:35PM 4 points

Is that working for you? (Sincere question.)

Comment author: CronoDAS 12 July 2011 02:44:05AM * 3 points

Well, at least some of the time. Thinking about my long-term future is one of the things that will consistently set off a depressive episode, so I try not to do it if I can avoid it.

Comment author: NancyLebovitz 12 July 2011 06:52:58AM 0 points

Is this something new? My impression is that you didn't use to have that sort of self-knowledge.

Comment author: CronoDAS 13 July 2011 01:28:51AM 2 points

No, I've known that since I was about 16, I think...

Comment author: Khaled 11 July 2011 01:11:56PM 1 point

Your pleasant thoughts were about "being able to speak Swahili" rather than "learning Swahili". Your thoughts were about the joy of the reward, which I guess are not reinforced in total independence from actions (imagine trying to learn Swahili without the rewarded thoughts, you'd probably not make it through the first few classes), but are certainly not identical.

What would happen if you think about the effort of actually learning? Will it get negatively reinforced the same way as actually doing the effort?

Comment author: Swimmer963 11 July 2011 11:05:30AM 1 point

We can describe this hard way in terms of reinforcement learning: after intending to learn Swahili but not doing so, I feel stupid. This unpleasant feeling propagates back to its cause, the behavior of intending to learn Swahili, and negatively reinforces it.

I think I've always had this unpleasant feeling back-propagation, to the point that at the start of your article, I thought you must be wrong...surely everyone knows better than to get pleasure from thinking about doing something that isn't yet done! But maybe I'm wrong.

Comment author: kragensitaker 11 August 2011 07:21:14PM * 0 points

The language-learning case is an interesting example. There are some things you can do.

One is that, if you're extraverted, instead of studying Swahili by hunching over books of grammar, you can study Swahili by talking to cute Kenyan exchange students. This way, the actual process of learning itself is enjoyable. (Mostly. You'll still be embarrassed frequently, and it's in your interest to turn the embarrassment dial up further by asking them to correct your grammatical errors.)

Another is that you actually can make the decision to study Swahili "in aggregate," rather than every night. Just go to rural Tanzania for six months, don't take any books or English-speaking friends with you, and keep your internet access to a minimum.

This strategy has worked reasonably well for me in learning Spanish, although my Spanish is still pretty unidiomatic. I still speak English often with my wife, and I work online.

To generalize the principles a bit, if you can find a fun way to achieve your approved-of goal, a way that you enjoy, you're more likely to do it; and if you can find a way to make the decision once instead of numerous times, you're more likely to do it.

Comment author: Dr_Manhattan 11 July 2011 12:29:31PM 0 points

I didn't learn this the first time, and I doubt most other people do either. And it's a tough problem to call, because if you overdo the negative reinforcement, then you never try to do anything difficult ever again.

I think the solution to this is already described by you in the Wanting vs. Liking post:

Reinforcement learning doesn't just connect single stimuli to responses. It connects stimuli in a context to responses.

So if I do not want to overgeneralize the negative reinforcement, it seems worthwhile to pay attention to legitimate excuses - "Swahili is cool, but really hard to learn due to dearth of subtitled Swahili movies".

Comment author: Vaniver 11 July 2011 04:54:42AM 0 points

Sadly, psychology has not yet advanced to the point where we can give people electric shocks for thinking things

Can't you rig an electrode up to the output of a PET scan or fMRI?

Comment author: tut 17 July 2011 04:33:06PM 0 points

There would be a delay of more than 30 seconds, which would more or less remove any reinforcement learning.