B.F. Skinner called thoughts "mental behavior". He believed they could be rewarded and punished just like physical behavior, and that they increased or declined in frequency accordingly.

Sadly, psychology has not yet advanced to the point where we can give people electric shocks for thinking things, so the sort of rewards and punishments that reinforce thoughts must be purely internal reinforcement. A thought or intention that causes good feelings gets reinforced and prospers; one that causes bad feelings gets punished and dies out.

(Roko has already discussed this in Ugh Fields; so much as thinking about an unpleasant task is unpleasant; therefore most people do not think about unpleasant tasks and end up delaying them or avoiding them completely. If you haven't already read that post, it does a very good job of making reinforcement of thoughts make sense.)

A while back, D_Malik published a great big List Of Things One Could Do To Become Awesome.  As David_Gerard replied, the list was itself a small feat of awesome. I expect a couple of people started on some of the more awesome-sounding entries, then gave up after a few minutes and never thought about it again. Why?

When I was younger, I used to come up with plans to become awesome in some unlikely way. Maybe I'd hear someone speaking Swahili, and I would think "I should learn Swahili," and then I would segue into daydreams of being with a group of friends, and someone would ask if any of us spoke any foreign languages, and I would say I was fluent in Swahili, and they would all react with shock and tell me I must be lying, and then a Kenyan person would wander by, and I'd have a conversation with them in Swahili, and they'd say that I was the first American they'd ever met who was really fluent in Swahili, and then all my friends would be awed and decide I was the best person ever, and...

...and the point is that the thought of learning Swahili is pleasant, in the same easy-to-visualize but useless way that an extra bedroom for Grandma is pleasant. And the intention to learn Swahili is also pleasant, because it will lead to all those pleasant things.  And so, by reinforcement of mental behavior, I continue thinking about and intending to learn Swahili.

Now consider the behavior of studying Swahili. I've never done so, but I imagine it involves a lot of long nights hunched over books of Swahili grammar. Since I am not one of the lucky people who enjoys learning languages for their own sake, this will be an unpleasant task. And rewards will be few and far between: outside my fantasies, my friends don't just get together and ask what languages we know while random Kenyans are walking by.

In fact, it's even worse than this, because I don't exactly make the decision to study Swahili in aggregate, but only in the form of whether to study Swahili each time I get the chance. If I have the opportunity to study Swahili for an hour, this provides no clear reward - an hour's studying or not isn't going to make much difference to whether I can impress my friends by chatting with a Kenyan - but it will still be unpleasant to spend an hour going over boring Swahili grammar. And time discounting makes me value my hour today much more than I value some hypothetical opportunity to impress people months down the line; Ainslie shows quite clearly that, from where I sit today, I will always seem better off postponing my study until later.
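The arithmetic behind the time-discounting point can be sketched in a few lines of Python. This is only an illustration, assuming the hyperbolic discount curve usually associated with Ainslie, V = A / (1 + kD). The discount rate k, the sizes of the cost and the reward, and the six-month payoff lag are all made-up numbers, and the function names are mine.

```python
def hyperbolic_value(amount, delay_days, k=1.0):
    """Felt value today of `amount` arriving after `delay_days` (assumed hyperbolic curve)."""
    return amount / (1.0 + k * delay_days)


def net_value_of_studying(days_until_session, cost=-10.0, reward=100.0,
                          payoff_lag_days=180, k=1.0):
    """How one study session looks from today.

    The unpleasantness (`cost`) lands when the session happens; the payoff
    (`reward`) lands `payoff_lag_days` after that. All numbers are hypothetical.
    """
    felt_cost = hyperbolic_value(cost, days_until_session, k)
    felt_reward = hyperbolic_value(reward, days_until_session + payoff_lag_days, k)
    return felt_cost + felt_reward


for days in (0, 7, 30):
    print(f"session in {days:2d} days: net felt value {net_value_of_studying(days):+.3f}")
```

With these particular numbers, a session a month away comes out slightly positive while a session right now comes out clearly negative. So "I'll start next month" always looks like a fine plan, and "I'll start this evening" never does.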

So the behavior of actually learning Swahili is thankless and unpleasant and very likely doesn't happen at all.

Thinking about studying Swahili is positively reinforced; actually studying Swahili is negatively reinforced. The natural and obvious result is that I intend to study Swahili, but don't.

The problem is that for some reason, some crazy people expect the reinforcement of thoughts to correspond to the reinforcement of the object of those thoughts. Maybe it's that old idea of "preference": I have a preference for studying Swahili, so I should satisfy that preference, right? But there's nothing in my brain automatically connecting this node over here called "intend to study Swahili" to this node over here called "study Swahili"; any association between them has to be learned the hard way.

We can describe this hard way in terms of reinforcement learning: after intending to learn Swahili but not doing so, I feel stupid. This unpleasant feeling propagates back to its cause, the behavior of intending to learn Swahili, and negatively reinforces it. Later, when I start thinking it might be neat to learn Mongolian on a whim, this generalizes to behavior that has previously been negatively reinforced, so I avoid it (in anthropomorphic terms, I "expect" to fail at learning Mongolian and to feel stupid later, so I avoid doing so).
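As a toy illustration of that description (not a model from the behaviorist literature), here is a minimal Python sketch in which "intending" and "studying" are two separate behaviors, each with its own strength, each nudged toward whatever feeling follows it. The update rule, the learning rate, and the reward values are all assumptions chosen for simplicity, and every name is hypothetical.

```python
def reinforce(strength, feeling, learning_rate=0.2):
    """Nudge a behavior's strength toward the feeling that followed it."""
    return strength + learning_rate * (feeling - strength)


def simulate(rounds=10, regret=0.0):
    """Track 'intend to study' and 'study' as two separately reinforced behaviors.

    Intending is followed by a pleasant daydream (+1.0), minus any `regret`
    felt later for not following through. Studying is followed by an
    unpleasant hour of grammar (-1.0). All values are made up.
    """
    intend, study = 0.0, 0.0
    for _ in range(rounds):
        intend = reinforce(intend, 1.0 - regret)
        study = reinforce(study, -1.0)
    return intend, study


naive_intend, naive_study = simulate()
chastened_intend, _ = simulate(regret=2.0)
print(f"no regret:   intend {naive_intend:+.2f}, study {naive_study:+.2f}")
print(f"with regret: intend {chastened_intend:+.2f}")
```

Without the regret term, intending climbs while studying sinks, and nothing in the update ever links the two. Only when the bad feeling of not following through is routed back to the intention does intending itself start to decline.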

I didn't learn this the first time, and I doubt most other people do either. And it's a tough problem to call, because if you overdo the negative reinforcement, then you never try to do anything difficult ever again.

In any case, the lesson is that thoughts and intentions get reinforced separately from actions, and although you can eventually learn to connect intentions to actions, you should never take the connection for granted.

22 comments

Sadly, psychology has not yet advanced to the point where we can give people electric shocks for thinking things

So long as one could convince someone psychology (or magic) has advanced to the point where we can give people electric shocks for thinking things, one could possibly trick people into monitoring their thoughts themselves.

"So, let's test this out. Practice thinking of a jar of pennies and we'll see if it triggers the shock." "OK." bzzzzzzz. "Ow, that hurt!" "Brace yourself next time. It helps. Try now. Brace, but don't think of the pennies." "OK." "Now think of the pennies." bzzzzzzzz "@#$%! You're right, bracing helped a lot" "Good. A few more tests to make sure it works, then you go wear it around all day."

The subject then braces every time he thinks of the pennies, which the monitor detects.

Neither deception nor electric shocks are really necessary, of course. People very often change what mental associations they have without them.

You've just reinvented the bogus pipeline. New and improved (now with electric shocks!).

Upvoted for evil brilliance.

I thought evil was bad?

In the sense that you expect evil to decrease the number of paperclips in the universe, or in some parochial human definition of the word "bad"?

The second -- my confusion was about the human's appraisal of evil as meriting upvoting.

Effort shock: the unpleasant discovery of how hard it is to accomplish something.

A link to PJ Eby's Instant Motivation technique seems relevant here, as it's about transferring the imagined pleasure of having completed a task to the motivation to actually complete a task.

I was actually thinking about posting this myself, as the technique appears to not fit into the behaviorism model presented here. That is, it would seem that in the technique, what is being reinforced is thinking about a clean desk... which should not then lead to actual desk cleaning.

And thinking about food makes people want food more; it doesn't satisfy them. The imagining somehow attaches to "wanting" rather than "liking".

If I had to make a couple of guesses at the differences, the first would be that Yvain knows that learning Swahili is way more work than it's worth for that amount of pleasure, and that the reality check keeps him from getting too motivated to do it. Another factor could be that he doesn't go learn Swahili right then, the way you would for eating or cleaning your desk.

A more speculative guess would be that he's imagining it without somehow inhibiting the signal that the goal is satisfied. Some weak predictions might be things like "I have to remind myself that I didn't actually learn Swahili". It seems like this is the angle he's going for, without mention of the possibility of inhibiting that response in other cases.

I always enjoy a good guess on a subject to help me understand it better.

Your pleasant thoughts were about "being able to speak Swahili" rather than "learning Swahili". Your thoughts were about the joy of the reward, which I guess is not reinforced in total independence from actions (imagine trying to learn Swahili without the rewarding thoughts; you'd probably not make it through the first few classes), but the two are certainly not identical.

What would happen if you think about the effort of actually learning? Will it get negatively reinforced the same way as actually doing the effort?

We can describe this hard way in terms of reinforcement learning: after intending to learn Swahili but not doing so, I feel stupid. This unpleasant feeling propagates back to its cause, the behavior of intending to learn Swahili, and negatively reinforces it.

I think I've always had this unpleasant feeling back-propagation, to the point that at the start of your article, I thought you must be wrong...surely everyone knows better than to get pleasure from thinking about doing something that isn't yet done! But maybe I'm wrong.

The language-learning case is an interesting example. There are some things you can do.

One is that, if you're extraverted, instead of studying Swahili by hunching over books of grammar, you can study Swahili by talking to cute Kenyan exchange students. This way, the actual process of learning itself is enjoyable. (Mostly. You'll still be embarrassed frequently, and it's in your interest to turn the embarrassment dial up further by asking them to correct your grammatical errors.)

Another is that you actually can make the decision to study Swahili "in aggregate," rather than every night. Just go to rural Tanzania for six months, don't take any books or English-speaking friends with you, and keep your internet access to a minimum.

This strategy has worked reasonably well for me in learning Spanish, although my Spanish is still pretty unidiomatic. I still speak English often with my wife, and I work online.

To generalize the principles a bit, if you can find a fun way to achieve your approved-of goal, a way that you enjoy, you're more likely to do it; and if you can find a way to make the decision once instead of numerous times, you're more likely to do it.

Sadly, psychology has not yet advanced to the point where we can give people electric shocks for thinking things

Can't you rig an electrode up to the output of a PET scan or fMRI?

There would be a delay of more than 30 seconds, which would more or less remove any reinforcement learning.

I didn't learn this the first time, and I doubt most other people do either. And it's a tough problem to call, because if you overdo the negative reinforcement, then you never try to do anything difficult ever again.

I think the solution to this is already described by you in the Wanting vs. Liking post:

Reinforcement learning doesn't just connect single stimuli to responses. It connects stimuli in a context to responses.

So if I do not want to overgeneralize the negative reinforcement, it seems worthwhile to pay attention to legitimate excuses - "Swahili is cool, but really hard to learn due to dearth of subtitled Swahili movies".

This is why I try my best not to think about my long-term future.

Is that working for you? (Sincere question.)

Well, at least some of the time. Thinking about my long-term future is one of the things that will consistently set off a depressive episode, so I try not to do it if I can avoid it.

Is this something new? My impression is that you didn't use to have that sort of self-knowledge.

No, I've known that since I was about 16, I think...