Armok_GoB comments on The Power of Reinforcement - Less Wrong

96 Post author: lukeprog 21 June 2012 01:42PM


Comments (467)


Comment author: Armok_GoB 21 June 2012 02:25:59PM 1 point

Hmm, I wonder if providing a lot of negative reinforcement on some attribute of them you don't care about would make the positive reinforcements more effective on the things you do care about.

Example: trying to teach someone math, praising everything they do right with the math, including trying, while complaining about their physique, fashion choices, hygiene, etc. In particular, timing those unrelated complaints for when they seem less focused on the math, but subtly enough that they don't consciously notice the correlation.

Not that this isn't a bad idea for other unrelated reasons...

Comment author: TheOtherDave 21 June 2012 03:24:48PM 2 points

There are a couple of factors here worth keeping in mind.

One is that classical conditioning continues to work, even when I'm concentrating on operant conditioning. So one result of this strategy is that my target will come to associate me with aversive stimuli, which will in turn reduce the effectiveness of my attempts at reinforcement. They will similarly associate the teaching sessions and math with those stimuli, which may be counterproductive.

Another is that a target consciously noticing my attempts at conditioning changes the whole ball game, in ways I don't entirely understand and that I'm not sure anyone entirely understands. Sometimes it's a huge win. Sometimes it's a huge loss. Staying subtle is more predictable, if I can do it, but of course it's not always possible to avoid detection, and sometimes it's better to admit to my attempts at conditioning than to be caught out at them. The safest move is to first establish a social context where my attempts at conditioning can be labelled "manners," such that any attempt to call me out on them is inherently low-status, but that's not always possible either.

When using praise signals as reinforcers for systems (like some humans) that are capable of skepticism about my motives, it helps to be seen to use expensive signals. (Attention often works well, which is one reason Internet trolls are so persistent.) Of course, that typically means I have to invest resources into my conditioning efforts.

In general, the approach I endorse is to maintain (and adjust as needed) a consistent threshold of evaluation, ignore behavior that falls below that threshold, reward behavior that clears it, and resist the temptation to go meta about the process.

Comment author: [deleted] 21 June 2012 05:47:50PM 1 point

Sounds like an interesting idea for an experiment, although it would probably violate ethical guidelines. :P

Comment author: wedrifid 21 June 2012 02:38:04PM * 1 point

Hmm, I wonder if providing a lot of negative reinforcement on some attribute of them you don't care about would make the positive reinforcements more effective on the things you do care about.

The example you give is either punishment of the other attributes or negative reinforcement of the desired behavior (if you look at it from the perspective of taking away the aversive stimulus only when the math is done).