TheOtherDave comments on The Power of Reinforcement - Less Wrong

96 Post author: lukeprog 21 June 2012 01:42PM




Comment author: TheOtherDave 21 June 2012 07:17:44PM 1 point

I can't, but if you find anything concise and useful, I'd love to hear about it myself.

My rule of thumb is to set the threshold so that I reinforce roughly the top 20% of performances, and to pace attempts so that I'm reinforcing 2-3 times per minute during active training periods. But that's not based on anything.
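As a rough sketch of that rule of thumb, the threshold can be read off a batch of recent performance scores. Everything here is an assumption for illustration: the function names, the idea that performances get a numeric score, and the sample scores are all made up, and the 20% and 2-3/minute figures are just the rule-of-thumb numbers above.

```python
def reinforcement_threshold(scores, top_fraction=0.2):
    """Return the lowest score that still falls in the top `top_fraction` of scores."""
    if not scores:
        raise ValueError("need at least one score")
    ranked = sorted(scores, reverse=True)
    # Number of performances to reinforce; always at least one.
    keep = max(1, round(len(ranked) * top_fraction))
    return ranked[keep - 1]


def attempts_per_minute(reinforcements_per_minute, top_fraction=0.2):
    """Attempt rate needed to hit a target reinforcement rate."""
    return reinforcements_per_minute / top_fraction


# Hypothetical scores for the last ten attempts (higher is better).
recent = [3, 7, 5, 9, 4, 6, 8, 2, 1, 10]
threshold = reinforcement_threshold(recent)

# Reinforce only the attempts at or above the threshold.
reinforced = [s for s in recent if s >= threshold]
```

With these numbers, reinforcing the top 20% at 2-3 times per minute works out to somewhere around 10-15 attempts per minute.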

I'll also note that reinforcing higher-tier performances more strongly works really well (though it's hard to do consistently by hand), as do very intermittent "jackpots" (disproportionate and unpredictable mega-rewards).
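Since tiering is hard to do consistently by hand, here is a minimal sketch of how a reward size might be computed instead. The linear scaling, the 2% jackpot chance, and the 10x multiplier are hypothetical placeholders rather than recommended values, and `reward_size` is an invented name.

```python
import random


def reward_size(score, threshold, best, rng,
                jackpot_chance=0.02, jackpot_multiplier=10):
    """Return a reward magnitude: 0 below threshold, larger for better scores."""
    if score < threshold:
        return 0.0
    # Scale linearly from 1.0 at the threshold up to 2.0 at the best known score.
    span = best - threshold
    tier = 1.0 if span <= 0 else 1.0 + min(1.0, (score - threshold) / span)
    # Occasionally pay out an unpredictable, disproportionately large reward.
    if rng.random() < jackpot_chance:
        return tier * jackpot_multiplier
    return tier


rng = random.Random(0)  # seeded only so that runs are repeatable
rewards = [reward_size(s, threshold=9, best=10, rng=rng) for s in [8, 9, 10]]
```

Passing the random generator in as an argument keeps the jackpot unpredictable during training while still letting the function be checked with a fixed seed.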