JamesAndrix comments on Open Thread: July 2009 - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
At the risk of providing a non-answer I'll say: Operant conditioning.
The test problem, the solving of it, and getting an answer correspond to a light coming on, pressing a lever, and getting food.
We've long since been trained that solving problems in that context builds up token points that will pay out later in praise and promises of money.
Presumably this training translates fairly well to real world problems.
Indeed, that's the conclusion I came to. What I wonder now is how we operant-condition ourselves without merely reinforcing reinforcement itself — which, I suppose, is more or less precisely what the Friendly AI problem is.