PhilGoetz comments on Connectionism: Modeling the mind with neural networks - Less Wrong

39 Post author: Yvain 19 July 2011 01:16AM

Comment author: PhilGoetz 19 July 2011 11:56:28PM * 3 points

Motivational links, however, could be modified by reinforcement. If a certain action leads to reward, strengthen the links that led to that action; if it leads to punishment, strengthen the links that would have made you avoid that action.

Reward arrives too long after the action for this to work in humans. Instead, the brain uses temporal difference learning. I no longer remember which paper was the first, classic demonstration of temporal difference error signals in the brain; it may have been "A Neural Substrate of Prediction and Reward" (1997). Google ("temporal difference learning", brain); "Temporal Difference Models and Reward-Related Learning in the Human Brain", Neuron, 2003, will be one of the hits.

Comment author: Yvain 20 July 2011 09:54:32AM 0 points

I agree that the brain uses temporal difference learning. As I understood it, temporal difference learning means that reward propagates back to the earliest reliable stimulus, based on the difference between expected and observed reward, and then reinforces it. How is that different from the quoted text, except that the quoted text is simpler and doesn't use that language?
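
For concreteness, here is a minimal sketch of the kind of update being discussed: tabular TD(0) on a fixed chain of states, where state 0 stands in for the earliest reliable stimulus and reward arrives only after the last state. Neither comment specifies any of this, so the function name `run_td0`, the chain length, the learning rate `alpha`, and the discount `gamma` are illustrative assumptions, and the sketch is not a model of what the cited papers measured. It only shows that the error signal starts out at the moment of reward and that, over repeated episodes, the learned value moves back to the earliest state.

```python
def run_td0(n_states=5, reward=1.0, alpha=0.1, gamma=1.0, episodes=500):
    """Tabular TD(0) on a chain: state 0 is the cue, reward follows the last state.

    All parameter values are illustrative assumptions, not taken from the thread.
    Returns the learned values plus the per-state errors of the first and last episode.
    """
    values = [0.0] * n_states
    first_errors = []
    errors = []
    for episode in range(episodes):
        errors = []
        for s in range(n_states):
            is_last = s == n_states - 1
            r = reward if is_last else 0.0
            # The episode ends after the last state, so its successor is worth 0.
            next_value = 0.0 if is_last else values[s + 1]
            # Temporal difference error: (observed + predicted next) minus expected.
            delta = r + gamma * next_value - values[s]
            values[s] += alpha * delta
            errors.append(delta)
        if episode == 0:
            first_errors = errors
    return values, first_errors, errors


values, first_errors, last_errors = run_td0()
print("errors in first episode:", first_errors)
print("errors in last episode: ", [round(e, 6) for e in last_errors])
print("learned values:         ", [round(v, 6) for v in values])
```

In the first episode the only nonzero error is at the rewarded step. After many episodes the errors inside the chain shrink toward zero, and the cue state has acquired a value close to the eventual reward. Each update uses only the next state's prediction, so no step has to wait for the delayed reward itself.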