So the system needs to draw a distinction between just imagining freedom and making a plan for action that is predicted to actually produce freedom. This seems like something that a critic system can learn pretty easily. It's known that the rodent dopamine system can learn blockers (conditioned inhibitors, in the conditioning literature's terms), such as not predicting reward when a blue light comes on at the same time as the otherwise reward-predictive red light.
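To make the red-light/blue-light case concrete, here is a minimal sketch using the Rescorla-Wagner learning rule, a standard textbook model of prediction-error learning. It is not a model of the dopamine system itself, and the cue names, learning rate, and trial schedule are illustrative assumptions. The point is only that a simple error-driven critic ends up assigning the blue light a negative weight that cancels the red light's prediction.

```python
def rescorla_wagner(trials, alpha=0.1, n_epochs=500):
    """Learn one associative weight per cue from (cues, reward) trials.

    trials: list of (set of cue names present, reward received) pairs.
    alpha and n_epochs are illustrative assumptions, not fitted values.
    """
    weights = {}
    for _ in range(n_epochs):
        for cues, reward in trials:
            # The prediction is the sum of the weights of all cues present.
            prediction = sum(weights.get(c, 0.0) for c in cues)
            # Prediction error: what happened minus what was expected.
            error = reward - prediction
            # Every cue that was present shares the same error signal.
            for c in cues:
                weights[c] = weights.get(c, 0.0) + alpha * error
    return weights


def predict(weights, cues):
    """Reward prediction for a given set of cues."""
    return sum(weights.get(c, 0.0) for c in cues)


# Hypothetical schedule: red alone is rewarded, red plus blue is not.
trials = [({"red"}, 1.0), ({"red", "blue"}, 0.0)]
weights = rescorla_wagner(trials)

print("red weight:", round(weights["red"], 3))
print("blue weight:", round(weights["blue"], 3))
print("prediction for red + blue:", round(predict(weights, {"red", "blue"}), 3))
```

Under these assumptions the red weight settles near the reward value and the blue weight settles near its negative, so the compound predicts roughly no reward. An "I'm only imagining this" signal could in principle play the same role as the blue light, though that mapping is a speculation and not something the sketch demonstrates.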
There are two separable problems here: (A) can a critic learn new abstract values? (B) how does the critic distinguish reality from imagination? I don...
I don't think Montague dealt with that issue much if at all. But it's been a long time since I read the book.
My biggest takeaway from Tomasello's work was his observation that humans pay far more attention to other humans than monkeys do to other monkeys. Direct reward for social approval is one possible mechanism, but it's also possible that it's some other bias in the system. I think hardwired reward for social approval is probably a real mechanism. But it's also possible that the correlation between people's approval and more direct rewards such as food, water, and shelter plays a large role in making human approval and disapproval a conditioned stimulus (or a fully "substituted" stimulus). But I don't think that distinction is very relevant for guessing the scope of the critic's association.
I completely agree. This is the basis of my explanation for how humans could attribute value to abstract representations and not wirehead. In sum, a system smart enough to learn the positive value of conditioned stimuli several steps removed from reward can also learn many indicators of when those abstractions won't lead to reward. These may be cortical representations of planning-but-not-doing, or other indicators in the cortex of the difference between reality and imagination. The weaker nature of simulation representations may be enough to distinguish the two, and it should certainly be enough to ensure that real rewards and punishments always have a stronger influence, making imagination ultimately under the control of reality.
If you've spent the afternoon wireheading by daydreaming about how delicious that fresh meat is, you'll be very hungry in the evening. Something has gone very wrong, in much the same way as if you chose to hunt for game where there is none. In both cases, the system is going to have to learn where the wrong decision was made and the wrong strategy was followed. If you're out of a job and out of money because you've spent months arguing with strangers on the intern