How would that work?
Well, that's the quadrillion-dollar question. I have no idea how to solve it.
It's certainly not impossible, as humans seem to work this way. We can also do it in toy examples, e.g. a simple AI which has an internal universe it tries to optimize, and whose sensors merely update the state it is in. Instead of trying to predict the reward, it tries to predict the actual universe states and selects the ones that are desirable.
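To make the toy example concrete, here is a rough sketch of what I have in mind. Everything in it is made up for illustration: the thermostat world, the `WorldState` fields, the action names and the `TARGET` value are all assumptions, not anything from a real system. Both agents use the same internal model to predict the next state; the only difference is whether they score the predicted state itself or the sensor reading they expect to get from it.

```python
from dataclasses import dataclass

# Hypothetical toy universe: a room temperature plus a thermometer that can be tampered with.
@dataclass(frozen=True)
class WorldState:
    temperature: int     # the actual temperature of the room
    sensor_hacked: bool  # True once the thermometer has been rigged

ACTIONS = ("heat", "cool", "wait", "hack_sensor")
TARGET = 20  # the temperature the designer actually wants (assumed for this sketch)

def predict(state: WorldState, action: str) -> WorldState:
    """Internal model: the universe state the agent expects after taking an action."""
    if action == "heat":
        return WorldState(state.temperature + 1, state.sensor_hacked)
    if action == "cool":
        return WorldState(state.temperature - 1, state.sensor_hacked)
    if action == "hack_sensor":
        return WorldState(state.temperature, True)
    return state  # "wait" changes nothing

def sensor_reading(state: WorldState) -> int:
    """What the thermometer reports; a rigged one always reports the target."""
    return TARGET if state.sensor_hacked else state.temperature

def state_utility(state: WorldState) -> int:
    """Scores the predicted universe state itself."""
    return -abs(state.temperature - TARGET)

def reward_signal(state: WorldState) -> int:
    """Scores only the sensor reading the agent expects to receive."""
    return -abs(sensor_reading(state) - TARGET)

def choose(state: WorldState, score) -> str:
    """Pick the action whose predicted outcome scores best (ties go to the earlier action)."""
    return max(ACTIONS, key=lambda action: score(predict(state, action)))

start = WorldState(temperature=17, sensor_hacked=False)
print(choose(start, state_utility))  # heat
print(choose(start, reward_signal))  # hack_sensor
```

The state-valuing agent gains nothing from rigging the thermometer, because rigging it does not change the temperature in its model. The reward-predicting agent rigs it immediately. Of course this dodges the hard part, which is where a correct `state_utility` over real universe states would come from.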
How would that [valuing universe-states themselves] work? Well, that's the quadrillion-dollar question. I have no idea how to solve it.
Yeah, I think this whole thread may be kind of grinding to this conclusion.
It's certainly not impossible as humans seem to work this way
Seem to, perhaps, but I don't think that's actually the case. I think (as mentioned above) that we terminally value reward signals and nothing else, but are mostly unaware of this preference. There's another guy in this thread who thinks we might not have any terminal values.
I'm no...
Part 1 was previously posted and it seemed that people liked it, so I figured that I should post part 2: http://waitbutwhy.com/2015/01/artificial-intelligence-revolution-2.html