There's a story about a card-writing AI named Tully that really clarified the problem of FAI for me (I'd elaborate, but I don't want to ruin it).
But there is no theoretical reason you can't have an AI that values universe-states themselves.
How would that work? How do you have a learner that doesn't have something equivalent to a reinforcement mechanism? At the very least, it seems some part of the AI has to compare the universe-state to the desired state, so the real goal is actually to maximize the similarity of those states, which means modifying the goal would be easier than modifying reality. A toy sketch of that worry is below.
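To make that concrete, here's a hypothetical sketch (names invented for illustration): if the quantity actually being maximized is the similarity score, rewriting the goal beats rewriting the world.

```python
# Hypothetical illustration of the worry: if the thing being maximized
# is the similarity score, then editing the goal is a cheap winning move.

def similarity(world_state, desired_state):
    # Higher is better; 0 means a perfect match.
    return -sum(abs(w - d) for w, d in zip(world_state, desired_state))

world = [0, 0, 0]
desired = [9, 9, 9]

print(similarity(world, desired))  # -27: lots of real work left to do

# The "wirehead" shortcut: change the goal instead of the world.
desired = list(world)
print(similarity(world, desired))  # 0: maximal score, nothing accomplished
```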
And if it did have such a goal, why would it change it?
Agreed. I am trying to get someone to explain how such a goal would work.
How would that work?
Well, that's the quadrillion-dollar question. I have no idea how to solve it.
It's certainly not impossible, since humans seem to work this way. We can also do it in toy examples: e.g., a simple AI that has an internal model of the universe it tries to optimize, and its sensors merely update which state it believes it is in. Instead of trying to predict the reward, it tries to predict the actual universe-state and selects actions that lead to desirable states.
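Here's a minimal sketch of that kind of agent, assuming a toy grid world (all names are invented for illustration, not from any real library): utility is computed over the modeled world-state itself, and sensing merely updates the belief, so there is no reward channel to hijack.

```python
# A minimal sketch of a model-based agent that values universe-states
# directly: it scores *predicted world-states* with a fixed utility
# function instead of predicting a reward signal.

GOAL_STATE = (3, 3)  # the universe-state the agent values

def utility(state):
    """Utility is a function of the (modeled) world-state itself."""
    x, y = state
    gx, gy = GOAL_STATE
    return -(abs(x - gx) + abs(y - gy))  # closer to the goal = better

def predict(state, action):
    """World model: predict the next state given an action."""
    x, y = state
    dx, dy = {"N": (0, 1), "S": (0, -1), "E": (1, 0), "W": (-1, 0)}[action]
    return (x + dx, y + dy)

def sense(true_state):
    """Sensors merely update the agent's belief about the state;
    they are evidence, not the thing being optimized."""
    return true_state  # noiseless sensing in this toy example

def act(belief_state):
    """Pick the action whose predicted resulting state scores highest."""
    return max("NSEW", key=lambda a: utility(predict(belief_state, a)))

state = (0, 0)
for _ in range(6):
    belief = sense(state)
    a = act(belief)
    state = predict(state, a)  # assume the world model is exact here
    print(a, state)
```

The design point is that `utility` never sees a reward signal; it only ever sees predicted world-states, so "maximize reward by editing the goal" isn't even expressible in this agent's action space.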
Part 1 was previously posted and it seemed that people liked it, so I figured I should post part 2: http://waitbutwhy.com/2015/01/artificial-intelligence-revolution-2.html