Here is a more complex variant that I can't see how to dismiss easily.
If you can build a "predict how humans feel in situation X" function, you can do some interesting things. Let's call this function feel(X). Now, as well as first-order happiness, you can also predict how they will feel when told about situation X, so feel("told about X").
You might be able to recover something like preference if you can calculate feel("the situation where X is suggested, and the human is told about feel(X) and about all other possible situations") for every possible situation X, as long as you can rank the outputs of feel in some way.
Well, as long as the human-simulating predictor can cope with holding all possible situations in mind, and does not return "worn out" for every situation.
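To make the proposal concrete, here is a toy sketch in Python. Everything in it is an assumption made for illustration: the situations, the scores in TOY_FEELINGS, and above all the "regret" rule inside informed_feel, which merely stands in for whatever a real predictor would return for the "told about everything" situation. It shows only the ranking step, not how such a predictor could be built.

```python
from typing import Dict, List

# Hypothetical first-order predictions; the numbers are invented for illustration.
TOY_FEELINGS: Dict[str, float] = {
    "walk in the park": 0.6,
    "surprise party": 0.8,
    "tax audit": -0.7,
}


def feel(situation: str) -> float:
    """Toy stand-in for feel(X): look up a score, neutral (0.0) if unknown."""
    return TOY_FEELINGS.get(situation, 0.0)


def informed_feel(x: str, situations: List[str]) -> float:
    """Toy stand-in for feel("X is suggested, told feel(X) and all alternatives").

    Assumption (not from the original comment): the informed reaction is the
    first-order feeling minus regret about the best alternative passed over.
    Requires at least two situations.
    """
    best_other = max(feel(s) for s in situations if s != x)
    regret = max(0.0, best_other - feel(x))
    return feel(x) - regret


def recover_preferences(situations: List[str]) -> List[str]:
    """Rank situations from most to least preferred by informed feeling."""
    return sorted(situations, key=lambda x: informed_feel(x, situations), reverse=True)


if __name__ == "__main__":
    for s in recover_preferences(list(TOY_FEELINGS)):
        print(s)
```

The ranking is only as meaningful as the outputs of feel are comparable, which is the "as long as you can rank the output" condition above.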
Anyway, it is an interesting riff on the idea. Does anyone see any holes that I am missing?
Try to figure out what maximizes this estimation method. It won't be anything you'd want implemented; it will be a wireheading stimulus. Plus, FAI needs to evaluate (and work to implement) whole possible worlds, not verbal descriptions. And questions about possible worlds involve quantities of data that a mere human can't handle.
This post enumerates texts that I consider (potentially) useful training for making progress on Friendly AI/decision theory/metaethics.
Rationality and Friendly AI
Eliezer Yudkowsky's sequences and this blog provide a solid introduction to the problem statement of Friendly AI, giving concepts useful for understanding the motivation for the problem and disarming the endless failure modes that people often fall into when trying to consider it.
For a shorter introduction, see
Decision theory
The following book introduces an approach to decision theory that seems to be closer to what's needed for FAI than the traditional treatments in philosophy or game theory:
Another (more technical) treatment of decision theory from the same cluster of ideas:
The following posts on Less Wrong present ideas relevant to this development of decision theory:
Mathematics
The most relevant tool for thinking about FAI seems to be mathematics (in particular, mathematical logic), because it teaches you to work with precise ideas. Starting from a rusty technical background, the following reading list is one way to begin:
[Edit Nov 2011: I no longer endorse scope/emphasis, gaps between entries, and some specific entries on this list.]