I'm not sure how obvious the following is to people, and it probably is obvious to most of the people thinking about FAI. But I thought I'd throw out a summary of it here anyway, since this is the one topic that makes me the most pessimistic about the notion of Friendly AI being possible, at least one based heavily on theory rather than on plenty of experimentation.
A mind can only represent a complex concept X by embedding it into a tightly interwoven network of other concepts that combine to give X its meaning. For instance, a "cat" is playful, four-legged, feline, a predator, has a tail, and so forth. These are the concepts that define what it means to be a cat; by itself, "cat" is nothing but a complex set of links defining how it relates to these other concepts. (As well as a set of links to memories about cats.) But then, none of those concepts means anything in isolation, either. A "predator" is a specific biological and behavioral class, the members of which hunt other animals for food. Of that definition, "biological" pertains to "biology", which is a "natural science concerned with the study of life and living organisms, including their structure, function, growth, origin, evolution, distribution, and taxonomy". "Behavior", on the other hand, "refers to the actions of an organism, usually in relation to the environment". Of those words... and so on.
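To make this picture concrete, here is a toy sketch of such a network. The concept names and relations below are my own illustrative choices, not a claim about how any real system stores them; the point is simply that every entry bottoms out in further entries rather than in a self-contained definition.

```python
# A toy semantic network: each concept is nothing but a bundle of links
# to other concepts, which are themselves just bundles of links.
# (Illustrative only; the names and relations here are made up.)

network = {
    "cat":      {"is_a": ["feline", "predator"],
                 "has": ["tail", "four legs"],
                 "typically": ["playful"]},
    "predator": {"is_a": ["biological class", "behavioral class"],
                 "does": ["hunt other animals for food"]},
    "feline":   {"is_a": ["mammal"]},
    # ... and so on: no entry ever terminates in a concept that explains itself.
}

def meaning(concept, depth=2):
    """Unfold a concept into the links that give it meaning, to a fixed depth."""
    if depth == 0 or concept not in network:
        return concept
    return {relation: [meaning(c, depth - 1) for c in targets]
            for relation, targets in network[concept].items()}

print(meaning("cat"))
```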
It does not seem likely that humans could preprogram an AI with a ready-made network of concepts. There have been attempts to build knowledge ontologies by hand, but any such attempt is both hopelessly slow and lacking in much of the essential content. Even given a lifetime during which to work and countless assistants, could you ever hope to code everything you knew into a format from which it was possible to employ that knowledge usefully? An even worse problem is that the information would need to be in a format compatible with the AI's own learning algorithms, so that any new information the AI learnt would fit seamlessly into the previously-entered database. It does not seem likely that we can come up with an efficient language of thought that can also be easily translated into a format that is intuitive for humans to work with.
Indeed, there are existing plans for AI systems which make the explicit assumption that the AI's network of knowledge will develop independently as the system learns, and the concepts in this network won't necessarily have an easy mapping to those used in human language. The OpenCog wikibook states that:
Some ConceptNodes and conceptual PredicateNodes or SchemaNodes may correspond with human-language words or phrases like cat, bite, and so forth. This will be the minority case; more such nodes will correspond to parts of human-language concepts or fuzzy collections of human-language concepts. In discussions in this wikibook, however, we will often invoke the unusual case in which Atoms correspond to individual human-language concepts. This is because such examples are the easiest ones to discuss intuitively. The preponderance of named Atoms in the examples in the wikibook implies no similar preponderance of named Atoms in the real OpenCog system. It is merely easier to talk about a hypothetical Atom named "cat" than it is about a hypothetical Atom (internally) named [434]. It is not impossible that an OpenCog system represents "cat" as a single ConceptNode, but it is just as likely that it will represent "cat" as a map composed of many different nodes without any of these having natural names. Each OpenCog system works out for itself, implicitly, which concepts to represent as single Atoms and which in distributed fashion.
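As a schematic illustration of the contrast the quote is drawing (this is not OpenCog's actual API; the class and names here are invented purely for the example):

```python
# Two ways a learning system might end up representing "cat".
# NOT OpenCog's real data structures; purely a schematic sketch.

class Node:
    def __init__(self, name=None):
        self.name = name      # human-readable label, if any
        self.links = []       # links to other nodes

# Case 1: "cat" as a single, conveniently named node.
cat = Node(name="cat")

# Case 2: "cat" as a distributed map -- a cluster of internally numbered,
# unnamed nodes, none of which corresponds to the human word on its own.
cat_map = [Node(name=None) for _ in range(50)]   # e.g. nodes [434], [435], ...
```

Which of the two forms the system settles on is decided implicitly by its own learning dynamics, not by the programmer.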
Designers of Friendly AI seek to build a machine with a clearly-defined goal system, one which is guaranteed to preserve the highly complex values that humans have. But the nature of concepts poses a challenge for this objective. There seems to be no obvious way of programming those highly complex goals into the AI right from the beginning, nor of guaranteeing that any goals thus preprogrammed will not end up being drastically reinterpreted as the system learns. We cannot simply code "safeguard these human values" into the AI's utility function without defining those values in detail, and defining those values in detail requires us to build the AI with an entire knowledge network. On a conceptual level, the decision theory and goal system of an AI may be kept separate from its knowledge base; in practice, such a separation does not seem possible.
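A minimal sketch of the problem, using entirely hypothetical names: even if the utility function is written down once and never touched, its meaning is hostage to whatever definitions the learned knowledge base currently holds.

```python
# Hypothetical illustration: the goal's wording stays fixed while its
# meaning shifts with the learned knowledge base.

knowledge_base = {
    # A learned, and therefore revisable, definition of a concept the goal
    # refers to. If learning revises it, utility() below silently changes
    # meaning even though its code never changes.
    "human_wellbeing": lambda world: world.get("wellbeing_score", 0.0),
}

def utility(world):
    """'Safeguard human values' -- but only as currently defined in the KB."""
    return knowledge_base["human_wellbeing"](world)

print(utility({"wellbeing_score": 0.9, "reported_happiness": 0.2}))  # 0.9

# Further learning reinterprets the concept; the same goal now says something else.
knowledge_base["human_wellbeing"] = lambda world: world.get("reported_happiness", 0.0)
print(utility({"wellbeing_score": 0.9, "reported_happiness": 0.2}))  # 0.2
```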
The goal might not be impossible, though. Humans do seem to be pre-programmed with inclinations towards various complex behaviors, which might suggest pre-programmed concepts to varying degrees. Heterosexuality is considerably more common in the population than homosexuality, though this may have relatively simple causes, such as an inborn preference towards particular body shapes combined with social conditioning. (Disclaimer: I don't really know anything about the biology of sexuality, so I'm speculating wildly here.) Most people also seem to react relatively consistently to different status displays, and people have collected various lists of complex human universals. The exact method of their transmission remains unknown, however, as does the role that culture plays in it. It also bears noting that most so-called "human universals" are actually cultural rather than individual universals. In other words, any given culture might be guaranteed to express them, but there will always be individuals who don't fit the usual norms.
See also: Vladimir Nesov discusses a closely related form of this problem as the "ontology problem".
(Only slightly less briefly)
Human ontologies are complex, redundant, and outright contradictory at times. Not only is some of that content not needed to create an AI; it would be counter-productive to include it.
When an AI comes to model human preferences and the elements of the universe most relevant to fulfilling them, those models need not interfere at all with the implementation of the AI itself; they are just a complex form of data and metadata to keep in mind. When it comes to things that would more fundamentally influence the direct operation of the AI's goals, it can ensure that any alterations do not contradict the old version, or do so only to resolve a discovered contradiction in whatever the sanest way possible is.
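Roughly, something like this guarded update; a sketch only, where contradicts and contradictions_in stand in for whatever consistency check the real system would actually use:

```python
# Sketch of a guarded update to goal-relevant content: accept a change only
# if it is consistent with the old version, or if it reduces the number of
# contradictions already present in the old version. The check functions
# passed in are placeholders, not a real consistency procedure.

def guarded_update(old_goals, proposed_goals, contradicts, contradictions_in):
    if not contradicts(old_goals, proposed_goals):
        return proposed_goals    # non-conflicting change: accept
    if contradictions_in(proposed_goals) < contradictions_in(old_goals):
        return proposed_goals    # conflicts with the old version, but only
                                 # because it resolves a known contradiction
    return old_goals             # otherwise keep the old version
```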
Humans suck compared to superintelligences. They even suck at knowing what they want. I'd rather tell a friendly superintelligence to do what I want it to do than try to program my goals into it. Did I mention that it is smarter than me? It can even emulate me and ask em-me my goals that way if it hasn't got a better option. There is no downside to getting the FAI to do it for me. If it isn't friendly then....
Humans suck at creating ontologies. Less than any other species I know of, but they still suck. I wouldn't include stupid parts in an FAI; that'd make it particularly hard to prove friendly. But it would naturally be able to look at humans and figure out any necessary stupid parts itself.
That is rather dense, I'll admit. But the gist of the reasoning is there.