This is from the friendly AI document:
Unity of will occurs when deixis is eliminated; that is, when speaker-dependent variables are eliminated from cognition. If a human simultaneously suppresses her adversarial attitude, and also suppresses her expectations that the AI will make observer-biased decisions, the result is unity of will. Thinking in the third person is natural to AIs and very hard for humans; thus, the task for a Friendship programmer is to suppress her belief that the AI will think about verself in the first person (and, to a lesser extent, think about herself in the third person).
Actually, thinking in the third person is unnatural to both humans and computers. It's just that writing logic programs in the third person is natural to programmers. Many difficult representational problems, however, become much simpler when you use deictic representations. There's an overview of this literature in the book Deixis in Narrative: A Cognitive Science Perspective (Duchan et al. 1995). For a shorter introduction, see "A Logic of Arbitrary and Indefinite Objects".
Actually this may be a better link.
Part of the problem is that third-person representations have extensional semantics. If Mary Doe represents her knowledge about herself internally as a set of propositions about "Mary Doe", and then meets someone else named Mary Doe, or marries John Deer and changes her name, confusion results: the name no longer picks her out uniquely, or no longer picks her out at all.
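To make the Mary Doe example concrete, here is a minimal sketch in Python. Everything in it is hypothetical and purely illustrative: the class names, the SELF token, and the "hometown" facts are my assumptions, not anything from the Friendly AI document or the deixis literature. It contrasts an agent that files self-knowledge under her own name with one that files it under a deictic "me" token.

```python
# Hypothetical illustration only: all names here are invented for this sketch.

SELF = object()  # deictic token: denotes whoever owns the knowledge base


class ThirdPersonAgent:
    """Stores self-knowledge as propositions keyed by the agent's own name."""

    def __init__(self, name):
        self.name = name
        self.facts = {}  # (name, attribute) -> value

    def learn(self, who, attribute, value):
        self.facts[(who, attribute)] = value

    def about_me(self, attribute):
        # Self-lookup goes through the current name.
        return self.facts.get((self.name, attribute))

    def rename(self, new_name):
        # Old self-knowledge stays filed under the old name.
        self.name = new_name


class DeicticAgent:
    """Stores self-knowledge under SELF; the name is just one more fact."""

    def __init__(self, name):
        self.facts = {(SELF, "name"): name}

    def learn(self, who, attribute, value):
        self.facts[(who, attribute)] = value

    def about_me(self, attribute):
        return self.facts.get((SELF, attribute))

    def rename(self, new_name):
        self.facts[(SELF, "name")] = new_name


# Third-person Mary: meets a namesake, then changes her name.
mary_3p = ThirdPersonAgent("Mary Doe")
mary_3p.learn("Mary Doe", "hometown", "Boston")   # fact about herself
mary_3p.learn("Mary Doe", "hometown", "Denver")   # fact about the other Mary Doe
confused_hometown = mary_3p.about_me("hometown")  # namesake's fact overwrote hers
mary_3p.rename("Mary Deer")
lost_hometown = mary_3p.about_me("hometown")      # nothing filed under the new name

# Deictic Mary: the same events leave her self-knowledge intact.
mary_d = DeicticAgent("Mary Doe")
mary_d.learn(SELF, "hometown", "Boston")
mary_d.learn("Mary Doe", "hometown", "Denver")
mary_d.rename("Mary Deer")
kept_hometown = mary_d.about_me("hometown")
```

In this toy version, the third-person agent first confuses her own hometown with her namesake's and then loses it entirely after the name change, while the deictic agent keeps it. A real system could patch the third-person version with unique internal identifiers, but at that point the identifier is playing the role of the deictic token.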
A more severe problem becomes apparent when you represent beliefs about beliefs. If you ask, "What would agent X do in this situation?", and you represent agent X's beliefs using a third-person representation, you have a lot of...
The Open Thread posted at the beginning of the month has exceeded 500 comments – new Open Thread posts may be made here.
This thread is for the discussion of Less Wrong topics that have not appeared in recent posts. If a discussion gets unwieldy, celebrate by turning it into a top-level post.