RichardKennaway comments on Open Thread June 2010, Part 4 - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
A recent comment about Descartes inspired this thought: the simplest possible utility function for an agent is one that values only the survival of its own mind, as in "I think, therefore I am". Such a function also seems immune to the wireheading problem, because it optimizes something directly perceivable by the agent rather than some proxy indicator.
But when I started thinking about an AI with this utility function, I became very confused. How exactly do you express this concept of "me" in the code of a utility-maximizing agent? The problem sounds easy enough: it doesn't involve any mystical human qualities like "consciousness"; it's purely a question of programming. Yet it still looks quite impossible to solve. Any thoughts?
An agent's "me" is its model of itself. This is already a fairly complicated thing for an agent to have, and it need not have one.
Why do you say that an agent can "directly perceive" its own mind? Or anything else? A perception is just a signal somewhere inside the agent: a voltage, a train of neural firings, or whatever. It can never be identical to the thing that caused it, the thing that it is a perception of. People can very easily have mistaken ideas of who they are.
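As a companion sketch (again, hypothetical names only): whatever the utility function evaluates is a signal inside the agent, not the external fact that caused it, so "directly perceiving my own survival" still bottoms out in a proxy that something other than actual survival could produce.

```python
class StillThinkingSensor:
    """Whatever produces the 'I am still thinking' signal inside the agent."""
    def __init__(self):
        self.reading = 1  # a bit in memory, a voltage, a pattern of firings

def utility(sensor: StillThinkingSensor) -> float:
    # The utility function can only inspect the signal; it has no independent
    # access to the fact the signal is supposed to report.
    return float(sensor.reading)

genuine = StillThinkingSensor()   # signal produced because the mind really is running
spoofed = StillThinkingSensor()
spoofed.reading = 1               # the same signal produced by any other cause

print(utility(genuine) == utility(spoofed))  # True: indistinguishable from inside
```

From the inside, a mind that survives and a register that merely says so are the same input to the utility function, which is why the wireheading problem does not go away.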