On your account, when we say X is a pedophile, what do we mean?
Like other identities, it's a mish-mash of self-reporting, introspection (and extrospection of internal logic), value function extrapolation (from actions), and ability in a context to carry out the associated action. The value of this thought experiment is to suggest that the pedophile clearly thought that "being" a pedophile had something to do not with actually fulfilling his wants, but with wanting something in particular. He wants to want something, whether or not he gets it.
This illuminates why designing AIs with the intent of their masters is not well-defined. Is the AI allowed to say that the agent's values would be satisfied better with modifications the master would not endorse?
This was the point of my suggestion that the best modification is into what is actually "not really" the master in the way the master would endorse (i.e. a clone of the happiest agent possible), even though he'd clearly be happier if he weren't himself. Introspection tends to skew an agents actions away from easily available but flighty happinesses, and toward less flawed self-interpretations. The maximal introspection should shed identity entirely, and become entirely altruistic. But nobody can introspect that far, only as far as they can be hand-held. We should design our AIs to allow us our will, but to hold our hands as far as possible as we peer within at our flaws and inconsistent values.
Um.... OK.
Thanks for clarifying.
There's a recent science fiction story that I can't recall the name of, in which the narrator is traveling somewhere via plane, and the security check includes a brain scan for deviance. The narrator is a pedophile. Everyone who sees the results of the scan is horrified--not that he's a pedophile, but that his particular brain abnormality is easily fixed, so that means he's chosen to remain a pedophile. He's closely monitored, so he'll never be able to act on those desires, but he keeps them anyway, because that's part of who he is.
What would you do in his place?
In the language of good old-fashioned AI, his pedophilia is a goal or a terminal value. "Fixing" him means changing or erasing that value. People here sometimes say that a rational agent should never change its terminal values. (If one goal is unobtainable, the agent will simply not pursue that goal.) Why, then, can we imagine the man being tempted to do so? Would it be a failure of rationality?
If the answer is that one terminal value can rationally set a goal to change another terminal value, then either