Example of somebody making that claim.
That's a 'circular' link to your own comment.
It was totally really hard, I had to use a quine.
It seems to me a rational agent should never change its self-consistent terminal values. To act out that change would be to act according to some other value and not the terminal values in question.
It might decide to do that - if it meets another powerful agent, and it is part of the deal they strike.
Is it not part of the agent's (terminal) value function to cooperate with agents when doing so provides benefits? Do the expected benefits materialize from nowhere, or do they exist within some value function?
My claim entails that the agent's preference ordering of world states consists mostly in instrumental values. If an agent's value...
There's a recent science fiction story that I can't recall the name of, in which the narrator is traveling somewhere via plane, and the security check includes a brain scan for deviance. The narrator is a pedophile. Everyone who sees the results of the scan is horrified, not because he's a pedophile, but because his particular brain abnormality is easily fixed, which means he has chosen to remain one. He's closely monitored, so he'll never be able to act on those desires, but he keeps them anyway, because that's part of who he is.
What would you do in his place?
In the language of good old-fashioned AI, his pedophilia is a goal or a terminal value. "Fixing" him means changing or erasing that value. People here sometimes say that a rational agent should never change its terminal values. (If one goal is unobtainable, the agent will simply not pursue that goal.) Why, then, can we imagine the man being tempted to do so? Would it be a failure of rationality?
If the answer is that one terminal value can rationally set a goal to change another terminal value, then either