So, a terminological caveat first: I've argued elsewhere that in practice all values are instrumental, and exist in a mutually reinforcing network, and we simply label as "terminal values" those values we don't want to (or don't have sufficient awareness to) decompose further. So, in effect I agree with #2, except that I'm happy to go on calling them "terminal values" and say they don't exist, and refer to the real things as "values" (which depend to varying degrees on other values).
But, that being said, I'll keep using the phrase "terminal values" in its more conventional sense, which I mean approximately rather than categorically (that is, a "terminal value" to my mind is simply a value whose dependence on other values is relatively tenuous; an "instrumental value" is one whose dependence on other values is relatively strong, and the line between them is fuzzy and ultimately arbitrary but not meaningless).
All that aside... I don't really see what's interesting about this example.
So, OK, X is a pedophile. Which is to say, X terminally values having sex with children. And the question is, is it rational for X to choose to be "fixed", and if so, what does that imply about terminal values of rational agents?
Well, we have asserted that X is in a situation where X does not get to have sex with children. So whether X is "fixed" or not, X's terminal values are not being satisfied, and won't be satisfied. To say that differently, the expected value of both courses of action (fixed or not-fixed), expressed in units of expected moments-of-sex-with-children, is effectively equal (more specifically, they are both approximately zero).(1)
So the rational thing to do is choose a course of action based on other values.
What other values? Well, the example doesn't really say. We don't know much about this guy. But... for example, you've also posited that he doesn't get fixed because pedophilia is "part of who he is". I could take that to mean he not only values (V1) having sex with children, he also values (V2) being a pedophile. And he values this "terminally", in the sense that he doesn't just want to remain a pedophile in order to have more sex with children, he wants to remain a pedophile even if he doesn't get to have sex with children.
If I understand it that way, then yes, he's being perfectly rational to refuse being fixed. (Supposing that V2 outweighs the sum of whatever other values V3...Vn favor being fixed, of course.)
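That comparison can be sketched as a toy expected-utility calculation. All the weights and probabilities below are hypothetical stand-ins, not anything asserted in the example:

```python
# Toy expected-utility comparison for the "fixed vs. not-fixed" choice.
# V1 = having sex with children, V2 = being a pedophile, V3..Vn = whatever
# other values favor being fixed. All numeric weights are made up.

def expected_utility(v1_weight, p_satisfy_v1, v2_weight, keeps_v2, other_values):
    """Chance-weighted V1 payoff, plus V2 payoff if retained,
    plus the sum of the other values this choice serves."""
    return (v1_weight * p_satisfy_v1
            + (v2_weight if keeps_v2 else 0.0)
            + sum(other_values))

# Monitoring drives P(satisfying V1) to ~0 under EITHER choice,
# so V1 contributes nothing to the comparison (the footnote's point).
u_not_fixed = expected_utility(v1_weight=100.0, p_satisfy_v1=0.0,
                               v2_weight=10.0, keeps_v2=True, other_values=[])
u_fixed = expected_utility(v1_weight=100.0, p_satisfy_v1=0.0,
                           v2_weight=10.0, keeps_v2=False, other_values=[3.0, 4.0])

# Refusing the fix is rational exactly when V2 > sum(V3..Vn).
assert (u_not_fixed > u_fixed) == (10.0 > 3.0 + 4.0)
```

With these made-up numbers V2 (10) exceeds the other values' sum (7), so refusing the fix comes out ahead; flip the weights and the conclusion flips, which is the whole point: the choice turns entirely on the values other than V1.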
Alternatively, I could take that as just a way of talking, and assert that really, he just has V1 and not V2.
The difficulty here is that we don't have any reliable way, with the data you've provided, of determining whether X is rationally pursuing a valued goal (in which case we can infer his values from his actions) or whether X is behaving irrationally.
(1) Of course, this assumes that the procedure to fix him has a negligible chance of failure, that his chances of escaping monitoring and finding a child to have sex with are negligible, etc. We could construct a more complicated example that doesn't assume these things, but I think it amounts to the same thing.
What a compelling and flexible perspective. A relativistic mental architecture solves many conceptual problems.
I wonder why this comment is further down than when I'm not logged in.
There's a recent science fiction story that I can't recall the name of, in which the narrator is traveling somewhere via plane, and the security check includes a brain scan for deviance. The narrator is a pedophile. Everyone who sees the results of the scan is horrified--not that he's a pedophile, but that his particular brain abnormality is easily fixed, so that means he's chosen to remain a pedophile. He's closely monitored, so he'll never be able to act on those desires, but he keeps them anyway, because that's part of who he is.
What would you do in his place?
In the language of good old-fashioned AI, his pedophilia is a goal or a terminal value. "Fixing" him means changing or erasing that value. People here sometimes say that a rational agent should never change its terminal values. (If one goal is unobtainable, the agent will simply not pursue that goal.) Why, then, can we imagine the man being tempted to do so? Would it be a failure of rationality?
If the answer is that one terminal value can rationally set a goal to change another terminal value, then either