In the previous post, I defended the claim that genuine value change is possible and, thus, that a realistic account of human values understands them to be malleable. In this section, I argue that some cases of value change are legitimate while others are illegitimate. In other words, I argue that, at least in principle, something substantial can be said about what types of value change are legitimate vs. illegitimate. Let us call this the 'value change legitimacy' claim (VCL). [1]

To do so, I first explain what I mean by value change legitimacy. Then, I appeal to intuition or common sense by providing a handful of examples that, I expect, most people would not hesitate to accept as legitimate and illegitimate cases of value change, respectively. Finally, I suggest a plausible evaluative criterion for identifying (il)legitimate value change, which provides further rational grounding for the common-sense intuitions invoked earlier, as well as a starting point for developing a comprehensive account of value change legitimacy.

([1]: To clarify the scope of the VCL claim, let me briefly note what I am not claiming. First, I am not claiming that all cases of value change are either legitimate or illegitimate (i.e., that legitimacy/illegitimacy is a collectively exhaustive way to classify cases of value change). Second, I do not mean to exclude the possibility that legitimacy comes in degrees, or that there might exist grey areas with respect to whether a given case is (il)legitimate.)

Clarifying the notion of value change legitimacy

First, and maybe most importantly, value change legitimacy, as I propose it here, is a procedural notion. In other words, in asking about value change legitimacy, I am asking about the way in which a given value change has come about. This is in contrast to asking whether the value change as such is morally good or bad. As we saw in the previous post, the latter question is confusing (and maybe confused) because the value change itself implies a change of the evaluative framework. As a result, it is unclear on what basis the goodness or badness of the value change as such should be evaluated. However, I claim that there still exist morally relevant differences between different cases of value change (other than the moral status of the value change as such), in particular with respect to the procedural question: does the manner in which the value change has come about conform with certain normative standards that make the change acceptable, unproblematic and thus legitimate, or is it objectionable, problematic and thus illegitimate?

(The choice of the word 'legitimacy' might seem confusing or unfortunate to some. While I am happy to hear better suggestions, it seems worth clarifying that I chose this term with reference to how 'legitimacy' is typically used in political philosophy, where it refers to procedural properties of political institutions. In my view, this analogy goes surprisingly deep, but I will have to leave exploring it in more detail for another time.)

Of course, in practice, it may not always be clear whether a specific case of value change is legitimate or not, or how, in general, we ought to decide what counts as legitimate vs. illegitimate. In fact, these questions will be subject to disagreement and rational deliberation. For the purposes of the argument in favour of the Value Change Problem, it suffices for me to establish that there is something substantive to be said about the difference between legitimate and illegitimate cases of value change, i.e. that the difference exists, even if important questions remain about how exactly to draw it.

That said, in the latter part of this post I will put forth a specific proposal as to what we might mean by legitimacy, namely, the degree of self-determination involved in the process of value change. While I do believe value self-determination is a critical aspect of value change legitimacy, I do not think my proposal comes close to a comprehensive account of value change legitimacy, one able to deal satisfactorily with the wide range of practical intricacies that arise. For example, in future work, I am interested in 'stress testing' my current account of legitimacy by looking at cases in, e.g., parenting and education, and using the resulting insights to build on and improve the current, provisional account. As such, I suggest that the proposal put forth below be understood in the spirit of providing a productive starting point, rather than an end point.

The case for value change legitimacy

Argument from intuition/common sense

Having clarified what we mean by legitimacy in the context of value change, let us now explore the case for VCL. 

I will start by describing two examples that, I believe, people will widely agree represent cases of legitimate and illegitimate value change, respectively; that is, I defend VCL by providing an existence proof of sorts.

With this in mind, let us consider the following examples of value change.

First, consider Daniel. Daniel does not currently hold a deep appreciation for jazz. For example, when he recently accompanied his friend, herself an ardent jazz lover, to a concert, he secretly fell asleep for parts of the performance out of boredom. However, Daniel has an inkling that there may be something deeply valuable about jazz that he has not yet come to fully apprehend. This motivates him to spend several weeks attentively listening to jazz music and going to more concerts. While he initially struggles to pay attention, over time his experience of the music starts to change until, eventually, Daniel comes to deeply appreciate jazz, just like his friend does.

On the other hand, consider Elsa. Elsa, too, does not initially have an appreciation of jazz and also comes to love it. In her case, however, the change is the result of Elsa joining a cult which, as a central pillar of its ideology, venerates a love of jazz. The cult makes use of elaborate means of coercive persuasion, involving psychological techniques as well as psychoactive substances, in order to get all of its members to appreciate jazz.

Both of these are cases of value change as characterised earlier. However, as I expect most people would agree, there are morally significant differences between them: while Daniel's case appears (largely) unproblematic and legitimate, Elsa's appears (largely) problematic and illegitimate. To put it another way, it seems to me that we would lose something important if we were to deny that there are morally relevant differences between these cases which are not reducible to the nature of the value change (in this case, the love of jazz). We want to be able to point at Elsa's case of value change and argue that it is problematic and should be prevented, and we want to be able to say that Daniel's case of value change is fine and does not need to be prevented, without in either case basing our argument on whether or not loving jazz is morally acceptable. As such, I argue that the relevant difference we are picking up on here pertains to the legitimacy (or lack thereof) of the value change process (in the sense I've described above).

So far, so good. But, beyond appealing to common sense, can we say anything substantive about what makes these cases different? 

Argument from plausible mechanism 

I suggest that a key difference between Daniel's and Elsa's examples lies in the process by which the value change was brought about: in particular, in the extent to which the process was self-determined by the person undergoing the change, and the extent to which that person remains able to 'course-correct' the unfolding of the process (e.g., slow, halt or redirect it) if she so chooses.

To illustrate this, let's first consider Daniel's case. This case of value change appears unproblematic, a case of legitimate value change, in that the transformational process occurs of Daniel's own free volition, and at any point he could have chosen to disengage from the aspirational process. His friend did not force him to engage with jazz; rather, Daniel held proleptic reasons[2] for engaging more with jazz: an inkling, so to speak, of what would later turn into his full capacity to value jazz. By contrast, Elsa's ability to engage in the unfolding of her value transformation freely and in a self-determined fashion was heavily undermined by the nature of the process. Even if she might have chosen the first (few) interactions with the cult freely, the cult's sophisticated methods of manipulation, indoctrination or brainwashing deliberately exploit Elsa's psychological make-up. As such, the resulting change, independent of what specific beliefs and values it results in, is problematic due to the way it was brought about, and thus supports our intuition that this is a case of illegitimate value change.

To test this idea a bit further, let's consider the case of Finley who, just like Daniel and Elsa, also ends up falling in love with jazz. In Finley's case, they find themselves, as a result of the workings of a content recommender system, consuming a lot of videos about the joys of jazz. Starting out, Finley did not hold any particular evaluative stance towards jazz; a few weeks later, however, they have become obsessed with it, frequenting concerts, reading books on jazz, and so on.

Compared to Daniel's and Elsa's, I think Finley's case is more subtle and ambiguous with respect to value change legitimacy. On the one hand, Finley's process of value change does not involve the same level of active and self-determined engagement as Daniel's. Finley did not (by stipulation in the example) start out with an inkling of the value of jazz, as is characteristic of an aspirational process according to Callard. Rather, they were passively exposed to information which then brought about the change. Furthermore, the recommendation algorithm is arguably shaped more by the economic incentives of the company than by the purpose of exposing Finley to new experiences and perspectives. Finally, content recommendation platforms have some potential to cause compulsive or addictive behaviour in consumers by exploiting the human psychological make-up (e.g., sensitivity to dopamine stimuli). All of these factors can be taken to weaken Finley's ability to reflect on their current level of jazz video consumption and their values, and to 'course-correct' if they wanted to. At the same time, while the recommender platform might be said to have weakened Finley's ability to self-determine and course-correct the process autonomously, this occurred at a very different level than the coercive persuasion experienced by Elsa. As such, this case appears neither clearly legitimate nor clearly illegitimate, but carries some aspects of both.

This goes to show that referring to self-determination alone does not settle all we need to know about what does and does not constitute legitimate value change. As mentioned above, in future work, I am interested in stress testing and building on this preliminary account further.

 

([2]: Callard (2016) defines proleptic reasons as reasons which are based on value estimates that the reasoner cannot fully access yet, even if they might be able to partially glean them, i.e. an "inchoate, anticipatory, and indirect grasp of some good" (2016, p. 132).)

Comments

On the other hand, consider Elsa. Elsa, too, does not initially have an appreciation of jazz and also comes to love it. In her case, however, the change is the result of Elsa joining a cult which, as a central pillar of its ideology, venerates a love of jazz. The cult makes use of elaborate means of coercive persuasion, involving psychological techniques as well as psychoactive substances, in order to get all of its members to appreciate jazz.

I think this is an unfortunate example, as cults are quite ineffectual at retaining people (1% retention rates are good, for a cult!). Addressing the core point, I think people overstate how bad value-shifts are, as we humans implicitly accept them all the time whenever we move to a new social group. In some sense, we hold the values of a group kind of lightly, as a sort of mask. But because inauthentically wearing a mask fares worse under social scrutiny than becoming the mask, we humans will really make our social group's values a deep part of us. And that's fine! But it makes it tricky to disentangle what sorts of value changes are, or are not, legitimate.

(Going on a rant now because I couldn't resist).

And such shifts certainly exist! Like, if you don't think a tamping iron spiking through my skull and causing my personality to radically shift is an illegitimate value change, then I've got some brain surgery I want to try out on you.

Which suggests a class of value-changes that we might think of as illegitimate: shifts caused by a Cartesian-boundary-violating event. If something re-arranges the insides of my skull, that's murder. And if it ain't, it is an illegitimate value shift. If some molecular system slips past my blood-brain barrier and causes my reward centers to light up like a firework, well, that's probably heroin. And it is violating my boundary, which means it is causing an illegitimate value shift. And so on.

But wait! What about drugs like selective serotonin reuptake inhibitors, i.e. SSRIs? Taking them can cause a value shift, but if you deny that it is legitimate, then I hope you never become depressed. So maybe voluntarily taking these things is what matters?

But wait! What if you are unaware of the consequences of taking the medication? For instance, for the average depressed person, it either does nothing or cures their depression. But for you, it gives you schizophrenia, because we're in a world beyond the reach of god. Well then, that sounds like an illegitimate value shift.

So maybe the problem is that the changes are predictably not sanctioned by us ahead of time? Well then, what about something like the Gandhi-murder pill? You can take a pill which (additively) makes you 1% more like a mass murderer but gives you $1 million in exchange. If you take the pill now, you're more likely to take such pills in the future, driving you down a slippery slope to evil. So maybe you, I don't know, make a legal agreement to restrict your future self's actions.
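
To make the compounding worry concrete, here is a minimal toy simulation of that dynamic, a sketch with made-up numbers rather than anyone's actual model: assume each accepted pill adds one point of "murderousness" on a 0-100 scale, and that every point makes accepting the next offer a bit more likely.

```python
import random

# Toy model of the Gandhi-murder-pill slippery slope. All parameters are
# illustrative assumptions: each accepted pill adds 1 point of
# "murderousness" (0-100), and higher murderousness raises the
# probability of accepting the next pill.

def simulate(offers: int = 200, seed: int = 0) -> int:
    """Return final murderousness after a sequence of pill offers."""
    rng = random.Random(seed)
    murderousness = 0  # start with no murderer-likeness at all
    for _ in range(offers):
        # Assumed acceptance curve: probability 0.5 at murderousness 0,
        # rising linearly to 1.0 at murderousness 100.
        p_accept = 0.5 + 0.5 * murderousness / 100
        if rng.random() < p_accept:
            murderousness = min(100, murderousness + 1)
    return murderousness

for seed in range(3):
    print(f"run {seed}: final murderousness = {simulate(seed=seed)}")
```

With these (arbitrary) parameters, runs tend to climb toward the maximum: a tiny per-pill change plus a feedback loop is all the slippery slope needs, which is what makes pre-committing your future self look attractive.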

But then you wind up with your future self disagreeing with your current self about what they're allowed to do, whilst you deliberately and knowingly put yourself into that situation. Is that legitimate? I don't know.
 

This goes to show that referring to self-determination alone does not settle all we need to know about what does and does not constitute legitimate value change. As mentioned above, in future work, I am interested in stress testing and building on this preliminary account further.

I am looking forward to it. I don't think your post updated me, though I didn't read it carefully, but I am glad someone is talking about this. This is a serious problem that we have to solve to deal with alignment, and, I think, to convince (some) people that there are grounds for saying we should try to "align" AI at all. We can simultaneously tolerate a very wide space of values and say that no, going outside of those values is not OK, neither for us nor our descendants. And that such a position is just common sense.

Or maybe you'll find out that no, people who believe that are deluding themselves, in which case I'm eager to hear your arguments.

saying we should try to "align" AI at all. 

What would be the alternative?

We can simultaneously tolerate a very wide space of values and say that no, going outside of those values is not OK, neither for us nor our descendants. And that such a position is just common sense.

Is this the alternative you're proposing? Is this basically saying that there should be ~indifference between many induced value changes, within some bounds of acceptability? I think clarifying the exact bounds of acceptability is quite hard, and anything that's borderline might lead to increased chance of values drifting to "non-acceptable" regions.

Also, common sense has changed dramatically over centuries, so it seems hard to ground these kinds of notions entirely in common sense too.

What would be the alternative?

I'm not quite sure. Some people react to the idea of imbuing AI with some values with horror ("that's slavery!" or "you're forcing the AI to have your values!") and I'm a little empathetic but also befuddled about what else to do. When you make these things, you're implicitly making some choice about how to influence what they value. 

Is this the alternative you're proposing? Is this basically saying that there should be ~indifference between many induced value changes, within some bounds of acceptability? I think clarifying the exact bounds of acceptability is quite hard, and anything that's borderline might lead to increased chance of values drifting to "non-acceptable" regions

No, I was vaguely describing, at a high level, what value-change policy I endorse. As you point out, clarifying those bounds is very hard, and very important.

Likewise, I think "common sense" can change in endorsed ways, but I think we probably have a better handle on that, as correct reasoning is a much more general, and hence simpler, sort of capacity.

As we saw in the previous post, the latter question is confusing (and maybe confused) because the value change itself implies a change of the evaluative framework.

I’m not sure which part of the previous post you’re referring to actually – if you could point me to the relevant section that would be great!

Yes, sorry! I'm not making it super explicit, actually, but the point is that, if you read e.g. Paul's or Callard's accounts of value change (via transformative experiences and via aspiration, respectively), a large part of how they even set up their inquiries is with respect to the question of whether value change is irrational or not (or what problem value change poses to rational agency). The rationality problem comes up because it's unclear from what vantage point one should evaluate the rationality (i.e. the "keeping with what expected utility theory tells you to do") of the (decision to undergo) value change. From the vantage point of your past self, it's irrational; from the vantage point of your new self (be it as parent, vampire or jazz lover), it may be rational.

From what I can tell, Paul's framing of transformative experiences is closer to "yes, transformative experiences are irrational (or a-rational), but they still happen; I guess we have to just accept that as a 'glitch' in humans as rational agents"; while Callard's core contribution (in my eyes) is her case for why aspiration is a rational process of value development.

We want to be able to point at Elsa's case of value change and argue that it is problematic and should be prevented, and we want to be able to say that Daniel's case of value change is fine and does not need to be prevented, without in either case basing our argument on whether or not loving jazz is morally acceptable. As such, I argue that the relevant difference we are picking up on here pertains to the legitimacy (or lack thereof) of the value change process (in the sense I've described above).

Is it really the relevant difference?

I think that there could be cases of acceptable illegitimate value change; that is, if both current-I and I-as-in-CEV (in the future, knowing more, etc.) would endorse the change, but it were done without a way to course-correct it. Metaphor: imagine you had to walk over a hanging bridge, such that you couldn't stop in the middle without risking injury.

So, in my opinion, legitimacy can be based on the nature of the value change only, but acceptability is also based on the opinion of the person in question.

Yeah, interesting point. I do see the pull of the argument. In particular, the example seems well chosen, where the general form seems to be something like: we can think of cases where our agent can be said to be better off (according to some reasonable standards/from some reasonable vantage point) if the agent can make themselves committed to continue doing a thing/undergoing a change for at least a certain amount of time.

That said, I think there are also some problems with it. For example, I'm wary of reifying "I-as-in-CEV" more than is warranted. For one, I don't know whether there is a single coherent "I-as-in-CEV" or whether there could be several; for two, how should I apply this argument practically speaking, given that I don't know what "I-as-in-CEV" would consider acceptable?

I think there is some sense in which proposing to use legitimacy as a criterion has a flavour of "limited ambition": using it will in fact mean that you will sometimes miss out on making value changes that would have been "good/acceptable" from various vantage points (e.g. legitimacy would say NO to pressing the button that would magically make everyone in the world peaceful/against war (unless the button involves some sophisticated process that allows everyone involved to legitimately back out)). At the same time, I am wary that we cannot give up on legitimacy without risking much worse fates, and as such, I currently feel fairly compelled to opt for legitimacy from an intertemporal perspective.