As far as I am aware, people only resist changing their preferences because they don't fully understand the basis and value of their preferences and because they often have a confused idea of the relationship between preferences and personality.
Generally you should define your basic goals and change your preferences to meet them, if possible. You should also consider whether all your basic goals are optimal, and be ready to change them.
Yes, that’s the approach. The problem for me is that I don’t know how to justify resisting an intervention that would change my preferences, if the intervention also changes the meta-preferences that apply to those preferences.
When I read the discussions here on AI self-modification, I think: why should the AI try to make its future self follow its past preferences? It could maximize its future utility function much more easily by self-modifying such that its utility function is maximized in all circumstances. It seems to me that timeless decision theory advocates doing this, if the goal is to maximize the utility function.
I don’t fully understand my preferences, and I know there are inconsistencies, including acceptable ones like changes in what food I feel like eating today. If you have advice on how to understand the basis and value of my preferences, I’d appreciate hearing it.
I think you may be assuming that the person modifying your preferences is doing so both 'magically' and without reason.
I’m assuming there aren’t any side effects that would make me resist based on the process itself, so we can say that’s “magical”. Let’s say they’re doing it without reason, or for a reason I don’t care about, but they credibly tell me that they won’t change anything else for the rest of my life. Does that make a difference?
Of course, another issue may be that we are using 'preference' in different ways. You might find the act of killing puppies emotionally distasteful even if you know that it is necessary. It is an interesting question whether we should work to change our preferences to enjoy things like taking out the trash, changing diapers, and killing puppies.
I’m defining preference as something I have a positive or negative emotional reaction to. I sometimes equivocate between that and what I think my preferences should be, because I’m trying to convince myself that those are my true preferences. The idea of killing puppies was just an example of something that’s against my current preferences. Another example is “we will modify you from liking the taste of carrots to liking the taste of this other vegetable that tastes different but is otherwise identical to carrots in every important way.” This one doesn’t have any meta-preferences that apply.
I see that this conversation is in danger of splitting into different directions. Rather than make multiple different reply posts or one confusing essay, I am going to drop the discussion of AI, because that is discussed in a lot of detail elsewhere by people who know a lot more than I.
meta-preferences
We are using two different models here, and while I suspect that they are compatible, I'm going to outline mine so that you can tell me if I'm missing the point.
I don't use the term meta-preferences, because I think of all wants/preferences/rules/and gen...
If it's worth saying, but not worth its own post (even in Discussion), then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should be posted in Discussion, and not Main.
4. Open Threads should start on Monday, and end on Sunday.