People want to tell everything instead of telling the best 15 words. They want to learn everything instead of the best 15 words. In this thread, instead post the best 15 words from a book you've read recently (or anything else). It has to stand on its own: it's not a summary; the whole value needs to be contained in those words.
- It doesn't need to cover everything in the book, it's just the best 15 words.
- It doesn't need to be a quote, it's just the best 15 words.
- It doesn't have to be 15 words long, it's just the best "15" words.
- It doesn't have to be precisely true, it's just the best 15 words.
- It doesn't have to be the main 15 words, it just has to be the best 15 words.
- It doesn't have to be the author's 15 words, it just has to be the best 15 words.
- Edit: It shouldn't just be a neat quote; the point of the exercise is the struggle of compressing a book down to 15 words.
I'll start in the comments below.
(Voted by the Schelling study group as the best exercise of the meeting.)
None of the above. I'm just trying to figure out why my intuition says that I do not want to block all negative affect, and whether my intuition is wrong, and your objections are helping me do so. I've got no idea whether we're fundamentally different, or whether one of us is wrong - I'm just verbally playing with the space of ideas with you. The things I'm saying right now are exploratory thoughts and could easily be wrong - the hope is that value comes out of it.
"We" is just a placeholder for humans. I'm making the philosophical claim that negative affect is the real-life, non-theoretical thing that corresponds to the game-theory construct of negative utility, with some small connotative differences.
No, of course not. Here's what I'm suggesting: Thinking about other people's suffering causes the emotion "concern" (a negative emotion), which is in fact "negative utility". If you don't feel concern when faced with the knowledge that someone is in pain, it means that you don't experience "negative utility" in response to other people being in pain. I'm suggesting that the fact that you negatively value other people being in pain is inextricably linked to the emotions you feel when people are in pain. I'm suggesting that if you remove concern (as occurs in real-world sociopathy), you won't have any intrinsic incentive to care about the pain of others anymore.
(Not "you" in particular, but animals in general.)
Basically, when modelling a real-world object as an agent, we should treat whatever mechanism in the neural circuits (or whatever the being is made of) causes it to take action as indicative of "utility". In humans, the neural pattern "concern" causes us to take action when others suffer, so "concern" is negative utility in response to suffering. (This gets confusing when agents don't act in their interests, but if we want to nitpick about things like that, we shouldn't be modelling objects as agents in the first place.)
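To make that concrete, here's a minimal toy sketch in Python of "read off utility from whatever internal signal actually drives action". Everything here (the `Creature` class, the `concern` signal, the numbers and threshold) is invented for illustration, not anyone's actual model:

```python
# Toy model: an outside observer infers "utility" from whatever internal
# signal actually drives the creature's behaviour.

class Creature:
    """A crude agent whose only action-driving signal is 'concern'."""

    def __init__(self, concern_strength):
        self.concern_strength = concern_strength  # how strongly others' suffering registers

    def concern(self, others_suffering):
        # Internal affect: rises with how much suffering the creature perceives.
        return self.concern_strength * others_suffering

    def acts_to_help(self, others_suffering):
        # The creature acts iff the concern signal is strong enough.
        return self.concern(others_suffering) > 0.5


def inferred_utility(creature, others_suffering):
    """An observer modelling the creature as an agent reads the
    action-driving signal as (negative) utility."""
    return -creature.concern(others_suffering)


empath = Creature(concern_strength=1.0)
sociopath = Creature(concern_strength=0.0)  # concern removed

print(empath.acts_to_help(1.0), inferred_utility(empath, 1.0))        # True  -1.0
print(sociopath.acts_to_help(1.0), inferred_utility(sociopath, 1.0))  # False -0.0
```

The point of the sketch: remove the concern signal and both the action and the inferred negative utility disappear together, which is the link being claimed above.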
Here's a question: Do you think we have moral responsibilities to AI? Is it immoral to cause a Friendly AI to experience negative utility by fooling it into thinking bad things are happening and then killing it? I think the answer might be yes - since the FAI shares many human values, I think I consider it a person. It makes sense to treat negative utility for the FAI as analogous to human negative affect.
If it's true that negative affect and negative utility are roughly synonymous, it's impossible to make a being that negatively values torture and doesn't feel bad when seeing torture.
But maybe we can work around this: maybe we can get a being which experiences positive affect from preventing torture, rather than negative affect from not preventing torture. Such a being has an incentive to prevent torture, yet doesn't feel concerned when torture happens. (A toy illustration follows.)
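Here's a small sketch of that workaround: two hypothetical affect schemes that rank the same actions identically while differing in baseline feeling. All numbers are invented for illustration:

```python
# Two hypothetical affect schemes over the same two actions.
actions = ["prevent_torture", "ignore_torture"]

# Scheme A: negative affect (concern) when torture isn't prevented.
affect_concern = {"prevent_torture": 0.0, "ignore_torture": -1.0}

# Scheme B: positive affect (satisfaction) from preventing torture, no concern otherwise.
affect_satisfaction = {"prevent_torture": 1.0, "ignore_torture": 0.0}

# Both schemes pick the same action...
assert max(actions, key=affect_concern.get) == max(actions, key=affect_satisfaction.get)

# ...but the baselines differ: scheme B never dips below zero, so the being
# is incentivised to prevent torture without ever "feeling bad".
print(min(affect_concern.values()), min(affect_satisfaction.values()))  # -1.0 0.0
```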
Either way, though: if this line of thought makes sense, you can't have a human which is constantly experiencing maximum positive affect, because that human would never have an incentive to act at all.
A rational agent makes decisions by imagining a space of hypothetical universes and picking, via its actions, the one it prefers. How should I choose my favorite out of these hypothetical universes? It seems to involve simulating the affective states that I would feel in each universe. But this model breaks down if I put my own brain in these universes, because then I will just pick the universe that maximizes my own affective states. I've got to treat my brain as a black box. Once you start tinkering with the brain, decision theory goes all funny.
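A minimal sketch of that decision procedure, and of how it breaks once "edit my own affect" becomes one of the hypothetical universes (the universe names and numbers are all made up for illustration):

```python
# Hypothetical universes, each with the affect the agent would feel if it
# simulated itself living in that universe.
universes = {
    "friend_is_helped": {"world_goodness": 1.0, "affect_override": None},
    "friend_suffers":   {"world_goodness": -1.0, "affect_override": None},
    # The problematic option: leave the world bad but rewire the brain so the
    # simulated affect is maximal anyway (wireheading).
    "rewire_own_brain": {"world_goodness": -1.0, "affect_override": 10.0},
}

def simulated_affect(universe):
    # The agent's brain treated as a black box: affect tracks world goodness...
    if universe["affect_override"] is None:
        return universe["world_goodness"]
    # ...unless the universe includes tampering with the brain itself.
    return universe["affect_override"]

def choose(options):
    return max(options, key=lambda name: simulated_affect(options[name]))

print(choose({k: v for k, v in universes.items() if k != "rewire_own_brain"}))
# -> "friend_is_helped": the affect heuristic works while the brain is left alone.
print(choose(universes))
# -> "rewire_own_brain": once the brain itself is in the option space, the model
#    picks maximum simulated affect rather than the preferred world.
```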
Edit: Affective states don't have to roughly correspond to utility. If you're a human, positive utility is "good". If you're a paperclipper, positive utility is "paperclippy". It's just that human utility is affective states.
If you alter the affective states, you will alter behavior (and therefore you alter "utility"). This does not mean that the affective state is the thing which you value - it means that for humans the affective state is the hardware that decides what you value.
(again, not you per se. I should probably get out of the habit of using "you").
I agree with this, in general.
This suggests not only that concern implies negative utility, but that only concern implies negative utility and nothing else (or at least nothing relevant) does. Do you mean to suggest that? If so, I disagree utterly. If not, and you...