Comment author: Alerus 06 May 2012 06:02:35PM *  0 points [-]

So it seems to me that the solution is to use an expected utility function rather than a fixed utility function. Let's speak abstractly for the moment, and consider the space of all relevant utility functions (that is, all utility functions that would change the utility evaluation of an action). At each time step, we now associate a probability of you transitioning from your current utility function to any of these other utility functions. For any given future state, then, we can compute the expected utility. When you run your optimization algorithm to determine your action, what you therefore do is try to maximize the expected utility, not the current utility function. So the key is going to be assigning estimates to the probability of switching to any other utility function. Doing this in an entirely complete way is difficult, I'm sure, but my guess is that you can come to reasonable estimates that make the reasoning possible.
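The scheme above can be sketched concretely. This is a toy illustration only: the two candidate utility functions, the transition probabilities, and the world-state keys are all invented for the example. The agent weights each candidate utility function by the assumed probability of holding it when the future state is evaluated:

```python
# Hypothetical sketch: two candidate utility functions the agent might
# hold at the future time step.
def u_paperclips(state):
    return state["paperclips"]

def u_smileys(state):
    return state["smileys"]

# Assumed probability of holding each utility function at evaluation time.
transition_probs = {u_paperclips: 0.7, u_smileys: 0.3}

def expected_utility(state):
    """Expected utility of a future state, weighted by the probability
    of each utility function being in force when it is evaluated."""
    return sum(p * u(state) for u, p in transition_probs.items())

def best_action(actions, outcome):
    """Pick the action whose predicted outcome maximizes expected utility."""
    return max(actions, key=lambda a: expected_utility(outcome(a)))
```

As the comment notes, the hard part is not this sum but estimating `transition_probs` in the first place.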

Comment author: momothefiddler 07 May 2012 12:48:36AM 0 points [-]

I like this idea, but I would also, it seems, need to consider the (probabilistic) length of time each utility function would last.

That doesn't change your basic point, though, which seems reasonable.

The one question I have is this: In cases where I can choose whether or not to change my utility function - cases where I can choose to an extent the probability of a configuration appearing - couldn't I maximize expected utility by arranging for my most-likely utility function at any given time to match the most-likely universe at that time? It seems that would make life utterly pointless, but I don't have a rational basis for that - it's just a reflexive emotional response to the suggestion.

Comment author: TheOtherDave 06 May 2012 03:51:20PM 1 point [-]

I agree that if two things are indistinguishable in principle, it makes sense to use the same label for both.

It is not nearly as clear to me that "what makes me happy" and "what makes the world better" are indistinguishable sets as it seems to be to you, so I am not as comfortable using the same label for both sets as you seem to be.

You may be right that we don't use "happiness" to refer to the same things. I'm not really sure how to explore that further; what I use "happiness" to refer to is an experiential state I don't know how to convey more precisely without in effect simply listing synonyms. (And we're getting perilously close to "what if what I call 'red' is what you call 'green'?" territory, here.)

Comment author: momothefiddler 07 May 2012 12:37:13AM 0 points [-]

Without a much more precise way of describing patterns of neuron-fire, I don't think either of us can describe happiness more than we have so far. Having discussed the reactions in-depth, though, I think we can reasonably conclude that, whatever they are, they're not the same, which answers at least part of my initial question.

Thanks!

Comment author: TheOtherDave 06 May 2012 02:33:36PM 0 points [-]

What I mean by "sincerely" is just that I'm not lying when I assert it.
And, yes, this presumes that X isn't changing F.
I wasn't trying to be sneaky; my intention was simply to confirm that you believe F(Wa+X)>F(Wa) implies F(O(Wa+X))>F(O(Wa)), and that I hadn't misunderstood something.
And, further, to confirm that you believe that if F(W) gives the utility of a world-state for some evaluator, then F(O(W)) gives the degree to which that world-state makes that evaluator happy. Or, said more concisely: that H(W) == F(O(W)) for a given observer.

Hm.

So, I agree broadly that F(Wa+X)>F(Wa) implies F(O(Wa+X))>F(O(Wa)). (Although a caveat: it's certainly possible to come up with combinations of F() and O() for which it isn't true, so this is more of an evidentiary implication than a logical one. But I think that's beside our purpose here.)

H(W) = F(O(W)), though, seems entirely unjustified to me. I mean, it might be true, sure, just as it might be true that F(O(W)) is necessarily equal to various other things. But I see no reason to believe it; it feels to me like an assertion pulled out of thin air.

Of course, I can't really have any counterevidence, the way the claim is structured.

I mean, I've certainly had the experience of changing my mind about whether X makes the world better, even though observing X continues to make me equally happy -- that is, the experience of having F(Wa+X) - F(Wa) change while H(Wa+X) - H(Wa) stays the same -- which suggests to me that F() and H() are different functions... but you would presumably just say that I'm mistaken about one or both of those things. Which is certainly possible; I am far from incorrigible about what makes me happy, and I don't entirely understand what I believe makes the world better.

I think I have to leave it there. You are asserting an identity that seems unjustified to me, and I have no compelling reason to believe that it's true, but also no definitive grounds for declaring it false.

Comment author: momothefiddler 06 May 2012 02:54:30PM *  0 points [-]

I believe you to be sincere when you say

I've certainly had the experience of changing my mind about whether X makes the world better, even though observing X continues to make me equally happy -- that is, the experience of having F(Wa+X) - F(Wa) change while H(Wa+X) - H(Wa) stays the same

but I can't imagine experiencing that. If the utility of a world-state goes down, it seems my happiness from perceiving that world-state must necessarily go down as well. This discrepancy causes me to believe there is a low-level difference between what you consider happiness and what I consider happiness, but I can't explain mine any further than I already have.

I don't know how else to say it, but I don't feel I'm actually making that assertion. I'm just saying: "By my understanding of hedony=H(x), awareness=O(x), and utility=F(x), I don't see any possible situation where H(W) =/= F(O(W)). If they're indistinguishable, wouldn't it make sense to say they're the same thing?"

Edit: formatting

Comment author: Luke_A_Somers 06 May 2012 01:37:11PM 0 points [-]

Depends on a few things: Can you make the clones anencephalic, so you become neutral with respect to them? If you kill yourself, will someone else be conditioned in your place?

Comment author: momothefiddler 06 May 2012 02:44:44PM 0 points [-]

Well, I'm not sure making the clones anencephalic would make eating them truly neutral. I'd have to examine that more.

The linked situation proposes that the babies are in no way conscious and that all humans are conditioned, such that killing myself will actually result in fewer people happily eating babies.

Comment author: Manfred 06 May 2012 08:12:24AM 0 points [-]

Okay. If you built a paperclip maximizer, told the paperclip maximizer that you would probably change its utility function in a year or two, and offered it this choice, what would it do?

Comment author: momothefiddler 06 May 2012 01:23:41PM 0 points [-]

Refuse the option and turn me into paperclips before I could change it.

Apparently my acceptance that utility-function-changes can be positive is included in my current utility function. How can that be, though? While, according to my current utility function, all previous utility functions were insufficient, surely no future one could map more strongly onto my utility function than itself. Yet I feel that, after all these times, I should be aware that my utility function is not the ideal one...

Except that "ideal utility function" is meaningless! There is no overarching value scale for utility functions. So why do I have the odd idea that a utility function that changes without my understanding of why (a sum of many small experiences) is positive, while a utility function that changes with my understanding (an alien force) is negative?

There has to be an inconsistency here somewhere, but I don't know where. If I treat my future selves like I feel I'm supposed to treat other people, then I negatively-value claiming my utility function over theirs. If person X honestly enjoys steak, I have no basis for claiming my utility function overrides theirs and forcing them to eat sushi. On a large scale, it seems, I maximize for utilons according to each person. Let's see:

If I could give a piece of cake to a person who liked cake or to a person who didn't like cake, I'd give it to the former.

If I could give a piece of cake to a person who liked cake and was in a position to enjoy it or to a person who liked cake but was about to die in the next half-second, I'd give it to the former.

If I could give a piece of cake to a person who liked cake and had time to enjoy the whole piece or to a person who liked cake but would only enjoy the first two bites before having to run to an important event and leaving the cake behind to go stale, I'd give it to the former.

If I could (give a piece of cake to a person who didn't like cake) or (change the person to like cake and then give them a piece of cake), I should be able to say "I'd choose the latter" to be consistent, but the anticipation still results in consternation. Similarly, if cake was going to be given and I could change the recipient to like cake or not, I should be able to say "I choose the latter", but that is similarly distressing. If my future self was going to receive a piece of cake and I could change it/me to enjoy cake or not, consistency would dictate that I do so.

It appears, then, that the best thing to do would be to make some set of changes in reality and in utility functions (which, yes, are part of reality) such that everyone most-values exactly what happens. If the paperclip maximizer isn't going to get a universe of paperclips and is instead going to get a universe of smiley faces, my utility function seems to dictate that, regardless of the paperclip maximizer's choice, I change the paperclip maximizer (and everyone else) into a smiley face maximizer. It feels wrong, but that's where I get if I shut up and multiply.

Comment author: TheOtherDave 06 May 2012 05:55:31AM *  1 point [-]

(nods) Cool.

As for your proposed definition of happiness... hm.

I have to admit, I'm never exactly sure what people are talking about when they talk about their utility functions. Certainly, if I have a utility function, I don't know what it is. But I understand it to mean, roughly, that when comparing hypothetical states of the world Wa and Wb, I perform some computation F(W) on each state such that if F(Wa) > F(Wb), then I consider Wa more valuable than Wb.

Is that close enough to what you mean here?

And you are asserting, definitionally, that if that's true I should also expect that, if I'm fully aware of all the details of Wa and Wb, I will be happier in Wa.

Another way of saying this is that if O(W) is the reality that I would perceive in a world W, then my happiness in Wa is F(O(Wa)). It simply cannot be the case, on this view, that I consider a proposed state-change in the world to be an improvement, without also being such that I would be made happier by becoming aware of that state-change actually occurring.

Am I understanding you correctly so far?

Further, if I sincerely assert about some state change that I believe it makes the world better, but it makes me less happy, it follows that I'm simply mistaken about my own internal state... either I don't actually believe it makes the world better, or it doesn't actually make me less happy, or both.

Did I get that right? Or are you making the stronger claim that I cannot in point of fact ever sincerely assert something like that?

Comment author: momothefiddler 06 May 2012 12:54:10PM 0 points [-]

I understand it to mean, roughly, that when comparing hypothetical states of the world Wa and Wb, I perform some computation F(W) on each state such that if F(Wa) > F(Wb), then I consider Wa more valuable than Wb.

That's precisely what I mean.

Another way of saying this is that if O(W) is the reality that I would perceive in a world W, then my happiness in Wa is F(O(Wa)). It simply cannot be the case, on this view, that I consider a proposed state-change in the world to be an improvement, without also being such that I would be made happier by becoming aware of that state-change actually occurring.

Yes

Further, if I sincerely assert about some state change that I believe it makes the world better, but it makes me less happy, it follows that I'm simply mistaken about my own internal state... either I don't actually believe it makes the world better, or it doesn't actually make me less happy, or both. Did I get that right? Or are you making the stronger claim that I cannot in point of fact ever sincerely assert something like that?

Hm. I'm not sure what you mean by "sincerely", if those are different. I would say if you claimed "X would make the universe better" and also "Being aware of X would make me less happy", one of those statements must be wrong. I think it requires some inconsistency to claim F(Wa+X)>F(Wa) but F(O(Wa+X))<F(O(Wa)) - I changed the notation slightly, let me know if that doesn't make sense. Although! If X includes a change to F, I must additionally stipulate that the Fs must match - it's perfectly valid to say F1(Wa+X)<F1(Wa) but F2(O(Wa+X))>F2(O(Wa)), which is relatively common (Pascal's Wager comes to mind).
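The role O plays in this notation can be made concrete with a toy example (F, O, and the world-state keys here are all invented for illustration): with a fixed F, a change X that improves the world but falls outside what the observer perceives leaves F(O(W)) unchanged, which is exactly the gap between "better world" and "happier observer" being discussed:

```python
# Hypothetical world-states as dicts; O models partial awareness by
# keeping only the facts the observer can perceive.
def F(world):
    return world.get("good_done", 0) - world.get("suffering", 0)

def O(world, visible=("suffering",)):
    return {k: v for k, v in world.items() if k in visible}

Wa = {"good_done": 1, "suffering": 2}
WaX = {"good_done": 5, "suffering": 2}  # X: good is done, unobserved

# X improves the world...
assert F(WaX) > F(Wa)
# ...but the observed slices are identical, so perceived utility
# (and hence happiness, on the H(W) = F(O(W)) view) is unchanged.
assert F(O(WaX)) == F(O(Wa))
```

On these definitions a strict reversal, F(Wa+X)>F(Wa) but F(O(Wa+X))<F(O(Wa)), would need O to distort rather than merely omit information, which matches the claim that asserting both statements with a single fixed F requires some inconsistency.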

Comment author: TheOtherDave 06 May 2012 03:53:21AM 0 points [-]

Cool. I thought it was confusing you earlier, but perhaps I misunderstood.

Comment author: momothefiddler 06 May 2012 04:00:58AM 0 points [-]

It was confusing me, yes. I considered hedons exactly equivalent to utilons.

Then you made your excellent case, and now it no longer confuses me. I revised my definition of happiness from "reality matching the utility function" to "my perception of reality matching the utility function" - which it should have been from the beginning, in retrospect.

I'd still like to know if people see happiness as something other than my new definition, but you have helped me from confusion to non-confusion, at least regarding the presence of a distinction, if not the exact nature thereof.

Comment author: Dorikka 06 May 2012 03:41:12AM 1 point [-]

That depends: how much do you (currently) value the happiness of your future self versus the life-experience of the expected number of babies you're going to kill? If possible, it would probably be optimal to take measures that would both make your future self happy and not-kill babies, but if not, the above question should help you make your decision.

Comment author: momothefiddler 06 May 2012 03:54:05AM 0 points [-]

Well, the situation I was referencing assumed baby-eating without the actual sentience at any point of the babies, but that's not relevant to the actual situation. You're saying that my expected future utility functions, in the end, are just more values in my current function?

I can accept that.

The problem now is that I can't tell what those values are. It seems there's a number N large enough that if N people were to be reconfigured to heavily value a situation, and the situation was then to be implemented, I'd accept the reconfiguration. This was counterintuitive and, out of habit, still feels like it should be, but makes a surprising amount of sense.

Comment author: TheOtherDave 06 May 2012 03:30:34AM 1 point [-]

I'm not sure how "hedons" interact with "utilons".
I'm not saying anything at all about how they interact.
I'm merely saying that they aren't the same thing.

Comment author: momothefiddler 06 May 2012 03:45:15AM 0 points [-]

Oh! I didn't catch that at all. I apologize.

You've made an excellent case for them not being the same. I agree.

Comment author: TheOtherDave 06 May 2012 03:20:32AM 0 points [-]

I'm saying that something can make the world better without affecting me, but nothing can make me happier without affecting me. That suggests to me that the set of things that can make the world better is different from the set of things that can make me happy, even if they overlap significantly.

Comment author: momothefiddler 06 May 2012 03:26:29AM 0 points [-]

That makes sense. I had only looked at the difference within "things that affect my choices", which is not a full representation of things. Could I reasonably say, then, that hedons are the intersection of "utilons" and "things of which I'm aware", or is there more to it?

Another way of phrasing what I think you're saying: "Utilons are where the utility function intersects with the territory, hedons are where the utility function intersects with the map."
