Ideally, a utility function would be a rational, perfect, constant entity that accounted for all possible variables, but mine certainly isn't. In fact, I'd feel quite comfortable claiming that no human's is, at the time of writing.
When confronted with the fact that my utility function is non-ideal or - since there's no universal ideal to compare it to - internally inconsistent, I do my best to figure out what to change and do so. The problem with a non-constant utility function, though, is that it makes maximising total utility hard. For instance, I am willing to undergo -50 units of utility today in return for +1 utility on each following day indefinitely. What if I accept the -50, but my utility function changes tomorrow such that I now consider the daily change to be neutral, or worse, negative?
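To make the worry concrete, here's a minimal sketch in Python, with numbers I'm making up and an arbitrary one-year horizon standing in for "indefinitely":

```python
# Toy model of the -50-today-for-+1-per-day trade, summed over a finite horizon.
# All numbers are illustrative; the only point is that the sign of the outcome
# depends on which utility function does the evaluating.

def total_utility(upfront_cost, daily_value, days):
    """Upfront cost plus the per-day payoff accumulated over `days` days."""
    return upfront_cost + daily_value * days

horizon = 365  # arbitrary one-year horizon

# Scenario 1: my utility function stays fixed at +1 per day.
print(total_utility(-50, 1.0, horizon))   # 315.0 -- the trade pays off

# Scenario 2: tomorrow's function rates the daily change at 0, or even negative.
print(total_utility(-50, 0.0, horizon))   # -50.0 -- I paid the cost for nothing
print(total_utility(-50, -0.5, horizon))  # -232.5 -- actively regretted
```

The trade itself never changes; the only thing that changes is which function gets to score it.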
Just as plausible is the idea that I might be offered a trade that, while not of positive utility according to my function now, will be according to a future function. Just as I would think it a good investment to buy gold if I expected the price to go up, but a bad one if I expected the price to go down, so I have to base my long-term utility trades on what I expect my future functions to be. (Not that dollars don't correlate with units of utility, just that they don't correlate strongly.)
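If I try to fold that uncertainty in, the natural move is to weight the scenarios by how likely I think each future function is. A toy version, again with numbers I'm inventing on the spot:

```python
# If there's some chance my future function rates the daily change at zero,
# the trade's expected value (by today's lights, over a one-year horizon) is a
# probability-weighted mix -- much like pricing gold by its expected future price.
p_shift = 0.3                        # invented probability that my function shifts
value_if_stable = -50 + 1.0 * 365    # +315
value_if_shifted = -50 + 0.0 * 365   # -50
expected_value = (1 - p_shift) * value_if_stable + p_shift * value_if_shifted
print(expected_value)                # 205.5 with these made-up numbers
```

Of course, this just pushes the problem back a step: the probability of the shift, and the value assigned to each scenario, are still being estimated by the function I have now.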
How can I know what I will want to do, much less what I will want to have done? If I obtain the outcome I prefer now, but spend more time not preferring it, does that make it a negative choice? Is it a reasonable decision, in order to maximise utility, to purposefully change your definition of utility such that your expected future would maximise it?
What brings this all to mind is a choice I have to make soon. Technically, I've already made it, but I'm now uncertain of that choice and it has to be made final soon. This fall I transfer from my community college to a university, where I will focus a significant amount of energy studying Something 1 in order to become trained (and certified) to do Something 2 for a long period of time. I had thought until today that it was reasonable for Something 1 to be math and Something 2 to be teaching math. I enjoy the beauty of mathematics: I love how things fit together, barely anything can excite me as much as the definition of a derivative and its meaning, and I've shown myself to be rather good at it (which, to be fair, is by comparison to those around me, so I don't know how I'd fare in a larger or more specialized pool). In addition, I've spent some time as a tutor; I seem to be good at explaining mathematics to other people, and I enjoy seeing their faces light up as they see how things fit together.
Today, though, I don't know if that's really a wise decision. I was rereading Eliezer's paper on AI in Global Risk and was struck by a line: "If we want people who can make progress on Friendly AI, then they have to start training themselves, full-time, years before they are urgently needed." It occurred to me that I think FAI is possible and that I expect some sort of AI within my lifetime (though I don't expect that to be short). Perhaps I'd be happier studying topology than cognitive science, and I'd definitely be happier studying topology than evolutionary psychology, but I'm not sure that even matters. Studying mathematics would provide positive utility to me personally and allow me to teach it. Teaching mathematics would be valued positively by me both because of my direct enjoyment and because I value a universe where a given person knows and appreciates math more than an otherwise-identical universe where that person doesn't. The appearance of an FAI would by far outclass the former and likely negate the significance of the latter. A uFAI has such low utility that it would cancel out any positive utility from studying math. In fact, even if I focus purely on the increase of logical processes and mathematical understanding in Homo sapiens and neglect the negative effects of a uFAI, moving the creation of an FAI forward by even a matter of days could easily be of more end value than being a professor for twenty years.
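To show what I mean by "could easily be of more end value", here's the crudest possible version of the multiplication. Every number below is pulled out of thin air purely to illustrate the shape of the comparison, not to defend any particular estimate:

```python
# The crudest possible version of the multiplication. Every number below is
# invented purely to show the shape of the comparison, not to defend an estimate.
students_per_year = 150              # students a professor meaningfully reaches
years_teaching = 20
utility_per_student = 1.0            # one unit per person who comes to appreciate math

people_alive = 7e9                   # rough beneficiaries of a Friendly AI
utility_per_person_per_day = 0.001   # tiny assumed benefit per person per day of earlier FAI
days_moved_forward = 3

teaching_value = students_per_year * years_teaching * utility_per_student
fai_value = people_alive * utility_per_person_per_day * days_moved_forward

print(teaching_value)  # 3000.0
print(fai_value)       # 21000000.0 -- dwarfs the former even at a tiny per-person benefit
```

The per-person benefit would have to be implausibly small before the comparison flipped, which is exactly what makes the multiplication so uncomfortable.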
I don't want to give up my unrealistic, idealized dream of math professorship to study a subject that makes me less happy, but if I shut up and multiply, the numbers tell me that my happiness doesn't matter except as it affects my efficacy. In fact, shutting up and multiplying indicates that, if large amounts of labour were of significant use (and I doubt it would be of any more use than large amounts of computing power), then it'd be plausible to at least consider subjugating the entire species and putting all effort toward creating an FAI. I'm nearly certain this result comes from having missed something, but I can't see what, and I'm scared that the near-certainty is merely an expression of my negative anticipation about giving up my pretty little plans.
Eliezer routinely puts forward an AI that tiles the universe with molecular smiley faces as an example of a negative outcome. My basic dilemma is this: does the utility function at the time of the choice have some sort of preferred status in the calculation, or would it be highly positive to create an AI that rewrites brains to value above all else a universe tiled with molecular smiley faces, and then tiles the universe with molecular smiley faces?
I'm suspicious of the implied claim that the 'change in sustained happiness over time' term is so large in the relevant utility calculation that it dominates other terminal values.
No -- liking sugar crashes would cause me to have more sugar crashes, and I'm not nearly as productive during sugar crashes as otherwise. So if I evaluated the new situation with my current utility function, I would find increased happiness (which is good) and greatly decreased productivity (which is more bad than the happiness is good). So, to clarify, liking sugar crashes would be significantly worse than what I have now, because I value things other than pleasure.
I kinda suspect that you would hold the same position -- modifying other sentiences' utility functions in order to maximize happiness, but evaluating changes to your own utility function with your current utility function. One of the more obvious problems with this asymmetry is that if we had the power to rewire each other's brains, we would be in conflict -- each of us would, in essence, be hostile to the other, even though we would each consider our own intentions benevolent.
However, I'm unsatisfied with the 'evaluate your proposed change to someone's utility function with their CEV'd current utility function' approach, because quite a bit is riding on the 'CEV' bit. Let's say that someone is a heroin addict, and I could rewire them to remove their heroin addiction (and, to make this the least convenient possible world, let's say that I can remove the physical and mental withdrawal as well). I'm pretty sure that their current utility function (which is super-duper time-discounted -- one of the things heroin does) would significantly oppose the change, but I'm not willing to stop there, because the change is obviously a good thing for them.
So the question becomes: what should I actually do to their current utility function to CEV it, so that I can evaluate the new utility function with it? Well, first I'll strip out the actual cognitive biases (including the super-time-discounting caused by the heroin); then I'll give it as much computing power as possible, so that it can reasonably determine the respective utility and probability of different world-states if I change the utility function to remove the heroin addiction. If I could do this, I would be comfortable applying this solution generally.
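To illustrate, with numbers I'm inventing and simple exponential discounting standing in for whatever heroin actually does to time preference, why the current function opposes the rewrite while the de-biased version endorses it:

```python
# Toy illustration: the addict's *current* (heavily time-discounted) function
# opposes the rewrite, while the same function with the discounting bias
# stripped out endorses it. All numbers are invented.

def discounted_utility(daily_payoffs, discount):
    """Exponentially discounted sum: the payoff on day t is weighted by discount**t."""
    return sum(u * discount**t for t, u in enumerate(daily_payoffs))

days = 365
# Rewiring costs heavily up front (withdrawal-like disruption, lost highs now)
# but pays off modestly every day afterward.
rewired = [-30] + [2] * (days - 1)
status_quo = [5] + [0] * (days - 1)   # the next hit, then nothing much

heavy_discounting = 0.5   # the heroin-induced, super-time-discounted function
mild_discounting = 0.99   # the same function with that bias stripped out

for label, d in [("addicted function", heavy_discounting),
                 ("de-biased function", mild_discounting)]:
    print(label,
          "rewired:", round(discounted_utility(rewired, d), 1),
          "status quo:", round(discounted_utility(status_quo, d), 1))
# addicted function: rewired ~ -28.0, status quo 5.0  -> opposes the change
# de-biased function: rewired ~ 162.9, status quo ~ 5.0 -> endorses the change
```

The only thing that differs between the two evaluations is the discount rate, which is exactly the bias I claimed I was entitled to strip -- that's the 'CEV' step doing all the work.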
If someone's bias-free utility function, running on an awesome supercomputer, determined that the utility of you changing their utility function in the way you intend was negative, would you still think it was the right thing to do? Or should we consider changing someone's utility function without their predicted consent desirable only to the extent that their current utility function is biased and has limited computing power? (Neglecting, of course, effects upon other sentiences that the modification would cause.)
I can't figure out an answer to any of those questions without having a way to decide which utility function is better. That seems to be a problem, because I don't see how such a decision is even possible.