momothefiddler

I can't figure out an answer to any of those questions without having a way to decide which utility function is better. This seems to be a problem, because I don't see how such a decision is even possible.

But why does it matter what they think about it for the short time before it happens, compared to the enjoyment of it long after?

So you positively value "eating ice cream" and negatively value "having eaten ice cream" - I can relate. What if the change, instead of making you dislike ice cream and like veggies, made you dislike fitness and enjoy sugar crashes? The only real difference I can see is that the first increases your expected lifespan and so increases the overall utility. They both resolve the conflict and make you happy, though, so aren't they both better than what you have now?

I guess you're right. It's the difference between "what I expect" and "what I want".

As far as I can tell, the only things that keep me from reducing myself to a utilon-busybeaver are a) insufficiently detailed information on the likelihoods of each potential future-me function, and b) an internally inconsistent utility function.

What I'm addressing here is b) - my valuation of a universe composed entirely of minds that most-value a universe composed entirely of themselves is path-dependent. My initial reaction is that such a universe is very negative on my current function, but I find it hard to believe that that negative valuation is truly of larger magnitude than {number of minds}*{length of existence of this universe}*{number of utilons per mind}*{my personal utility of another mind's utilon}.

Even for a very small positive value of that last factor (and it's definitely not negative or zero - I'd need some justification before torturing someone to death), the sheer scale of the other factors should trivialize my personal preference that the universe include discovery and exploration.
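As a rough way to state that comparison (every symbol below is a placeholder I'm introducing for the bracketed quantities above, not standard notation):

\[
N_{\text{minds}} \times T_{\text{existence}} \times u_{\text{per mind}} \times w \;\gg\; D
\]

where w is my personal utility per utilon of another mind and D is the disutility, on my current function, of a universe without discovery and exploration. So long as w is strictly positive and D is finite, a large enough left-hand side swamps D.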

Hm. If people have approximately equivalent utility functions, does that help all of them satisfy those functions better? If so, it makes sense for none of them to value stealing (since having all of them value stealing could be a problem). In a large enough society, though, the ripple effect of my theft is negligible. But that's beside the point.

"Avoid death" seems like a pretty good basis for a utility function. I like that.

So you, like I, might consider turning the universe into minds that most value a universe filled with themselves?

I'm not saying I can change to liking civil war books. I'm saying that if I could choose between A) continuing to like scifi and having fantasy books, or B) liking civil war books and having civil war books, I should choose B, even though I currently value scifi > stats > civil war. By extension, if I could choose A) continuing to value specific complex interactions and having different complex interactions, or B) liking smiley faces and building a smiley-face maximizer, I should choose B, even though it's counterintuitive. This one is somewhat more plausible, as it seems it'd be easier to build an AI that could change my values to smiley faces and make smiley faces than it would be to build one that works toward my current complicated (and apparently inconsistent) utility function.

I don't think society-damaging actions are "objectively" bad in the way you say. Stealing something might be worse than just having it, due to the negative repercussions, but that just changes the relative ordering. Depending on the value of the thing, stealing it might still rank higher than buying it.

You're saying that present-me's utility function counts and no-one else's does (apart from their position in present-me's function) because present-me is the one making the decision? That my choices must necessarily depend on my present function, and only depend on other/future functions through how much I care about their happiness? That seems reasonable. But my current utility function tells me that there is an N large enough that N utilon-seconds for other people's functions count for more in my function than anything achievable within the expected lifespan of present-me's utility function.
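Put slightly more formally (again in my own placeholder notation, and assuming the relevant quantities are finite):

\[
\exists N \;\text{such that}\; N \cdot w \;>\; \max_{x} U_{\text{present-me}}(x),
\]

where w is the weight my function gives to one utilon-second of someone else's function and the maximum ranges over outcomes achievable within the expected lifespan of present-me's function. The claim only goes through if that maximum is finite and w > 0, which is exactly what I'm asserting.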

Say there's a planet, far away from ours, where gravity is fairly low, atmospheric density fairly high, and the ground uniformly dangerous, and the sentient resident species has wings and two feet barely fitted for walking. Suppose, also, that by some amazingly unlikely (as far as I can see) series of evolutionary steps, these people have a strong tendency to highly value walking and negatively value flying.

If you had the ability to change their hardwired values toward transportation (and, for whatever reason, did not have the ability to change their non-neural physiology or the nature of their planet), would it be wrong to do so? If it's wrong, what makes it wrong? Your (or my, because I seem to agree with you) personal negative valuation of {changing someone else's utility function} is heavily outweighed by the near-constant increase in happiness for generations of these people. If anything, it appears it would be wrong not to make that change. If that's the case, though, then surely it'd be wrong not to build a superintelligence designed to maximise "minds that most-value the universe they perceive", which, while not quite a smiley-face maximizer, still leads to tiling behaviour.

No matter how I reason about it, it seems tiling behaviour isn't necessarily bad. My emotions say it's bad, and Eliezer seems to agree. Does Aumann's Agreement Theorem apply to utility?

If I considered it high-probability that you could make a change, and you were claiming you'd make a change that wouldn't be of highly negative utility to everyone else, I might well prepare for that change. Because your proposed change is highly negative to everyone else, I might well attempt to resist or counteract that change. Why does that make sense, though? Why do other people's current utility functions count if mine don't? How does that extend to a situation where you changed everyone else? How does it extend to a situation where I could change everyone else but don't have to? If an AI programmed to make its programmer happy does so by directly changing the programmer's brain to provide a constant mental state of happiness, why is that a bad thing?
