The model people seem to have when making your argument is that utility has to be a linear function of our values: e.g. if I value pleasure, kittens, and mathematical knowledge, the way to express that in a utility function is something like 100 x pleasure + 50 x kittens + 25 x math. Obviously, if you then discover that you get a kitten for $1, but a pleasure costs $10 and a math costs $20, you'd just keep maximizing kittens forever to the exclusion of everything else, which is a problem.
Usually (outside the context of utility functions, even) the way we formulate the sentiment that each one of these matters is by taking products: e.g. pleasure^3 x kittens^2 x math (exponents allow us to weight the different values). In this case, while in the short term we might discover that kittens are the most cost-efficient way to higher utility, this does not continue to hold. If we have 100 kittens and only 2 maths, 1 additional kitten increases utility by about 2% while 1 additional math increases utility by 50%.
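To make the comparison concrete, here is a rough Python sketch. The weights, exponents and prices are the ones from the two paragraphs above. The function names, the starting bundle of one unit of each good, the $200 budget and the greedy buy-one-item-at-a-time rule are assumptions made only for illustration.

```python
# Prices from the example: a kitten costs $1, a pleasure $10, a math $20.
PRICES = {"pleasure": 10, "kittens": 1, "math": 20}

def linear_utility(q):
    # 100 x pleasure + 50 x kittens + 25 x math
    return 100 * q["pleasure"] + 50 * q["kittens"] + 25 * q["math"]

def product_utility(q):
    # pleasure^3 x kittens^2 x math
    return q["pleasure"] ** 3 * q["kittens"] ** 2 * q["math"]

def relative_gain(utility, q, good):
    """Fractional increase in utility from one extra unit of `good`."""
    trial = dict(q)
    trial[good] += 1
    return utility(trial) / utility(q) - 1

def greedy_spend(utility, budget, start):
    """Repeatedly buy the affordable item with the best utility gain per dollar.
    Greedy purchasing is an illustrative assumption, not an optimal planner.
    `start` should hold at least one unit of each good, otherwise the
    product utility is zero and no single purchase improves it."""
    q = dict(start)
    while True:
        base = utility(q)
        best, best_rate = None, 0.0
        for good, price in PRICES.items():
            if price > budget:
                continue
            trial = dict(q)
            trial[good] += 1
            rate = (utility(trial) - base) / price
            if rate > best_rate:
                best, best_rate = good, rate
        if best is None:
            return q
        q[best] += 1
        budget -= PRICES[best]

start = {"pleasure": 1, "kittens": 1, "math": 1}
linear_result = greedy_spend(linear_utility, 200, start)
product_result = greedy_spend(product_utility, 200, start)
```

Under these assumptions the linear function spends the whole budget on kittens, while the product function ends up buying all three goods, because each extra kitten is worth proportionally less the more kittens you already hold.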
I agree with your first paragraph, but I think your second is just making the same mistake again. Why should our utility function be a product any more than it should be a sum? Why should it be mathematically elegant at all when nothing else about humans is?
Sorry, I don't mean to say that it has to be a product. All I'm saying is that the product formulation is one way to achieve a complex-value effect.
A product is the unique way to evaluate a group of values if we want to have the property that whenever we hold all values constant but one, the result scales linearly with the remaining value (or, to avoid the "linearly", we can apply some additional math to the value first). I don't think that in general this is true of our utility functions, but it might sometimes be a useful approximation.
The fully general definition of utility function is defined over possible outcomes, so if "lots of paper-clips but no staples" makes you unhappy and "lots of staples but no paper-clips" makes you just as unhappy while "lots of paper-clips and lots of staples" makes you very happy, then we can just say you assign 0 utility to the former two and 1 utility to the third.
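As a minimal sketch of "defined over possible outcomes", the utility function can be nothing more than a lookup table. The tuple labels and the helper name `best_outcome` below are hypothetical, chosen only for this illustration; the 0 and 1 values are the ones from the example.

```python
# Utility assigned directly to whole outcomes, with no per-item values.
UTILITY = {
    ("lots of paper-clips", "no staples"): 0.0,
    ("no paper-clips", "lots of staples"): 0.0,
    ("lots of paper-clips", "lots of staples"): 1.0,
}

def best_outcome(outcomes):
    """Return the outcome with the highest tabulated utility."""
    return max(outcomes, key=UTILITY.__getitem__)

preferred = best_outcome(list(UTILITY))
```

Nothing in the table says what a single paper-clip or staple is worth; only the complete outcomes are ranked.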
It may be that some agents have 'fungible' utility, where one unit of X has the same value regardless of how much you already have or how much of anything else you have, but these utility functions form a tiny fraction of all possible functions, so if you don't think your own function is of this type then it probably isn't.
these utility functions form a tiny fraction of all possible functions, so if you don't think your own function is of this type then it probably isn't.
I generally agree with your conclusions in this comment, but I don't think that this is correct reasoning. The fact that a certain type of utility function is a really small fraction of all possible utility functions is not strong evidence that P(your utility function is not of this type | you don't think it is) is high, because there may be a certain tendency in human utility functions towards a certain type of function, even though functions of that type occupy an infinitesimally small fraction of function-space.
^ What he said.
Though note that utility is often fungible in small quantities, since I can always trade one good for another at market value (this fails when the quantities involved are large enough that the market can't absorb them without affecting price, or if the object in question is difficult to liquidate, e.g. time or knowledge).
Though note that utility is often fungible in small quantities, ...
If utility isn't fungible for large quantities, does that mean that it is rational to be scope-insensitive?
Yes, but one has to be very careful. For humans, scope-insensitivity usually occurs at ranges where the goods are still fungible. In the studies that Eliezer presents in that post, the issue is slightly different; here there are so many copies of a good X that adding or removing, say, 1000 of them does not affect the value of a single copy of X.
For instance, there are probably billions of birds in existence; if we would pay $80 to save 2000 birds when there are 1,000,000,000 of them, then we would probably also pay $80 to save 2000 birds when there are 999,998,000 of them. Repeating this argument a few times would mean that we should be willing to pay $800 to save 20000 birds, as opposed to the still $80 reported in the survey.
(For this argument to work entirely, we have to also argue that $800 is a small portion of a person's total wealth, which is true in most first world countries.)
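The arithmetic of the bird argument can be written out as a short Python sketch. The function name is hypothetical, and the key assumption is the one stated above: each batch of 2000 birds is worth the same $80 however many birds remain.

```python
def consistent_payment(price_per_batch, batch_size, birds_to_save):
    """Total payment if every batch of birds is valued the same,
    independent of how many birds exist in total."""
    batches = birds_to_save // batch_size
    return price_per_batch * batches

# $80 per 2000 birds, applied ten times over
total = consistent_payment(80, 2000, 20000)
```

Here `total` comes out at 800, which is the figure the argument contrasts with the $80 reported in the survey.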
...some agents have 'fungible' utility, where one unit of X has the same value regardless of how much you already have or how much of anything else you have...
What about the human desire for positive bodily sensations? Given what we currently know about physics, it should be much more efficient to cause them unconditionally than to realize them as a result of some actual achievement. Humans value such fictitious sensations, see movies or daydreams. So the value of such sensations is non-negligible. If we can create them effectively enough to outweigh the utility we assign to their natural realization, then isn't it rational to choose to indulge in unconditional satisfaction?
If only one of your values can be realized an unlimited number of times, then it only needs to yield one unit of utility per realization to outweigh all other values, as long as its realization is cost-effective enough. As far as I know, the utility from realizing that one value is no different from the utility you can earn from any of your other values; all that counts is the amount of utility you expect.
I do understand your argument, but I just explained why this need not be the case. My utility function does not have to assign a constant value to pleasant fictitious experiences. It does not need to explicitly assign any value to PFEs, only to outcomes. It may be possible to deduce from these outcomes a single unique value assigned to PFEs, but there's no reason why this has to be the case.
For instance, maybe my value for PFEs can't be realized an unlimited number of times because the more PFEs I have and the fewer real experiences I have, the more I value real experiences and the less I value PFEs. Even if watching a movie were the best part of my day, that would not mean I want to spend my whole day watching movies.
Not all functions are linear, or even analytic. Some are just pairs of numbers.
I do understand you as well. But I don't see how some people here seem to be able to make value statements about certain activities, e.g. playing the lottery is stupid. Or it is rational to try to mitigate risks from AI. I am still clueless how this can be justified if utility isn't objectively grounded, e.g. in units of bodily sensation. If I am able to arbitrarily assign utility to world states then I could as well assign utility to universes where I survive the Singularity without doing anything to mitigate it, enough to outweigh any others. In other words, I can do what I want and be rational as long as I am not epistemically confused about the consequences of my actions.
If that is the case, why are there problems like Pascal's mugging or infinite ethics? If utility maximization does not lead to focusing on a few values that promise large amounts of utility, then there seem to be no problems. Just because I would save my loved ones doesn't mean that I want to spend the whole day saving infinitely many people.
In other words, I can do what I want and be rational as long as I am not epistemically confused about the consequences of my actions.
So what.
There are much more important things than being rational, at least to me. The world, for one. If all you really want to do is sit at home all day basking in your own rationality, then there's little I can do to argue that you aren't, but I would hope there's more to you than that (if there isn't, feel free to tell me and we can end this discussion).
I'm not sure I can honestly say that I place absolutely no terminal value on rationality, but most of the reason I am pursuing it is its supposed usefulness in achieving everything else.
When we say playing the lottery is stupid, we assume that you don't want to lose money, and when we say mitigating existential risk is rational we assume that you don't want the world to end. Generally humans aren't so very different that these assumptions aren't mostly justified.
Just because I would save my loved ones doesn't mean that I want to spend the whole day saving infinitely many people.
Some people take this very approach, they call it 'bounded utility'.
I don't agree with them because it seems to me like along the dimension of human life my utility function really is linear, or at least I would like it to be, but that's just me.
The general principle I'm trying to get at is to find what you actually want, as opposed to what is convenient, mathematically elegant or philosophically defensible, and make that your utility function. If you do this then expected utility should never lead you astray.
What I am trying to fathom is the difference between 1.) assigning utility arbitrarily (no objective grounding) 2.) grounding utility in units of bodily sensations 3.) grounding utility in units of human well-being (i.e. number of conscious beings whose lives are worth living).
As you see, my problem is that to me, as a complete layman, expected utility maximization seems to lead to the reduction of complex values once it is measured in some objectively verifiable physical fact. In other words, as long as utility is dimensionless, it seems to be an inconsistent measure; if you add a dimension, it leads to the destruction of complex values.
The downvoting of the OP seems to suggest that some people seem to suspect that I am not honest, but I am really interested to learn more about this and how I am wrong. I am not trying to claim some insight here but merely ask people for help who understand a lot more about it than me. I am also not selfish, as some people seem to think? I care strongly about other humans and even lower animals.
how do you assign utility to novel goals that can't be judged in terms of previous goals
Don't think in terms of choosing what value to assign, think in terms of figuring out what value your utility function already assigns to it (your utility function is a mathematical object that always has and always will exist).
So the answer is that you can't be expected to know yet what value your utility function assigns to goals you haven't thought of, and this doesn't matter too much since uncertainty about your utility function can just be treated like any other uncertainty.
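One simple way to read "treated like any other uncertainty" is to average over candidate utility functions, weighted by how likely you think each is to be your real one. The sketch below is only that reading, not a claim about how it must be done: the function name, the credences and all the numbers are hypothetical, and it assumes the candidate functions are expressed on a comparable scale.

```python
def expected_utility_under_value_uncertainty(outcome_probs, candidate_utilities):
    """outcome_probs: {outcome: probability} for one action.
    candidate_utilities: list of (credence, {outcome: utility}) pairs,
    where the credences sum to 1."""
    return sum(
        credence * sum(p * utility[o] for o, p in outcome_probs.items())
        for credence, utility in candidate_utilities
    )

# Hypothetical: I am 70% sure opera is worth little to me, 30% sure it is worth a lot.
candidates = [
    (0.7, {"opera": 2.0, "hunt": 5.0}),
    (0.3, {"opera": 10.0, "hunt": 5.0}),
]
eu_opera = expected_utility_under_value_uncertainty({"opera": 1.0}, candidates)
eu_hunt = expected_utility_under_value_uncertainty({"hunt": 1.0}, candidates)
```

With these made-up numbers, going to the opera has expected utility 0.7 x 2 + 0.3 x 10 = 4.4, against 5 for hunting, so the uncertainty is handled by the same calculation as uncertainty about outcomes.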
The downvoting of the OP seems to suggest that some people seem to suspect that I am not honest, but I am really interested to learn more about this and how I am wrong.
For the record, I voted the OP up, because it made me think and in particular made me realise my utility function wasn't additive or even approximately additive, which I had been unsure of before.
Don't think in terms of choosing what value to assign, think in terms of figuring out what value your utility function already assigns to it...
I don't think that is possible. Consider the difference between a hunter-gatherer, who cares about his hunting success and about becoming the new clan chief, and a member of lesswrong who wants to determine if a "sufficiently large randomized Conway board could turn out to converge to a barren 'all off' state."
The utility of success in hunting down animals or in proving abstract conjectures about cellular automata is largely determined by factors such as your education, culture and environmental circumstances. The same hunter-gatherer who cared to kill a lot of animals, to get the best ladies in his clan, might under different circumstances have turned out to be a vegetarian mathematician caring solely about his understanding of the nature of reality. Both sets of values are to some extent mutually exclusive or at least disjoint. Yet both sets of values are what the person wants, given the circumstances. Change the circumstances dramatically and you change the person's values.
You might conclude that what the hunter-gatherer really wants is to solve abstract mathematical problems, he just doesn't know about that. But there is no set of values that a person really wants. Humans are largely defined by the circumstances they reside in. If you already knew a movie, you wouldn't watch it. To be able to get your meat from the supermarket changes the value of hunting.
If "we knew more, thought faster, were more the people we wished we were, and had grown up closer together" then we would stop desiring what we had learnt, wish to think even faster, become different people yet again, and get bored of and rise up from the people similar to us.
Nonlinear (decelerating) utility functions are a good way to get complex results. If I continued to like eating ice cream the same all the time, I might just eat ice cream every day. But instead the amount of utility it's worth depends on how much ice cream I've had recently and some other complicated stuff, so I eat ice cream occasionally and read books occasionally and play games with friends occasionally.
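The ice cream example can be turned into a toy simulation. Everything numeric here is an assumption for illustration: the base values, the rule that each recent repeat multiplies an activity's value by a decay factor, and the three-day memory window.

```python
BASE = {"ice cream": 10.0, "books": 8.0, "games": 7.0}  # hypothetical base values
MEMORY = 3  # assumed: only the last 3 days count as "recent"

def current_value(activity, history, decay):
    """Value today: base value scaled down once per recent repeat."""
    recent = history[-MEMORY:].count(activity)
    return BASE[activity] * decay ** recent

def plan(days, decay):
    """Each day, pick whichever activity is worth the most right now."""
    history = []
    for _ in range(days):
        history.append(max(BASE, key=lambda a: current_value(a, history, decay)))
    return history

constant_taste = plan(9, decay=1.0)    # liking ice cream "the same all the time"
satiating_taste = plan(9, decay=0.5)   # value halves with each recent repeat
```

With `decay=1.0` the agent picks ice cream every single day. With `decay=0.5` the same greedy rule rotates through all three activities, even though ice cream still has the highest base value.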
I like to call it "placing a high value on novelty and variety". Just like I've "made a habit of change [for the better]", I've taken a fundamentally static thing and turned it into a constantly changing variable.
"Maximize utility" is more like a shorthand for saying "maximize the extent to which your values are being fulfilled."
The goal isn't to get utility points, but whatever your actual goals/values are. Utilities are just a way of keeping track. Utility functions are just a way of representing your values in ways that allows certain kinds of analysis to be applied.
If your values are such that you value having outcomes that fulfill a bunch of your values, then outcomes that squash the others for the sake of one would probably have diminishing returns, depending on, well, your values.
Does not necessarily follow. A larger plethora of values can be of the greatest value.
Just like a fine dinner of many very good dishes is better than only the ice cream you like the most.
I don't say that it must always be so. But it can be constructed that way.
Just like a fine dinner of many very good dishes is better than only the ice cream you like the most.
Yes, but take companies, for example. Companies are economic entities that resemble rational utility maximizers much more closely than humans do. Most companies specialize in producing one product or a narrow set of products. How can this be explained, given that companies are controlled by humans for humans? It seems that adopting profit maximization leads to specialization, which leads to simplistic values.
A larger plethora of values can be of the greatest value.
The large plethora of values mainly seems to be a result of human culture, a meme complex that is the effect of a society of irrational agents. Evolution only equipped us with a few drives. Likewise, rational utility maximization does not favor the treatment of rare diseases in cute kittens. Such values are only being pursued because of agents who do not subscribe to rational utility maximization.
Can you imagine a society of perfectly rational utility maximizers that, among other things, play World of Warcraft, play lotteries and save frogs from being run over by cars?
Specialization doesn't lead to simple values. You trade your extra goods for goods you don't produce.
Also, since it's easier to optimize for simpler values, we should expect to see better maximizers with simple values than with complex ones.
Just like a fine dinner of many very good dishes is better than only the ice cream you like the most.
Yes, but take companies, for example. Companies are economic entities that resemble rational utility maximizers much more closely than humans do. Most companies specialize in producing one product or a narrow set of products. How can this be explained, given that companies are controlled by humans for humans? It seems that adopting profit maximization leads to specialization, which leads to simplistic values.
Specialisation happens to bare humans too. Much the same thing happens with many species of ant.
It seems to have more to do with living in a large community.
I agree with you. In most cases you are absolutely right.
I am just pointing out that specific situations CAN be constructed where this general rule of yours does not hold.
But for most (random) cases you are absolutely right.
Will we have dinner or just a lot of ice cream ... we shall see that. An elaborated state of the art dinner of smells, tastes, colors ... is possible. Instead of an ice cream glacier.
How can this [isolated utility maximization] be countered?
If the values are faithfully noted this is no problem. It strictly is what you'd want.
I think the danger is that the utility maximizer optimizes for (possibly only slightly) wrong values. The complex value space might accidentally admit some such optimization because of e.g. leaving out some obscure value which was previously overlooked.
A utility optimizer should not optimize faster than errors in the value function can be found, i.e. faster than humans can give collective feedback on it.
That is one reason companies (mentioned in your comment) sometimes produce goods that are in high demand until the secondary effects are noticed which then may take some time to fix e.g. by legislation.
I don't think the speed of mere companies needs to be slowed down to fix this (although one might consider and model this). But more powerful utility maximizers definitely should smooth their optimization process over time.
What if it turns out that one of its complex values can be much more effectively realized and optimized than its other values, i.e. has the best cost-value ratio?
If you're not entirely indifferent to costs, then minimising cost is itself already a value, and so it's already in your value system, and so this "new" knowledge about cost-value ratios changes nothing. Problem solved?
What says utility space must be 1D?
For example, I would regard hyperbolic discounting and diminishing returns both as examples of 2D utility.
So doesn't much of this reduce to the novelty dimension of a utility calculation?
for example I would regard hyperbolic discounting and diminishing returns both as an example of 2D utility.
The OP was triggered by thinking about wireheading and the reasoning that leads some people to adopt it as desirable, or who don't perceive it as something that is obviously undesirable for humans. Here the first question that comes to my mind is if the adoption of expected utility maximization could reduce the complexity of human values to a narrow set, such as the maximization of desirable bodily sensations.
I reject wireheading myself. But on what basis? To have a consistent utility function one needs to know how to assign utility to new goals one might encounter. If utility is not objectively grounded in some physical fact, e.g. bodily sensations or qualia, then how could we judge if a new goal does outweigh some of our previous goals? For example, if a stone age hunter-gatherer learns about the opera, how can he adapt his utility function to such a new goal that cannot be defined in terms of previous goals? Should he assign some arbitrary amount of utility to it based on naive introspection? That seems equal to having no utility function at all, or will at least lead to some serious problems. Here I suspect that a valid measure could be the amount of desirable bodily sensations that are expected as a result of taking a certain action.
If we were to measure utility in units of bodily sensations, then by maximizing utility we maximize bodily sensations. Consequently, as expected utility maximizers we should assign most utility to universes where we experience the largest amount of desirable bodily sensations. This might lead to the reduction of complex values to the one action that yields the most utility, wireheading. I don't agree with this of course; it is just one line of thought to explain where I am coming from.
Now measuring utility in terms of bodily sensations might lead any human utility maximizer to abandon his complex values. But it will also exclude a lot of human values by the very unit in which utility is grounded. For example, humans care about other humans. Some humans even believe that we should not apply discounting to human well-being and that the mitigation of human suffering never hits diminishing returns. This is incompatible with the maximization of bodily sensations because you can't expect your positive feelings to increase linearly with the number of people you help (at least as long as you don't suspect that universal altruism is instrumental to your own long-term well-being etc.). But the example of where an objective grounding of utility in bodily sensations breaks down is also an example of another reduction of complex values to a narrow set of values, i.e. human well-being. If you measure utility in the number of beings you save, your complex values are again outweighed by a single goal, saving people.
ummm, I'm not quite sure if you're arguing or agreeing with me or what. I'm asserting that the function that assigns utility to a particular set of sensory data can have multiple parameters. If it turns out that a certain set of these parameters is a global maximum that's fine.
"Rational expected-utility-maximizing agents get to care about whatever the hell they want." - a good heuristic to bear in mind. There really are an awful lot of orderings on possible worlds, and if value is complex, your utility function* probably isn't linear.
*usual disclaimers apply about not actually having one.
The utility function only specifies the terminal values. If the utility function is difficult to maximize, then the agent will have to create a complex system of instrumental values. From the inside, terminal values and instrumental values feel pretty similar.
In particular, for a human agent, achieving a difficult goal is likely to involve navigating a dynamic social environment. There will therefore be instrumental social goals which act as stepping-stones to the terminal goal. For neurotypicals, this kind of challenge will seem natural and interesting. For those non-neurotypicals who have trouble socialising, working around this limitation becomes an additional sub-goal.
This isn't just theoretical - I'm describing my own experience since choosing to apply instrumental rationality to a goal system in which one value comes out dominating (for me this turns out to be Friendly AI).
What if it turns out that one of its complex values can be much more effectively realized and optimized than its other values, i.e. has the best cost-value ratio? That value might turn out to outweigh all other values.
Can you think of an example? I plugged in some values for myself:
...but they didn't seem to turn out too well in terms of illustrating your idea.
What exactly do you mean by complex value? Do you mean vector values? (e.g. a complex number can be expressed as a 2-vector).
If you want to maximize utility, then yes, you need a way to compare between two different utility values, and you can only do that if your function outputs a scalar value. If it outputs a vector value, you can't do that unless you assign some function to convert that vector into a scalar. That function can be a 2-norm, a p-norm, or really any arbitrary function you like.
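A minimal Python sketch of that conversion step, using a p-norm as the scalarizing function. The helper names `p_norm` and `prefer` are hypothetical, and the norm is only one of the arbitrary choices mentioned above; note that a norm takes absolute values, so it would treat a large negative component as if it were good.

```python
def p_norm(values, p=2):
    """Collapse a vector of per-value scores into one scalar via the p-norm."""
    return sum(abs(v) ** p for v in values) ** (1.0 / p)

def prefer(a, b, scalarize=p_norm):
    """Compare two vector-valued utilities by first mapping each to a scalar."""
    return a if scalarize(a) >= scalarize(b) else b

balanced = (3.0, 4.0)
lopsided = (0.0, 4.9)
choice = prefer(balanced, lopsided)
```

Here the 2-norm of `balanced` is 5, which beats 4.9, so `choice` is `balanced`. Swapping in a different `scalarize` function can change which vector wins, which is exactly why the choice of function matters.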
He is referring to Eliezer's complexity of value thesis
It has nothing to do with complex numbers, which in turn have little to do with vectors if I understand my maths correctly.
This is a side note, but if you have complicated desires and live in a complicated world, it's probably not possible to maximize your utility anyway. For instance, let's say you want to be queen of the universe, and you live in an infinite universe, with infinitely many agents in it at infinite levels of power. You may need to make some compromises.
I kind of wonder if a super happy wouldn't have a tendency to just leave an orgasmtron and seek out other forms of novelty after a while, all while having a base state of super-happiness/constant pleasure. It seems to me that even if they are both local maxima, superhappy is a higher peak than wirehead.
Does expected utility maximization destroy complex values?
An expected utility maximizer calculates the expected utility of the various outcomes of alternative actions. It is precommitted to choosing the outcome with the largest expected utility. Consequently it chooses the action that yields the largest expected utility.
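For readers who want the definition spelled out, here is a minimal Python sketch of such a maximizer. The names `expected_utility` and `choose_action`, the two actions and their payoffs are all hypothetical; each action is modelled as a list of (outcome, probability) pairs.

```python
def expected_utility(lottery, utility):
    """Probability-weighted average of the utility of each possible outcome."""
    return sum(p * utility(outcome) for outcome, p in lottery)

def choose_action(actions, utility):
    """Pick the action whose lottery over outcomes has the highest expected utility."""
    return max(actions, key=lambda name: expected_utility(actions[name], utility))

actions = {
    "safe": [(10, 1.0)],                 # 10 units for certain
    "gamble": [(100, 0.05), (0, 0.95)],  # 100 units with 5% probability
}

linear_choice = choose_action(actions, lambda x: x)       # utility = amount
convex_choice = choose_action(actions, lambda x: x ** 2)  # utility = amount squared
```

With utility equal to the amount, the safe action wins (10 against 5). With utility equal to the amount squared, the gamble wins (500 against 100). The maximizing procedure is identical in both cases; only the utility function, i.e. what the agent values, differs.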
But one unit of utility is not discriminable from another unit of utility. All a utility maximizer can do is to maximize expected utility. What if it turns out that one of its complex values can be much more effectively realized and optimized than its other values, i.e. has the best cost-value ratio? That value might turn out to outweigh all other values.
How can this be countered? One possibility seems to be changing one's utility function and reassigning utility in such a way as to outweigh that effect. But this will lead to inconsistency. Another way is to discount the value that threatens to outweigh all others, which will again lead to inconsistency.
This seems to suggest that subscribing to expected utility maximization means that 1.) you swap your complex values for a certain terminal goal with the highest expected utility 2.) your decision-making is eventually dominated by a narrow set of values that are the easiest to realize and promise the most utility.
Can someone please explain how I am wrong or point me to some digestible explanation? Likewise I would be pleased if someone could tell me what mathematical background is required to understand expected utility maximization formally.
Thank you!