There are two different kinds of questions that could be considered to fall under the subject of population ethics: “What sorts of altruistic preferences do I have about the well-being of others?”, and “Given all the preferences of each individual, how should we compromise?”. In other words, the first question asks how everyone's experiential utility functions (which measure quality of life) contribute to my (or your) decision-theoretic utility function (which takes into account everything that I or you, respectively, care about), and the second asks how we should agree to aggregate our decision-theoretic utility functions into something that we can jointly optimize for. When people talk about population ethics, they often do not make it clear which of these they are referring to, but they are different questions, and I think the difference is important.

 

For example, suppose Alice, Bob, and Charlie are collaborating on a project to create an artificial superintelligence that will take over the universe and optimize it according to their preferences. But they face a problem: they have different preferences. Alice is a total utilitarian, so she wants to maximize the sum of everyone's experiential utility. Bob is an average utilitarian, so he wants to maximize the average of everyone's experiential utility. Charlie is an egoist, so he wants to maximize his own experiential utility. As a result, Alice, Bob, and Charlie have some disagreements over how their AI should handle decisions that affect the number of people in existence, or which involve tradeoffs between Charlie and people other than Charlie. They at first try to convince each other of the correctness of their views, but they eventually realize that they don't actually have any factual disagreement; they just value different things. As a compromise, they program their AI to maximize the average of everyone's experiential utility, plus half of Charlie's experiential utility, plus a trillionth of the sum of everyone's experiential utility.
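To make the compromise concrete, here is a minimal sketch of that aggregation as a weighted sum (the outcome dictionaries and utility numbers below are made up purely for illustration):

```python
# Minimal sketch of the Alice/Bob/Charlie compromise (illustrative numbers only).
# Outcomes map person -> experiential utility; the weights are the ones named in
# the post: the average of everyone's utility, plus 1/2 of Charlie's, plus a
# trillionth of the sum of everyone's.

def compromise_utility(experiential, charlie="Charlie"):
    values = list(experiential.values())
    average = sum(values) / len(values)   # Bob's term (average utilitarianism)
    total = sum(values)                   # Alice's term (total utilitarianism)
    charlies_own = experiential[charlie]  # Charlie's term (egoism)
    return average + 0.5 * charlies_own + 1e-12 * total

# Two hypothetical outcomes the AI might compare:
small_happy_world = {"Alice": 9, "Bob": 9, "Charlie": 9}
huge_mediocre_world = {**{f"person{i}": 2 for i in range(10**4)}, "Charlie": 2}

print(compromise_utility(small_happy_world))    # 13.5: average plus half of Charlie's utility
print(compromise_utility(huge_mediocre_world))  # ~3.0: the trillionth-of-the-sum term barely registers
```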

 

Of course, there are other ways for utility functions to differ than average versus total utilitarianism and altruism versus egoism. Maybe you care about something other than the experiences of yourself and others. Or maybe your altruistic preferences about someone else's experiences differ from their selfish preferences, as when a crack addict wants more crack while their family wants them to quit.

 

Anyway, the point is, there are many ways to aggregate everyone's experiential utility functions, and not everyone will agree on one of them. In fact, since people can care about things other than experiences, many people might not like any of them. It seems silly to suggest that we would want a Friendly AI to maximize an aggregation of everyone's experiential utility functions; there would be potentially irresolvable disagreements over which aggregation to use, and any of them would exclude non-experiential preferences. Since decision-theoretic utility functions actually take into account all of an agent's preferences, it makes much more sense to try to get a superintelligence to maximize an aggregation of decision-theoretic utility functions.

 

The obvious next question is which aggregation of decision-theoretic utility functions to use. One might think that average and total utilitarianism could both be applied to decision-theoretic utility functions, but that is actually not so easy. Decision-theoretic utility functions take into account everything the agent cares about, which can include things that happen in the far future, after the agent dies. With a dynamic population, it is unclear which utility functions should be included in the aggregation. Should every agent that does or ever will exist have their utility function included? If so, then the aggregation would indicate that humans should be replaced with large numbers of agents whose preferences are easier to satisfy[1] (this is true even for average utilitarianism, because there needs to be enough of these agents to drown out the difficult-to-satisfy human preferences in the aggregation). Should the aggregation be dynamic with the population, so that at time t, the preferences of agents who exist at time t are taken into account? That would be dynamically inconsistent. In a population of sadists who want to torture people (but only people who don't want to be tortured), the aggregation would indicate that they should create some people and then torture them. But then once the new people are created, the aggregation would take their preferences into account and indicate that they should not be tortured.
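Here is a minimal toy model of that inconsistency (the option names and utility numbers are made up): the aggregation over whoever exists at each time endorses the plan before the new people exist, and reverses it afterwards.

```python
# Toy illustration of the dynamic-inconsistency problem (made-up numbers).
# At t0 only the sadists exist; at t1 (if the new people were created) the
# victims exist too, and the time-t1 aggregation includes their preferences.

SADIST_UTILITY = {          # sadists value torturing the unwilling
    "create_then_torture": 1.0,
    "create_then_dont_torture": 0.0,
    "dont_create": 0.0,
}
VICTIM_UTILITY = {          # only defined for worlds in which the victims exist
    "create_then_torture": -10.0,
    "create_then_dont_torture": 0.0,
}

def aggregate(population_utilities, options):
    """Sum-aggregate the utilities of everyone who currently exists."""
    def score(option):
        return sum(u[option] for u in population_utilities if option in u)
    return max(options, key=score)

# At t0, only sadists exist: the aggregation endorses creating and torturing.
plan_at_t0 = aggregate([SADIST_UTILITY],
                       ["create_then_torture", "create_then_dont_torture", "dont_create"])
print(plan_at_t0)  # create_then_torture

# At t1, after the people have been created, they are included -- and the
# aggregation now tells the AI to abandon the very plan it chose at t0.
plan_at_t1 = aggregate([SADIST_UTILITY, VICTIM_UTILITY],
                       ["create_then_torture", "create_then_dont_torture"])
print(plan_at_t1)  # create_then_dont_torture
```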

 

I suggest a variant that I'm tentatively calling current-population utilitarianism: Aggregate the preferences of the people who are alive right now, and then leave this aggregated utility function fixed even as the population and their preferences change. By “right now”, I don't mean June 17, 2014 at 10:26 pm GMT; I mean the late pre-singularity era as a whole. Why? Because this is when the people who have the power to affect the creation of the AGI that we want to maximize said aggregated utility function are alive. If it were just up to me, I would program an AGI to maximize my own utility function[2], but one person cannot do that on their own, and I don't expect I'd be able to get very many other people to go along with that. But all the people who will be contributing to an FAI project, and everyone whose support they can seek, live in the near-present. No one else can support or undermine an FAI project, so why make any sacrifices for them for any reason other than that you (or someone who can support or undermine you) care about them (in which case their preferences will show up in the aggregation through your utility function)? Now I'll address some anticipated objections.

 

Objection: Doesn't that mean that people created post-singularity will be discriminated against?

Answer: To the extent that you want people created post-singularity not to be discriminated against, this will be included in your utility function.

 

Objection: What about social progress? Cultural values change over time, and only taking into account the preferences of people alive now would force cultural values to stagnate.

Answer: To the extent that you want cultural values to be able to drift, this will be included in your utility function.

 

Objection: What if my utility function changes in the future?

Answer: To the extent that you want your future utility function to be satisfied, this will be included in your utility function.

 

Objection: Poor third-worlders also cannot support or undermine an FAI project. Why include them but not people created post-singularity?

Answer: Getting public support requires some degree of political correctness. If we tried to rally people around the cause of creating a superintelligence that will maximize the preferences of rich first-worlders, I don't think that would go over very well.

 

[1] One utility function being easier to satisfy than another doesn't actually mean anything without some way of normalizing the utility functions, but since aggregations require somehow normalizing the utility functions anyway, I'll ignore that problem.
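One common workaround (a sketch of one possible choice, not something the post commits to) is range normalization: rescale each utility function so that its worst and best feasible outcomes map to 0 and 1 before aggregating. The outcomes and utilities below are hypothetical.

```python
# Sketch of range normalization (one possible choice, not endorsed in the post):
# rescale each person's utility so that their worst feasible outcome scores 0 and
# their best scores 1, then aggregate the rescaled functions.

def range_normalize(utility, outcomes):
    lo = min(utility(o) for o in outcomes)
    hi = max(utility(o) for o in outcomes)
    if hi == lo:                       # an indifferent agent contributes a constant
        return lambda o: 0.0
    return lambda o: (utility(o) - lo) / (hi - lo)

def pick_compromise_outcome(utilities, outcomes):
    """Argmax of the sum of range-normalized utilities over the feasible outcomes."""
    normalized = [range_normalize(u, outcomes) for u in utilities]
    return max(outcomes, key=lambda o: sum(n(o) for n in normalized))

# Hypothetical usage: outcomes are labels, utilities are functions over them.
outcomes = ["status_quo", "big_party", "quiet_library"]
utilities = [
    lambda o: {"status_quo": 0, "big_party": 10, "quiet_library": 2}[o],
    lambda o: {"status_quo": 0, "big_party": 1, "quiet_library": 3}[o],
]
print(pick_compromise_outcome(utilities, outcomes))  # "big_party" under this normalization
```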

[2] This is not a proclamation of extreme selfishness. I'm still talking about my decision-theoretic utility function, which is defined, roughly speaking, as what I would maximize if I had godlike powers, and is at least somewhat altruistic.

Comments

I take it that you don't think that political correctness (of the sort needed to get popular approval of a project now) requires taking into account the preferences of future people. In this, I suspect that you are correct.

But I also suspect that it doesn't really require taking into account the preferences of poor third-worlders either. It's enough if the rich first-worlders assume that their own preferences are universal among humankind, and this seems to be a common opinion among them.

I take it that you don't think that political correctness (of the sort needed to get popular approval of a project now) requires taking into account the preferences of future people.

Yes. Also, giving space in the aggregated utility function to potentially large numbers of future people with unknown preferences could be dangerous to us, whereas giving space in the utility function to the only 7 billion-ish other people in the world who have preferences that are human and thus not too radically different from ours costs us fairly little.

But I also suspect that it doesn't really require taking into account the preferences of poor third-worlders either. It's enough if the rich first-worlders assume that their own preferences are universal among humankind, and this seems to be a common opinion among them.

List rich-first-worlder values and say that you're maximizing those, and you can get applauded by rich first-worlders. Make a list of people whose values get to be included in the aggregation, which includes only rich first-worlders, and it won't go over well.

But if the rich first-worlders whose support you really need care about future people as well as about poor third-worlders (and I don't mean in the utility functions that you aggregate but in the political matters that you need to satisfy to get things to actually happen), then they'll insist that you take their preferences into account as well.

You may be right about what would be needed to get support for an AI project, that you'll need to explicitly take into account contemporary people who can't support or undermine your project but not future people. But it's not automatic. You'd really have to do surveys or something to establish it.

Yes, it is not automatically true that the class of people whose utility functions would have to be included for political reasons is necessarily the set of currently existing people. But again, including the utility functions of potentially large numbers of beings with potentially radically different values from ours could decrease the value to us of a universe maximized under the aggregation by a significant and maybe catastrophic amount, so if there does end up being substantial resistance to the idea of only including currently existing people, I think it would be worth arguing for instead of giving in on. I also think we live in the convenient world where that won't be a problem.

All right, I can buy that. Although it may be that a small compromise is possible: taking future people into consideration with a time discount large enough that radically different people won't muck things up. To be more specific (hence also more hypothetical), current people may insist on taking into account their children and grandchildren but not worry so much about what comes after. (Again, talking about what has to be explicitly included for political reasons, separate from what gets included via the actual utility functions of included people.) This is probably getting too hairsplitting to worry about any further. (^_^)

Moreover, politically correct values are not always values actually held by a majority of the population.

There are two different kinds of questions that could be considered to fall under the subject of population ethics: “What sorts of altruistic preferences do I have about the well-being of others?”, and “Given all the preferences of each individual, how should we compromise?”.

Instead of "population ethics", you might want to use "utilitarian aggregation" in this sentence. I'm pretty sure "population ethics" is never considered to cover the latter question, which besides "utilitarian aggregation" is also studied under "game theory" or perhaps "cooperative game theory" or "bargaining theory" to be more specific. (ETA: I wrote a post which links to some texts on cooperative game theory, which you may be interested in.)

I had gotten the impression that people often failed to distinguish between the two questions, although I suppose it's possible that they've all been referring to the first and that I've just been confused because when I hear "utility", I immediately think VNM. If that is the case, then I suppose you're right that I shouldn't be using population ethics there. I'm also somewhat averse to using "utility aggregation" in that sentence because I'd like that term to refer solely to the second question. [Edit: oops, I misread you. Perhaps "utilitarian aggregation" is a better term for the thing that includes both questions.]

Upvoted for this comment, which hints at a lot of fun ideas:

Objection: What about social progress? Cultural values change over time, and only taking into account the preferences of people alive now would force cultural values to stagnate.

Answer: To the extent that you want cultural values to be able to drift, this will be included in your utility function.

[anonymous]

There is a type of value that I thought of that does not seem to be easy to aggregate.

Value X: values having all of the aggregated value system be Value X within some finite time T. It achieves more value the shorter T is and less value the longer T is, and it achieves no value if the aggregated value system never becomes entirely Value X in finite time. It is risk-neutral, so a 50% chance of the aggregated value system becoming entirely Value X by time T is half as good as a certainty of that happening.

As an example of Value X, imagine a Blue Cult member who believes that when everyone is a Blue Cult member, all Blue Cult members go to heaven, which is awesome and gives absurdly high amounts of utilons. Nothing matters to them other than this value.

I mean, you could say something like "Alright, Blue Cult members, in the aggregation we will give an epsilon chance of the AI making everyone become a Blue Cult member after each eon." This might give the Blue Cult members value, but from the other value systems' perspective, it would probably be a lot like adding more existential risk to the system.

What might a resolution of aggregating across values that resemble this look like?

Including value X in the aggregation is easy: just include a term in the aggregated utility function that depends on the aggregation used in the future. The hard part is maximizing such an aggregated utility function. If Value X takes up enough of the utility function already, an AI maximizing the aggregation might just replace its utility function with Value X and start maximizing that. Otherwise, the AI would probably ignore Value X's preference to be the only value represented in the aggregation, since complying would cost it more utility elsewhere than it gains. There's no point to the lottery you suggest, since a lottery between two outcomes cannot have higher utility than either of the outcomes themselves. If Value X is easily satisfied by silly technicalities, the AI could build a different AI with the aggregated utility function, make sure that the other AI becomes more powerful than it is, and then replace its own utility function with Value X.

I don't think your Blue Cult example works very well, because for them, the preference for everyone to join the Blue Cult is an instrumental rather than terminal value.

[anonymous]

Thank you very much for helping me break that down!

They program their AI to maximize the average of everyone's experiential utility, plus half of Charlie's experiential utility, plus a trillionth of the sum of everyone's experiential utility.

It's important to note that each of them only agrees to this if they get more of whatever they want than they would without agreement. So if any of them can build their own AI, or expects to further their ends better with no AI than with the compromise AI, there's no agreement at all.

I think this is a political issue, not one with a single provably correct answer.

Think of it this way. Suppose you have 10 billion people in the world at the point at which several AIs get created. To simplify things, let's say that just four AIs get created, and each asks for resources to be donated to it, to further that AI's purpose, with the following spiel:

AI ONE - My purpose is to help my donors live long and happy lives. I will value aiding you (and just you, not your relatives or friends) in proportion to the resources you donate to me. I won't value helping non-donors, except insofar as it aids me in aiding my donors.

AI TWO - My purpose is to help those my donors want me to help. Each donor can specify a group of people (both living and future), such as "the species homo sapiens", or "anyone sharing 10% or more of the parts of my genome that vary between humans, in proportion to how similar they are to me", and I will aid that group in proportion to the resources you donate to me.

AI THREE - My purpose is to increase the average utility experienced per sentient being in the universe. If you are an altruist who cares most about quality of life, and who asks nothing in return, donate to me.

AI FOUR - My purpose is to increase the total utility experienced, over the lifetime of this universe, by all sentient beings in the universe. I will compromise with AIs who want to protect the human species, to the extent that doing so furthers that aim. And, since the polls predict plenty of people will donate to such AIs, have no fear of being destroyed - do the right thing by donating to me.

Not all of those 10 billion have the same amount of resources, or the same willingness to donate those resources to be turned into additional computer hardware to boost their chosen AI's bargaining position with the other AIs. But let us suppose that, after everyone donates and the AIs are created, there is no clear winner, and the situation is as follows:

AI ONE ends up controlling 30% of available computing resources, AI TWO also has 30%, AI THREE has 20%, and AI FOUR has 20%.

And let's further assume that humanity was wise enough to enforce an initial "no negative bargaining tactics" rule, so AI FOUR couldn't get away with threatening "Include me in your alliance, or I'll blow up the Earth".

There are, from this position, multiple possible solutions that would break the deadlock. Any three of the AIs could ally to gain control of sufficient resources to out-grow all others.
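As a quick check of that arithmetic (using the shares stated above, and assuming a coalition can out-grow the rest if and only if it controls a strict majority of resources), here is a small sketch enumerating the viable coalitions:

```python
# Which coalitions control a strict majority of computing resources,
# given the shares stated above (ONE and TWO at 30% each, THREE and FOUR at 20% each)?
from itertools import combinations

shares = {"ONE": 30, "TWO": 30, "THREE": 20, "FOUR": 20}

for size in (2, 3):
    for coalition in combinations(shares, size):
        total = sum(shares[ai] for ai in coalition)
        if total > 50:  # strict majority: can out-grow everyone outside the coalition
            print(coalition, total)
# Every three-AI coalition qualifies; among pairs, only ONE + TWO does.
```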

For example:

The FUTURE ALLIANCE - THREE and FOUR agree upon a utility function that maximises total utility under a constraint that expected average utility must, in the long term, increase rather than decrease, in a way that depends upon some stated relationship to other variables such as time and population. They then offer to ally with either ONE or TWO with a compromise cut-off date, where ONE or TWO controls the future of the planet Earth up to that date, and THREE-FOUR controls everything beyond it, and they'll accept whichever of ONE or TWO bids the earlier date. This ends up with a winning bid from ONE of 70 years, plus a guarantee that some genetic material and a functioning industrial base will be left, at minimum, for THREE-FOUR to take over with after that date.

The BREAD AND CIRCUSES ALLIANCE - ONE offers to support whoever can give the best deal for ONE's current donors, and TWO, who has the most in common with ONE and can clinch the deal by itself, outbids THREE-FOUR.

The DAMOCLES SOLUTION - There is no merging to create a single permanent AI with compromise goals. Instead, all four AIs agree to a temporary compromise, long enough for humanity to attain limited interstellar travel, at which point THREE and FOUR will be launched in opposite directions and will vacate Earth's solar system, which (along with other solar systems containing planets within a pre-defined human habitability range) will remain under the control of ONE-TWO. To enforce this agreement, a temporary AI is created and funded by the four of them, with the sole purpose of carrying out the agreed actions and then splitting back into the constituent AIs at the agreed-upon points.

Any of the above (and many other possible compromises) could be arrived at when the four AIs sit down at the bargaining table. Which is agreed upon would depend on the strength of each AI's bargaining position and other political factors. There might well be 'campaign promises' made in the appeal-for-resources stage, with AIs voluntarily taking on restrictions on how they will further their purpose, in order to make themselves more attractive allies, or to poach resources by reducing the fears of donors.

[anonymous]

To the extent that you want cultural values to be able to drift, this will be included in your utility function.

This seems handwavey to me; it's similar to saying "Just have the AI create an AI that's smarter than it but has all the same goals."

Once you change your utility function, it's hard to say that it will accomplish what you wanted the original utility function to accomplish at all.

Once you change your utility function...

I didn't suggest that you do that.

[anonymous]

How are you changing the values you optimize for without changing your utility function? This now seems even more handwavey to me.

[anonymous]

Consider a very simple model where the world has just two variables, represented by real numbers: cultural values (c) and the other variable (x). Our utility function is U(c, x)=c*x, which is clearly constant over time. However, our preferred value of x will strongly depend on cultural values: if c is negative, we want to minimize x, while if c is positive, we want to maximize x.

This model is so simple that it behaves quite strangely (e.g. it says you want to pick cultural values that view the current state of the world favorably), but it shows that by adding complexity to your utility function, you can make it depend on many things without actually changing over time.
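A minimal sketch of the same model, with made-up candidate values for x, showing that the recommended x flips with the sign of c while U itself never changes:

```python
# The two-variable model above: U(c, x) = c * x is fixed once and for all,
# yet the preferred x depends on the cultural-values variable c.

def U(c, x):
    return c * x

def preferred_x(c, candidates=(-1.0, 1.0)):
    """Which x the unchanging utility function U recommends, given current c."""
    return max(candidates, key=lambda x: U(c, x))

print(preferred_x(c=2.0))   #  1.0 -- positive cultural values favor maximizing x
print(preferred_x(c=-2.0))  # -1.0 -- negative cultural values favor minimizing x
```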

[anonymous]

Can you give me an example of this in reality? The math works, but I notice I am still confused, in that values should not just be a variable in the utility function... they should in fact change the utility function itself.

If they're relegated to a variable, that seems to go against the original stated goal of wanting moral progress, in which case the utility function was constructed wrong in the first place.

values should not just be a variable in the utility function

All else being equal for me, I'd rather other people have their values get satisfied. So their values contribute to my utility function. If we model this as their utility contributing to my utility function, then we get mutual recursion, but we can also model this as each utility function having a direct and an indirect component, where the indirect components are aggregations of the direct components of other people's utility functions, avoiding the recursion.
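A minimal sketch of that decomposition (the direct utilities and caring weights below are made up): each person's full utility is their direct component plus a weighted sum of the others' direct components, so no recursion is needed.

```python
# Sketch of the direct + indirect decomposition (illustrative weights only).
# full_i(outcome) = direct_i(outcome) + sum_j care_weight[i][j] * direct_j(outcome)
# Because the indirect part aggregates *direct* components only, there is no
# mutual recursion between the full utility functions.

def full_utility(i, outcome, direct, care):
    """Full utility of person i: own direct utility plus weighted others' direct utilities."""
    own = direct[i](outcome)
    indirect = sum(w * direct[j](outcome) for j, w in care[i].items() if j != i)
    return own + indirect

# Hypothetical example: two people who each put some weight on the other.
direct = {
    "me":  lambda outcome: outcome["my_cake"],
    "you": lambda outcome: outcome["your_cake"],
}
care = {"me": {"you": 0.3}, "you": {"me": 0.2}}

outcome = {"my_cake": 2.0, "your_cake": 1.0}
print(full_utility("me", outcome, direct, care))   # 2.0 + 0.3 * 1.0 = 2.3
print(full_utility("you", outcome, direct, care))  # 1.0 + 0.2 * 2.0 = 1.4
```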

If they're relegated to a variable, that seems to go against the original stated goal of wanting moral progress.

To be more specific, people can value society's values coming more closely in line with their own values, or their own values coming more closely in line with what they would value if they thought about it more, or society's values moving in the direction they would naturally without the intervention of an AI, etc. Situations in which someone wants their own values to change in a certain way can be modeled as an indirect component to the utility function, as above.

[anonymous]

Define the "partial utility function" as how utility changes with x holding c constant (i.e. U(x) at a particular value of c). Changes in values change this partial utility function, but they never change the full utility function U(c,x). A real-world example: if you prefer to vote for the candidate that gets the most votes, then your vote will depend strongly on the other voters' values, but this preference can still be represented by a single, unchanging utility function.

I don't understand your second paragraph - why would having values as a variable be bad? It's certainly possible to change the utility function, but AlexMennen's point was that future values could still be taken into account even with a static utility function. If the utility function is constant and also depends on current values, then it needs to take values into account as an argument (i.e. a variable).