
Holden's Objection 1: Friendliness is dangerous

Post author: PhilGoetz 18 May 2012 12:48AM

Nick_Beckstead asked me to link to posts I referred to in this comment.  I should put up or shut up, so here's an attempt to give an organized overview of them.

Since I wrote these, LukeProg has begun tackling some related issues.  He has accomplished the seemingly-impossible task of writing many long, substantive posts none of which I recall disagreeing with.  And I have, irrationally, not read most of his posts.  So he may have dealt with more of these same issues.

I think that I only raised Holden's "objection 2" in comments, which I couldn't easily dig up; and in a critique of a book chapter, which I emailed to LukeProg and did not post to LessWrong.  So I'm only going to talk about "Objection 1:  It seems to me that any AGI that was set to maximize a "Friendly" utility function would be extraordinarily dangerous."  I've arranged my previous posts and comments on this point into categories.  (Much of what I've said on the topic has been in comments on LessWrong and Overcoming Bias, and in email lists including SL4, and isn't here.)

 

The concept of "human values" cannot be defined in the way that FAI presupposes

Human errors, human values:  Suppose all humans shared an identical set of values, preferences, and biases.  Even then, we could not retain human values without retaining human errors, because there is no principled distinction between them.

A comment on this post:  There are at least three distinct levels of human values:  the values an evolutionary agent holds because they maximize its reproductive fitness, the values a society holds because they maximize its fitness, and the values held by a rational optimizer who has chosen to maximize social utility.  They often conflict.  Which of them are the real human values?

Values vs. parameters:  Eliezer has suggested using human values, but without time discounting (= changing the time-discounting parameter).  CEV presupposes that we can abstract human values and apply them in a different situation that has different parameters.  But the parameters are values.  There is no distinction between parameters and values.

A comment on "Incremental progress and the valley":  The "values" that our brains try to maximize in the short run are designed to maximize different values for our bodies in the long run.  Which are human values:  The motivations we feel, or the effects they have in the long term?  LukeProg's post Do Humans Want Things? makes a related point.

Group selection update:  The reason I harp on group selection, besides my outrage at the way it's been treated for the past 50 years, is that group selection implies that some human values evolved at the group level, not at the level of the individual.  This means that increasing the rationality of individuals may enable people to act more effectively in their own interests, rather than in the group's interest, and thus diminish the degree to which humans embody human values.  Identifying the values embodied in individual humans - supposing we could do so - would still not arrive at human values.  Transferring human values to a post-human world, which might contain groups at many different levels of a hierarchy, would be problematic.

I wanted to write about my opinion that human values can't be divided into final values and instrumental values, the way discussion of FAI presumes they can.  That division is an idea that comes from mathematics, symbolic logic, and classical AI.  A symbolic approach would probably make proving safety easier.  But human brains don't work that way.  You can and do change your values over time, because you don't really have terminal values.

Strictly speaking, it is impossible for an agent whose goals are all indexical goals describing states involving itself to have preferences about a situation in which it does not exist.  Those of you who are operating under the assumption that we are maximizing a utility function with evolved terminal goals should, I think, admit that these terminal goals all involve either ourselves or our genes.  If they involve ourselves, then utility functions based on these goals cannot even be computed once we die.  If they involve our genes, then they are goals that our bodies are pursuing, which we call errors, not goals, when we, the conscious agents inside our bodies, evaluate them.  In either case, there is no logical reason for us to wish to maximize some utility function based on these goals after our own deaths.  Any action I wish to take regarding the distant future necessarily presupposes that the entire SIAI approach to goals is wrong.

My view, under which it does make sense for me to say I have preferences about the distant future, is that my mind has learned "values" that are not symbols, but analog numbers distributed among neurons.  As described in "Only humans can have human values", these values do not exist in a hierarchy with some at the bottom and some on the top, but in a recurrent network which does not have a top or a bottom, because the different parts of the network developed simultaneously.  These values therefore can't be categorized into instrumental or terminal.  They can include very abstract values that don't need to refer specifically to me, because other values elsewhere in the network do refer to me, and this will ensure that actions I finally execute incorporating those values are also influenced by my other values that do talk about me.

Even if human values existed, it would be pointless to preserve them

Only humans can have human values:

  • The only preferences that can be unambiguously determined are the preferences a person (mind+body) implements, which are not always the preferences expressed by their beliefs.
  • If you extract a set of consciously-believed propositions from an existing agent, then build a new agent to use those propositions in a different environment, with an "improved" logic, you can't claim that it has the same values, since it will behave differently.
  • Values exist in a network of other values.  A key ethical question is to what degree values are referential (meaning they can be tested against something outside that network) or non-referential (and hence relative).
  • Supposing that values are referential helps only by telling you to ignore human values.
  • You cannot resolve the problem by combining information from different behaviors, because the needed information is missing.
  • Today's ethical disagreements are largely the result of attempting to extrapolate ancestral human values into a changing world.
  • The future will thus be ethically contentious even if we accurately characterize and agree on present human values, because these values will fail to address the new important problems.


Human values differ as much as values can differ:  There are two fundamentally different categories of values:

  • Non-positional, mutually-satisfiable values (physical luxury, for instance)
  • Positional, zero-sum social values, such as wanting to be the alpha male or the homecoming queen

All mutually-satisfiable values have more in common with each other than they do with any non-mutually-satisfiable values, because mutually-satisfiable values are compatible with social harmony and non-problematic utility maximization, while non-mutually-satisfiable values require eternal conflict.  If you find an alien life form from a distant galaxy with non-positional values, it would be easier to integrate those values into a human culture with only human non-positional values than to integrate already-existing positional human values into that culture.

It appears that some humans have mainly the one type, while other humans have mainly the other type.  So talking about trying to preserve human values is pointless - the values held by different humans have already passed the most-important point of divergence.

 

Enforcing human values would be harmful

The human problem:  This argues that the qualia and values we have now are only the beginning of those that could evolve in the universe, and that ensuring that we maximize human values - or any existing value set - from now on will stop this process in its tracks and prevent anything better from ever evolving.  This is the most-important objection of all.

Re-reading this, I see that the critical paragraph is painfully obscure, as if written by Kant; but it summarizes the argument: "Once the initial symbol set has been chosen, the semantics must be set in stone for the judging function to be "safe" for preserving value; this means that any new symbols must be defined completely in terms of already-existing symbols.  Because fine-grained sensory information has been lost, new developments in consciousness might not be detectable in the symbolic representation after the abstraction process.  If they are detectable via statistical correlations between existing concepts, they will be difficult to reify parsimoniously as a composite of existing symbols.  Not using a theory of phenomenology means that no effort is being made to look for such new developments, making their detection and reification even more unlikely.  And an evaluation based on already-developed values and qualia means that even if they could be found, new ones would not improve the score.  Competition for high scores on the existing function, plus lack of selection for components orthogonal to that function, will ensure that no such new developments last."

Averaging value systems is worse than choosing one:  This describes a neural network that encodes preferences, takes some input pattern, and computes a new pattern that optimizes those preferences.  Such a system is taken as an analogue for a value system together with an ethical system for attaining those values.  I then define a measure of the internal conflict produced by a set of values, and show that a system built by averaging together the parameters from many different systems will have higher internal conflict than any of the systems that were averaged together to produce it.  The point is that the CEV plan of "averaging together" human values will result in a set of values that is worse (more self-contradictory) than any of the value systems it was derived from.
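
A toy sketch of that claim, under deliberately simplified assumptions (this is not the neural-network model from the post): treat a value system as a vector of preference weights in [-1, 1] over independent binary choices, and measure internal conflict as how far the weights sit from the decisive extremes.  Averaging decisive systems that disagree then yields a system more conflicted than either:

```python
import numpy as np

# Toy stand-in for a "value system": a vector of preference weights in [-1, 1],
# one per independent binary choice.  +1 / -1 are decisive preferences; 0 is
# maximal indecision about that choice.

def conflict(weights):
    """Internal conflict, measured as total indecision: 0 for a fully decisive
    system (all weights at +1 or -1), largest when every weight sits at 0."""
    return float(np.sum(1.0 - np.abs(weights)))

# Two decisive value systems that disagree on three of the four choices.
a = np.array([+1.0, +1.0, -1.0, +1.0])
b = np.array([-1.0, +1.0, +1.0, -1.0])

averaged = (a + b) / 2.0          # the "average the parameters" plan

print(conflict(a), conflict(b))   # 0.0 0.0 -- each source system is internally decisive
print(conflict(averaged))         # 3.0     -- the average is more conflicted than either
```

The post's actual model and conflict measure are more involved, but the qualitative effect is the same: averaging decisive-but-opposed preferences pushes the combined system toward indecision.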


A point I may not have made in these posts, but made in comments, is that the majority of humans today think that women should not have full rights, homosexuals should be killed or at least severely persecuted, and nerds should be given wedgies.  These are not incompletely-extrapolated values that will change with more information; they are values.  Opponents of gay marriage make it clear that they do not object to gay marriage based on a long-range utilitarian calculation; they directly value not allowing gays to marry.  Many human values horrify most people on this list, so they shouldn't be trying to preserve them.

Comments (428)

Comment author: RolfAndreassen 18 May 2012 03:58:18AM 13 points [-]

The human problem: This argues that the qualia and values we have now are only the beginning of those that could evolve in the universe, and that ensuring that we maximize human values - or any existing value set - from now on, will stop this process in its tracks, and prevent anything better from ever evolving. This is the most-important objection of all.

Better by which set of, ahem, values? And anyway, if evolution of values is a value, then maximising overall value will by construction take that into account.

Comment author: PhilGoetz 18 May 2012 04:16:06AM *  0 points [-]

Yes, I object less to CEV if you go one or two levels meta. But if evolution of values is your core value, you find that it's pretty hard to do better than to just not interfere except to keep the ecosystem from collapsing. See John Holland's book and its theorems showing that an evolutionary algorithm as described does optimal search.

Comment author: Wei_Dai 18 May 2012 10:34:45AM *  10 points [-]

Presumably, values will evolve differently depending on future contingencies. For example, a future with a world government that imposes universal birth control to limit population growth would probably evolve different values compared to a future that has no such global Singleton. Do you agree, and if so do you think the values evolved in different possible futures are all equivalent as far as you are concerned? If not, what criteria are you using to judge between them?

ETA: Can you explain John Holland's theorems, or at least link to the book you're talking about (Wikipedia says he wrote three). If you think allowing values to evolve is the right thing to do, I'm surprised you haven't put more effort into making a case for it, as opposed to just criticizing SI's plan.

Comment author: timtyler 18 May 2012 11:58:25PM *  1 point [-]

Probably Adaptation in Natural and Artificial Systems. Here's Holland's most famous theorem in the area. It doesn't suggest genetic algorithms make for some kind of optimal search - indeed, classical genetic algorithms are a pretty stupid sort of search.

Comment author: PhilGoetz 02 July 2012 01:17:27AM *  0 points [-]

That is the book. I'm referring to the entire contents of chapters 5-7. The schema theorem is used in chapter 7, but it's only part of the entire argument, which does show that genetic algorithms approach an optimal distribution of trials among the different possibilities, for a specific definition of optimal that is not easy to parse out of Holland's book, due to his failure to give an overview or decent summary of what he is doing. It doesn't say anything about forms of search other than those that take a big set of possible answers, which give stochastic results when tested, and allocate trials among them.
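
For reference, the schema theorem used in chapter 7 is usually stated in roughly the following form (this is the standard textbook statement, not a quotation from Holland): writing m(H, t) for the number of instances of schema H in the population at generation t, f(H) for the schema's observed average fitness, \bar{f} for the population's average fitness, \delta(H) for the defining length, o(H) for the order, l for the string length, and p_c, p_m for the crossover and mutation probabilities,

$$ E[m(H, t+1)] \;\ge\; m(H, t)\,\frac{f(H)}{\bar{f}}\left[1 - p_c\,\frac{\delta(H)}{l-1} - o(H)\,p_m\right]. $$

The allocation-of-trials argument then treats competing schemata roughly like arms of a multi-armed bandit, with above-average schemata receiving exponentially increasing numbers of trials.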

Comment author: RolfAndreassen 18 May 2012 06:06:44PM 0 points [-]

CEV is not any old set of evolved values. It is the optimal set of evolved values; the set you get when everything goes exactly right. Of your two proposed futures, one of them is a better approximation to this than the other; I just can't say which one, at this time, because of lack of computational power. That's what we want a FAI for. :)

Comment author: Wei_Dai 18 May 2012 06:31:28PM *  2 points [-]

Instead of pushing Phil to accept the entirety of your position at once, it seems better to introduce some doubt first: Is it really very hard to do better than to just not interfere? If I have other values besides evolution, should I give them up so quickly?

Also, if Phil has already thought a lot about these questions and thinks he is justified in being pretty certain about his answers, then I'd be genuinely curious what his reasons are.

Comment author: RolfAndreassen 18 May 2012 08:06:29PM 3 points [-]

I misread the nesting, and responded as though your comment were a critique of CEV, rather than Phil's objection to CEV. So I talked a bit past you.

Comment author: TheOtherDave 18 May 2012 06:15:06PM 1 point [-]

But you're evading Wei_Dai's question here.

What criteria does the CEV-calculator use to choose among those options? I agree that significant computational power is also required, but it's not sufficient.

Comment author: RolfAndreassen 18 May 2012 08:09:17PM 1 point [-]

If we were able to formally specify the algorithm by which a CEV calculator should extrapolate our values, we would already have solved the Friendliness problem; your query is FAI-complete. But informally, we can say that the CEV evaluates by whatever values it has at a given step in its algorithm, and that the initial values are the ones held by the programmers.

Comment author: DanArmak 19 May 2012 03:45:09PM 1 point [-]

The problem with this kind of reasoning (as the OP makes plain) is that there's no good reason to think such CEV maximization is even logically possible. Not only do we not have a solution, we don't have a well-defined problem.

Comment author: TheOtherDave 18 May 2012 09:10:41PM 0 points [-]

(nods) Fair enough. I don't especially endorse that, but at least it's cogent.

Comment author: RolfAndreassen 18 May 2012 06:04:47PM 6 points [-]

The whole point of CEV is that it goes as many levels meta as necessary! And the other whole point of CEV is that it is better at coming up with strategies than you are.

Comment author: PhilGoetz 02 July 2012 01:23:07AM -1 points [-]

Please explain either one of your claims. For the first, show me where something Eliezer has written indicates CEV has some notion of how meta it is going, or how meta it "should" go, or anything at all relating to your claim. The second appears to merely be a claim that CEV is effective, so its use in any argument can only be presuming your conclusion.

Comment author: RolfAndreassen 02 July 2012 04:41:57AM -1 points [-]

In poetic terms, our coherent extrapolated volition is our wish if we knew more, thought faster, were more the people we wished we were, had grown up farther together; where the extrapolation converges rather than diverges, where our wishes cohere rather than interfere; extrapolated as we wish that extrapolated, interpreted as we wish that interpreted.

My emphasis. Or to paraphrase, "as meta as we require."

Comment author: PhilGoetz 26 August 2012 10:07:33PM 0 points [-]

Writing "I define my algorithm for problem X to be that algorithm which solves problem X" is unhelpful. Quoting said definition, doubly so.

In any case, the passage you quote says nothing about how meta to go. There's nothing meta in that entire passage.

Comment author: Armok_GoB 20 May 2012 12:07:53AM 0 points [-]

CEV goes infinite levels meta, that's what the "extrapolated" part means.

Comment author: CronoDAS 21 May 2012 05:33:49AM 1 point [-]

Countably infinite levels or uncountably infinite levels? ;)

Comment author: Armok_GoB 21 May 2012 08:29:25PM 1 point [-]

Countably I think, since computing power is presumably finite so the infinity argument relies on the series being convergent.

Comment author: PhilGoetz 02 July 2012 01:21:12AM 0 points [-]

No, that isn't what the "extrapolated" part means. The "extrapolated" part means closure and consistency over inference. This says nothing at all about the level of abstraction used for setting goals.

Comment author: Lightwave 18 May 2012 07:23:14AM 8 points [-]

It seems that what you have argued here is not much related to Holden's objection 1 - his objection is that we cannot reasonably expect a safe and secure implementation of a "Friendly" utility function (even if we had one), because humans have consistently been unable to construct bug-free, correctly-working (computer) systems on the first try, proofs have been wrong, etc. You, on the other hand, are arguing against the Friendliness concept on object-level / meta-level ethical grounds.

Comment author: JoshuaZ 18 May 2012 01:38:26AM *  6 points [-]

Opponents of gay marriage make it clear that they do not object to gay marriage based on a long-range utilitarian calculation; they directly value not allowing gays to marry.

Well, most of them do so in part because their deity tells them that that's a value. If the extrapolated CEV takes into account that they are just wrong about there being such a deity, it should respond accordingly. (I'm working under the assumption, which should not be controversial, that the AGI isn't going to find out that in fact there is such a deity hanging around.)

Comment author: thomblake 18 May 2012 02:07:03PM *  3 points [-]

I'm working under the assumption, which should not be controversial, that the AGI isn't going to find out that in fact there is such a deity hanging around.

Just as helpfully, if the FAI concludes that there is a deity around who we should please and who would prefer objecting to gay marriage, it will properly regard that as a value.

Comment author: TheOtherDave 18 May 2012 02:37:49PM 1 point [-]

Or, presumably, if it concludes that there might some day come to be a deity, or other vastly powerful entity, who would prefer having objected to gay marriage.

Of course, all of this further presumes that there aren't/won't be other vastly powerful entities whose preferences have equal weight in opposite directions.

Comment author: TimS 18 May 2012 02:36:45AM 6 points [-]

There's a chicken and egg issue here. Were pre-existing anti-homosexuality values co-opted into early Judaism? Or did the Judeo-Christian ideology spread the values beyond their "natural" spread? The only empirical evidence for this question I can think of is non-Judeo-Christian attitudes. What are the historical attitudes towards homosexuality among East Asians and South Asians?

More broadly, people's attitudes towards women and nerds are just as much expressions of values, not long-ranged utilitarian calculations.

Comment author: Luke_A_Somers 18 May 2012 03:20:40AM 6 points [-]

Like most of Leviticus, the edicts against homosexuality were an attempt to belatedly change 'have no gods before me' into 'don't have any other gods, period' by banning all of the specific religious practices of the competing local religions, which involved things like, say, eating shellfish, wearing sacred garb composed of mixed fibers, etc.

So maybe some of them were homophobes, but it's not necessary; and if they'd all been homophobes there wouldn't have been a need to establish the rule.

Comment author: TimS 18 May 2012 12:55:52PM 4 points [-]

That's a good point. It fairly strongly suggests that Judeo-Christian anti-homosexuality values would not survive coherent extrapolation because it provides an explanation for why the value was included originally. As JoshuaZ stated, I don't expect religious values whose sole function was religious in-group-ism to persist after a CEV process.

Comment author: army1987 20 May 2012 11:51:05AM 0 points [-]

Well, if Christian anti-homosexuality were just a religious in-group-ism, they wouldn't be outraged by non-Christians having sex with members of the same sex any more than by (say) non-Christians eating meat on Lent Fridays. Are they?

Comment author: [deleted] 18 May 2012 11:37:38AM *  7 points [-]

What are the historical attitudes towards homosexuality among East Asians and South Asians?

Man, that's variable. Especially in South Asia, where "Hinduism" is more like a nice box for outsiders to describe a huge body of different practices and theoretical approaches, some of them quite divergent. Chastity in general was and is a core value in many cases; where that's not the case, or where the particular sect deals pragmatically with the human sex drive despite teaching chastity as a quicker path to moksha, there might be anything from embrace of erotic imagery and sexual diversity to fairly strict rules about that sort of conduct. Some sects unabashedly embrace sexuality as a good thing, including same-sex sexuality. Islam has historically been pretty doctrinally down on it, but even that has its nuances -- sodomy was often considered a grave sin and still is in many places, while non-penetrative same-sex contact might well be seen as simply a minor thing, not strictly appropriate but hardly anything to get worked up about.

"East Asia" has a very large number of religions as well, and the influence of Confucianism and Buddhism hasn't been uniform in this regard. One vague generality that I might suggest as a rough guideline is that traditionally, homosexuality is sort of tolerated in the closet -- sure, it happens, but as long as everyone keeps up appearances and doesn't make a scene or get caught doing something inappropriate, it's no big deal. Some strains within Mahayana Buddhism have a degree of deprecation of sexual or gender-variant behavior; others don't. Theravada varies as well, but in different ways.

In both cases, cultures vary tremendously. If you widen the scope, many cultures, including many of the foregoing, have traditionally been a lot more accepting of sex and gender variance. There are and were some cultures that were extremely permissive about it.

Comment author: NancyLebovitz 20 May 2012 06:29:29AM 5 points [-]

If you want more on the subject of how people think about sexuality, try Straight by Hanne Blank. She tracks the invention of heterosexuality (a concept which she says is less than a century old) in the west.

If part of CEV is finding out how much of what we think is obviously true is just stuff that people made up, life could get very strange.

Comment author: army1987 20 May 2012 11:46:51AM *  2 points [-]

She tracks the invention of heterosexuality (a concept which she says is less than a century old) in the west.

The word is likely that recent, but is she claiming that the idea of being interested in members of the other sex but not in members of the same sex as sexual partners was unheard-of before that? Or what does she mean exactly?

Comment author: NancyLebovitz 21 May 2012 12:23:48AM 1 point [-]

It's a somewhat complex book, but part of her meaning is that the idea that there are people who are only sexually interested in members of the other sex, and that this is an important category, is recent.

Comment author: paper-machine 21 May 2012 05:52:40PM 0 points [-]

How could such a thesis be viable, when so much of the historical data has been lost?

Comment author: NancyLebovitz 21 May 2012 08:18:36PM 2 points [-]

There's more historical data than you might think-- for example, the way the Catholic Church defined sexual sin in terms of actions rather than certain sins being associated with types of people who were especially tempted to engage in them.

There's also some history of how sexual normality became more and more narrowly defined (Freud has a lot to answer for), and then the definitions shifted.

A good bit of the book is available for free at amazon, and I think that would be the best way for you to see whether Blank's approach is reasonable.

Comment author: Sewing-Machine 21 May 2012 09:55:30PM *  6 points [-]

The introduction is a catalog of ambiguities about sex, gender, and sexual orientation:

My partner was diagnosed male at birth because he was born with, and indeed still has, a fully functioning penis ... My partner's DNA has a pattern that is simultaneously male, female and neither. This particular genetic pattern, XXY, is the signature of Klinefelter syndrome ...

We've known full well since Kinsey that a large minority...37 percent...of men have had at least one same-sex sexual experience in their lives.

No act of Congress or Parliament exists anywhere that defines exactly what heterosexuality is or regulates exactly how it is to be enacted.

Historians have tracked major shifts in other aspects of what was considered common or "normal" in sex and relationships: was marriage ideally an emotional relationship, or an economic and pragmatic one? Was romantic love desirable, and did it even really exist? Should young people choose their own spouses, or should marriage partners be selected by family and friends?

As unnumbered sailors, prisoners, and boarding-school boys have demonstrated, whether one behaves heterosexually or homosexually sometimes seems like little more than a matter of circumstance.

Masculinity does not look, sound, dress, or act the same for a rapper as for an Orthodox Jewish rabbinical student; a California surfer chick does femininity very differently from a New York City lady-who-lunches.

All of these are fair enough, and I've only read the introduction, but I don't have a lot of confidence that she goes on to resolve these contradictions in Less Wrong tree-falls-in-a-forest style. Instead of trying to clarify what people mean when they say something like "most people are heterosexual," I get the feeling she only wants to muddy the waters enough to say "no they aren't."

Comment author: NancyLebovitz 21 May 2012 10:25:34PM 3 points [-]

I think her point is closer to "people make things up, and keep repeating those things until they seem like laws of the universe".

A possible conclusion is that once people make a theory about how something ought to be, it's very hard to go back to the state of mind of not having an opinion about that thing.

The amazon preview includes the last couple of chapters of the book.

The book could be viewed as a large expansion of two Heinlein quotes: "Everybody lies about sex" and "Freedom begins when you tell Mrs. Grundy to fly a kite".

Comment author: paper-machine 21 May 2012 09:36:41PM *  6 points [-]

Oh, so her thesis is that in the west, orientation-as-identity dates back to 1860-ish. I can imagine that being defensible. That's way different from what you originally wrote, though.

You see, the first thing that came to mind was Aristophanes' speech in the Symposium, which explicitly recognizes orientation-as-identity and predates the Catholic Church by a couple centuries.

Comment author: NancyLebovitz 21 May 2012 10:34:03PM 2 points [-]

Thanks for the cite.

Comment author: [deleted] 20 May 2012 06:51:59AM *  0 points [-]

Hell, you don't need CEV for that. A decent anthropology textbook will get you quite a distance there (even if only superficially)...

Comment author: [deleted] 20 May 2012 07:19:27AM 3 points [-]

Can you recommend a book / author? (Interested outsider, no idea what the good stuff is, have read Jared Diamond and similar works.)

Comment author: [deleted] 22 May 2012 04:58:51AM 3 points [-]

The Reindeer People by Piers Vitebsky is a favorite of mine, which focuses on the Eveny people of Siberia. The Shaman's Coat: A Native History of Siberia, by Anna Reid, is a good overview of Siberian peoples. Marshall Sahlins' entire corpus is pretty good, although his style puts some lay readers off. Argonauts of the Western Pacific by Bronislaw Malinowski deals with Melanesian trade and business ventures. It's rather old at this point, but Malinowski had a fair influence on the development of anthropology thereafter. Wisdom Sits in Places by Keith Basso, which deals with an Apache group. The Nuer by E. E. Evans-Pritchard is older, and very dry, but widely regarded as a classic in the field. It deals with the Nuer people of Sudan. The Spirit Catches You And You Fall Down by Anne Fadiman is not strictly an ethnography, but it's very relevant to anthropological mindsets and is often required reading in first-year courses in the field. Liquidated: An Ethnography of Wall Street by Karen Ho is pretty much what it says in the title, and a bit more contemporary. Debt: The First 5000 Years by David Graeber mixes in history and economics, but it's generally relevant. Pathologies of Power by Paul Farmer focuses on the poor in Haiti. Friction: An Ethnography of Global Connection by Anna Tsing is kind of complicated to explain. Short version: it takes a look at events in Indonesia and traces out actors, groups, their motivating factors, and so on.

Comment author: NancyLebovitz 20 May 2012 06:59:36AM 1 point [-]

I wonder whether people who've studied anthropology find that it's affected their choices.

Comment author: [deleted] 20 May 2012 10:00:26AM 0 points [-]

It certainly did mine.

Comment author: NancyLebovitz 20 May 2012 11:13:43AM 2 points [-]

I'm interested in any details you'd like to share.

Comment author: [deleted] 22 May 2012 05:48:18AM *  15 points [-]

It made me a lot more comfortable dealing with people who might be seen as "regressive", "bland", "conservative" or just who seem otherwise not very in-synch with my own social attitudes and values. Getting to understand that culture and culturally-transmitted worldviews do constitute umbrella groups, but that people vary within them to similar degrees across such umbrellas, made it easier to just deal with people and adapt my own social responses to the situation, and where I feel like the person has incorrect, problematic or misguided ideas, it made it easier to choose my responses and present them effectively.

It made me more socially-conscious and a bit more socially-successful. I have some considerable obstacles there, but just having cultural details available was huge in informing my understanding of certain interactions. When I taught ESL, many of my students were Somali and Muslim. I'm also trans, and gender is a very big thing in many Islam-influenced societies (particularly ones where men and women for the most part don't socialize). I learned a bit about fashion sense and making smart choices just by noticing how the men reacted to what I wore, particularly on hot days. I learned a lot about gender-marked social behavior and signifiers from my interactions with the older women in the class and the degree to which they accepted me (which I could gauge readily by their willingness to engage in casual touch, say to get my attention or when thanking me, or the occasional hug from some of my students).

It made me a far better worldbuilder than I was before, because I have some sense of just how variable human cultures really are, and how easy it is to construct a superficially-plausible theory of human cultures, history or behavior while missing out on the incredible variance that actually exists.

It made me far less interested in evolutionary psychology as an explanation for surface-level behaviors, let alone broad social patterns of behavior, because all too often cited examples turn out to be culturally-contingent. I think the average person in Western society has a very confused idea of just how different other cultures can be.

It made me skeptical of CEV as a thing that will return an output. I'm not sure human volition can be meaningfully extrapolated, and even if it can, I'm far from persuaded that the bits of it that cohere add up to anything you'd base FAI on.

It convinced me that the sort of attitudes I see expressed on LW towards "tradition" and traditional culture (especially where that experiences conflict with global capitalism) are so hopelessly confused about the thing they're trying to address that they essentially don't have anything meaningful to say about it, or at best only cover a small subset of the cases that they're applied to. It didn't make me a purist or instill some sort of half-baked Prime Directive or anything; cultures change and they'll do that no matter what.

It helped me grasp my own cultural background and influences better. It gave me some insight into the ways in which that can lock in your perceptions and decisions, how hard it is to change that, and how easy it is to confuse that with something "innate" (and how easy it is to confuse "innate" with "genetic"). It helped me grasp how I could substitute or reprogram bits of that, and with a bit of time and practice it helped me understand the limitations on that.

There's...probably a whole ton more, but I'm running out of focus right now.

EDIT: Oh! It made me hugely more competent at navigating, interpreting and understanding art, especially from other cultures. Literary modes, aesthetics, music and styles; also narrative and its uses.

Comment author: Eliezer_Yudkowsky 22 May 2012 07:14:42PM *  10 points [-]

Fascinating, but... my Be Specific detector is going off and asking, not just for the abstract generalizations you concluded, but the specific examples that made you conclude them. Filling in at least one case of "I thought I should dress like X, but then Y happened, now I dress like Z", even - my detector is going off because all the paragraphs are describing the abstract conclusions.

Comment author: Zack_M_Davis 22 May 2012 06:08:26AM 6 points [-]

It convinced me that the sort of attitudes I see expressed on LW towards "tradition" and traditional culture [...] are so hopelessly confused about the thing they're trying to address that they essentially don't have anything meaningful to say about it

(I think this could make an interesting and valuable top-level post.)

Comment author: CronoDAS 21 May 2012 05:31:50AM 0 points [-]

Me too.

Comment author: JoshuaZ 18 May 2012 02:40:12AM *  2 points [-]

I don't know the history in East Asia, but closer to where the Abrahamic religions arose one had the ancient Greeks who were ok with most forms of homosexuality. The only reservations they had about homosexuality as I understand it had to do with issues of honor if one were a male who was penetrated.

Edit: I get the impression from this article that the attitudes of ancient Indians to homosexuality has become so bogged down in modern politics that it may be difficult for non-experts to tell. I'll try to look into this more later.

Comment author: army1987 20 May 2012 11:43:22AM 0 points [-]

The only empirical evidence for this question I can think of is non-Judeo-Christian attitudes.

IIRC, in pre-Christian Rome/Greece, homosexuality was considered OK only if the receiving partner was young enough.

Comment author: DanArmak 19 May 2012 03:57:29PM 1 point [-]

Extrapolated CEV would be working from observable evidence + a good prior. Whereas lots of people insist it's very important to them to believe in a deity through faith, despite any contrary evidence (let alone lack of evidence). How are you going to tell the CEV to ignore such values?

Comment author: paper-machine 18 May 2012 01:27:07AM 3 points [-]

(This is a revealing post, in that it takes the problem of values and treats it in a mathematically-precise way, and received many downvotes without any substantive objections to either the math or to the analogy asserting that the math is appropriate. I have found in other posts as well that making a mathematical argument based on an abstraction results in more downvotes than does merely arguing from a loose analogy.)

(emphasis added.)

Except Peter de Blanc's comments.

Comment author: ciphergoth 18 May 2012 06:58:45AM 2 points [-]

Now that the huffy remark has been removed, I can't see what post it used to refer to!

Comment author: PhilGoetz 18 May 2012 03:15:55AM *  1 point [-]

Peter deBlanc is a better mathematician than I am, so I'd better look at them.

ADDED. I see I responded to them before. I think they're good points but don't invalidate the model. I'll retract my huffy statement from the post, though.

Comment author: paper-machine 18 May 2012 03:31:59AM 1 point [-]

The point of his remarks, in my view, was that your model needed validation in the first place. Every mathematical biology or computational cognitive science paper I've read makes some attempt to rationalize why they are bothering to examine whatever idealized model is under consideration.

Comment author: Ghatanathoah 23 May 2012 07:27:16AM *  6 points [-]

• Non-positional, mutually-satisfiable values (physical luxury, for instance)
• Positional, zero-sum social values, such as wanting to be the alpha male or the homecoming queen

All mutually-satisfiable values have more in common with each other than they do with any non-mutually-satisfiable values, because mutually-satisfiable values are compatible with social harmony and non-problematic utility maximization, while non-mutually-satisfiable values require eternal conflict.

David Friedman pointed out that this isn't correct; it's actually quite easy to make positional values mutually satisfiable:

It seems obvious that, if one's concern is status rather than real income, we are in a zero sum game..... Like many things that seem obvious, this one is false. It is true that my status is relative to yours. It does not, oddly enough, follow that if my status is higher than yours, yours must be lower than mine, or that if my status increases someone else's must decrease. Status is not, in fact, a zero sum game.

This point was originally made clear to me when I was an undergraduate at Harvard and realized that Harvard had, in at least one interesting way, the perfect social system: Everyone at the top of his own ladder. The small minority of students passionately interested in drama knew perfectly well that they were the most important people at the university; everyone else was there to provide them with an audience....

Being a male nurse is not a terribly high status job—but that may not much matter if you are also King of the Middle Kingdom. And the status you get by being king does not reduce the status of the doctors who know that they are at the top of the medical ladder and the nurses at the bottom.

[Emphasis mine]

A FAI could simply make sure that everyone is a member of enough social groups that everyone has high status in some of them. Positional goals can be mutually satisficed, if one is smart enough about it. Those two types of value don't differ as much as you seem to think they do. Positional goals just require a little more work to make implementing them conflict-free than the other type does.

If you extract a set of consciously-believed propositions from an existing agent, then build a new agent to use those propositions in a different environment, with an "improved" logic, you can't claim that it has the same values, since it will behave differently.

I don't think I agree with this. Couldn't you take that argument further and claim that if I undergo some sort of rigorous self-improvement program in order to better achieve my goals in life, that that must mean I now have different values? In fact, you could easily say that I am behaving pointlessly because I'm not achieving my values better, I'm just changing them? It seems likely that most of the things that you are describing as values aren't really values, they're behaviors. I'd regard values as more "the direction in which you want to steer the world," both in terms of your external environment and your emotional states. Behaviors are things you do, but they aren't necessarily what you really prefer.

I agree that a more precise and articulate definition of these terms might be needed to create a FAI, especially if human preferences are part of a network of some sort as you claim, but I do think that they cleave reality at the joints.

I can't really see how you can attack CEV by this route without also attacking any attempt at self-improvement by a person.

A point I may not have made in these posts, but made in comments, is that the majority of humans today think that women should not have full rights, homosexuals should be killed or at least severely persecuted, and nerds should be given wedgies. These are not incompletely-extrapolated values that will change with more information; they are values. Opponents of gay marriage make it clear that they do not object to gay marriage based on a long-range utilitarian calculation; they directly value not allowing gays to marry. Many human values horrify most people on this list, so they shouldn't be trying to preserve them.

The fact that these values seem to change or weaken as people become wealthier and better educated indicates that they probably are poorly extrapolated values. Most of these people don't really want to do these things, they just think they do because they lack the cognitive ability to see it. This is emphasized by the fact that these people, when called out on their behavior, often make up some consequentialist justification for it (if I don't do it God will send an earthquake!)

I'll use an example from my own personal experience to illustrate this, when I was little (around 2-5) I thought horror movies were evil because they scared me. I didn't want to watch horror movies or even be in the same room with a horror movie poster. I thought people should be punished for making such scary things. Then I got older and learned about freedom of speech and realized that I had no right to arrest people just because they scare me.

Then I got even older and started reading movie reviews. I became a film connoisseur and became sick of hearing about incredible classic horror movies, but not being able to watch them because they scared me. I forced myself to sit through Halloween, A Nightmare on Elm Street, and The Grudge, and soon I was able to enjoy horror movies like a normal person.

Not watching horror movies and punishing the people who made them were the preferences of young me. But my CEV turned out to be "Watch horror movies and reward the people who create them." I don't think this was random value drift, I think that I always had the potential to love horror movies and would have loved them sooner if I'd had the guts to sit down and watch them. The younger me didn't have different terminal values, his values were just poorly extrapolated.

I think most of the types of people you mention would be the same if they could pierce through their cloud of self-deception. I think their values are wrong and that they themselves would recognize this if they weren't irrational. I think a CEV would extrapolate this.

But even if I'm wrong, if there's a Least Convenient Possible world where there are otherwise normal humans who have "kill all gays" irreversibly and directly programmed into their utility function, I don't think a CEV of human morality would take that into account. I tend to think that, from an ethical standpoint, malicious preferences (that is, preferences where frustrating someone else's desires is an end in itself, rather than a byproduct of competing for limited resources) deserve zero respect. I think that if a CEV took properly extrapolated human ethics it would realize this. It might not hurt to be extra careful about that when programming a CEV, however.

Comment author: [deleted] 23 May 2012 08:13:34AM 1 point [-]

I don't think this was random value drift, I think that I always had the potential to love horror movies and would have loved them sooner if I'd had the guts to sit down and watch them.

I had a somewhat similar experience growing up, although a few details are different (I never thought people should be banned from making such films or that they were evil things just because they scared me, for instance, and I made the decision to try watching some of them, mostly Alien and a few other works from the same general milieu, at a much younger age and for substantially different reasons). However, I didn't wind up loving horror movies; I wound up liking one or two films that only pushed my buttons in nice, predictable places and without actually squicking me per se. I honestly still don't get how someone can sit through films like Halloween or Friday the 13th -- I mean, I get the narrative underpinnings and some of the psychological buttons they push very well (reminds me of ghost tales and other things from my youth), but I can't actually feel the same way as your putative "normal person" when sitting through it. Even movies most people consider "very tame" or "not actually scary" make me too uncomfortable to want to sit through them, a good portion of the time. And I've actively tried to cultivate this, not for its own sake (I could go my whole life never sitting through such a film again and not be deprived, even one of the ones I've enjoyed many times) but because of the small but notable handful of horror-themed movies that I do like and the number of people I know who enjoy such films with whom I'd have even more social-yay if I did self-modify to enjoy those movies. It simply didn't take -- after much exposure and effort, I now find most such films both squicky and actively uninteresting. I can see why other people like 'em, but I can't relate.

Are my terminal values "insufficiently extrapolated?" Or just not coherent with yours?

Comment author: Ghatanathoah 23 May 2012 08:56:59AM 0 points [-]

Are my terminal values "insufficiently extrapolated?" Or just not coherent with yours?

I don't think it's either. We both have the general value, "experience interesting stories," it's just expressed in slightly different ways. I don't think that really really specific preferences for art consumption would be something that CEV extrapolates. I think CEV is meant to figure out what general things humans value, not really specific things (i.e. a CEV might say, "you want to experience fun adventure stories," it would not say "read Green Lantern #26" or "read King Solomon's Mines"). The impression I get is that CEV is more about general things like "How should we treat others?" and "How much effort should we devote to liking activities vs. approving ones?"

I don't think our values are incoherent, you don't want to stop me from watching horror movies and I don't want to make you watch them. In fact, I think a CEV would probably say "It's good to have many people who like different activities because that makes life more interesting and fun." Some questions (like "Is it okay to torture people") likely only have one true, or very few true, CEVs, but others, like matters of personal taste, probably vary from person to person. I think a FAI would probably order everyone not to torture toddlers, but I doubt it would order us all to watch "Animal House" at 9:00pm this coming Friday.

Comment author: thomblake 23 May 2012 02:18:35PM 0 points [-]

David Friedman pointed out that this isn't correct; it's actually quite easy to make positional values mutually satisfiable:

I'm glad you pointed this out - I don't think this view is common enough around here.

Comment author: Jayson_Virissimo 23 May 2012 07:49:33AM 0 points [-]

But even if I'm wrong, if there's a Least Convenient Possible world where there are otherwise normal humans who have "kill all gays" irreversibly and directly programmed into their utility function, I don't think a proper CEV would take that into account.

I'm not sure what to make of your use of the word "proper". Are you predicting that a CEV will not be utilitarian or saying that you don't want it to be?

Comment author: Ghatanathoah 23 May 2012 08:26:05AM *  -1 points [-]

I am saying that a CEV that extrapolated human morality would generally be utilitarian, but that it would grant a utility value of zero to satisfying what I call "malicious preferences." That is, if someone valued frustrating someone else's desires purely for their own sake, not because they needed the resources that person was using or something like that, the AI would not fulfill it.

This is because I think that a CEV of human morality would find the concept of malicious preferences to be immoral and discard or suppress it. My thinking on this was inspired by reading about Bryan Caplan's debate with Robin Hanson, where Bryan mentioned:

...Robin endorses an endless list of bizarre moral claims. For example, he recently told me that "the main problem" with the Holocaust was that there weren't enough Nazis! After all, if there had been six trillion Nazis willing to pay $1 each to make the Holocaust happen, and a mere six million Jews willing to pay $100,000 each to prevent it, the Holocaust would have generated $5.4 trillion worth of consumers surplus.

I don't often agree with Bryan's intuitionist approach to ethics, but I think he made a good point: satisfying the preferences of those six trillion Nazis doesn't seem like part of the meaning of right, and I think a CEV of human ethics would reflect this. I think that the preference of the six million Jews to live should be respected and the preferences of the six trillion Nazis be ignored.

I don't think this is because of scope insensitivity, or because I am not a utilitarian. I endorse utilitarian ethics for the most part, but think that "malicious preferences" have zero or negative utility in their satisfaction, no matter how many people have them. For conflicts of preferences that involve things like disputes over use of scarce resources, normal utilitarianism applies.

In response to your question I have edited my post and changed "a proper CEV" to "a CEV of human morality."

Comment author: thomblake 23 May 2012 02:15:58PM 0 points [-]

I am saying that a CEV that extrapolated human morality would generally be utilitarian, but that it would grant a utility value of zero to satisfying what I call "malicious preferences."

This is because I think that a CEV of human morality would find the concept of malicious preferences to be immoral and discard or suppress it.

Zero is a strange number to have specified there, but then I don't know the shape of the function you're describing. I would have expected a non-specific "negative utility" in its place.

Comment author: Ghatanathoah 23 May 2012 04:08:47PM -1 points [-]

Zero is a strange number to have specified there, but then I don't know the shape of the function you're describing. I would have expected a non-specific "negative utility" in its place.

You're probably right, I was typing fairly quickly last night.

Comment author: Jayson_Virissimo 23 May 2012 09:29:11AM *  0 points [-]

I don't think this is because of scope insensitivity, or because I am not a utilitarian. I endorse utilitarian ethics for the most part, but think that "malicious preferences" have zero or negative utility in their satisfaction, no matter how many people have them. For conflicts of preferences that involve things like disputes over use of scarce resources, normal utilitarianism applies.

Ah, okay. This sounds somewhat like Nozick's "utilitarianism with side-constraints". This position seems about as reasonable as the other major contenders for normative ethics, but some LessWrongers (pragmatist, Will_Sawin, etc...) consider it to be not even a kind of consequentialism.

Comment author: [deleted] 18 May 2012 01:33:14AM 9 points [-]

"A point I may not have made in these posts, but made in comments, is that the majority of humans today think that women should not have full rights, homosexuals should be killed or at least severely persecuted, and nerds should be given wedgies. These are not incompletely-extrapolated values that will change with more information; they are values. Opponents of gay marriage make it clear that they do not object to gay marriage based on a long-range utilitarian calculation; they directly value not allowing gays to marry. Many human values horrify most people on this list, so they shouldn't be trying to preserve them."

This has always been my principal objection to CEV. I strongly suspect that were it implemented, it would want the death of a lot of my friends, and quite possibly me, too.

Comment author: CronoDAS 20 May 2012 10:41:46PM 4 points [-]

Regarding CEV: My own worry is that lots of parts of human value get washed out as "incoherent" - whatever X is, if it isn't a basic human biological drive, there are enough people out there that have different opinions on it to make CEV throw up its hands, declare it an "incoherent" desire, and proceed to leave it unsatisfied. As a result, CEV ends up saying that the best we can do is just make everyone a wirehead because pleasure is one of our few universal coherent desires while things like "self-determination" and "actual achievement in the real world" are a real mess to provide and barely make sense in the first place. Or something like that.

(Universal wireheading - with robots taking care of human bodies - at least serves as a lower bound on any proposed utopia; people, in general, really do want pleasure, even if they also want other things. See also "Reedspacer's Lower Bound".)

Comment author: steven0461 20 May 2012 11:01:25PM *  3 points [-]

I would like to see more discussion on the question of how we should distinguish between 1) things we value even at the expense of pleasure, and 2) things we mistakenly alieve are more pleasurable than pleasure.

Comment author: TheOtherDave 20 May 2012 11:27:06PM 0 points [-]

Surely if there is something I will give up pleasure for, which I do not experience as pleasurable, that's strong evidence that it is an example of 1 and not 2?

Comment author: steven0461 20 May 2012 11:32:07PM *  3 points [-]

Yes, but there are other cases. If you prefer eating a cookie to having the pleasure centers in your brain maximally stimulated, are you sure that's not because eating a cookie sounds on some level like it would be more pleasurable?

Comment author: TheOtherDave 21 May 2012 02:23:49AM 0 points [-]

I'm not sure how I could ever be sure of such a thing, but it certainly seems implausible to me.

Comment author: TimS 18 May 2012 02:16:05AM 4 points [-]

That's a little unfair to the concept of CEV. If irreconcilable value conflicts persist after coherent extrapolation, I would think that a CEV function would output nothing, rather than using majoritarian analysis to resolve the conflict.

Comment author: [deleted] 18 May 2012 02:55:19PM 4 points [-]

Then since there is not one single value about which every single human being on the planet can agree, a CEV function would output nothing at all.

Comment author: thomblake 18 May 2012 02:58:16PM 1 point [-]

If irreconcilable value conflicts persist after coherent extrapolation

Then since there is not one single value about which every single human being on the planet can agree

Tense confusion.

Comment author: [deleted] 18 May 2012 03:13:45PM 7 points [-]

CEV is supposed to preserve those things that people value, and would continue to value were they more intelligent and better informed. I value the lives of my friends. Many other people value the death of people like my friends. There is no reason to think that this is because they are less intelligent or less well-informed than me, as opposed to actually having different preferences. TimS claimed that in a situation like that, CEV would do nothing, rather than impose the extrapolated will of the majority.

My claim is that there is nothing -- not one single thing -- which would be a value held by every person in the world, even were they more intelligent and better informed. An intelligent, informed psychopath has utterly different values from mine, and will continue to have utterly different values upon reflection. The CEV therefore either has to impose the majority preferences upon the minority, or do nothing at all.

Comment author: thomblake 18 May 2012 03:22:28PM 2 points [-]

There is no reason to think that this is because they are less intelligent or less well-informed than me, as opposed to actually having different preferences.

There are lots of reasons to think so. For example, they might want the death of your friends because they mistakenly believe that a deity exists.

Comment author: [deleted] 18 May 2012 03:36:51PM 4 points [-]

Or for any number of other, non-religious reasons. And it could well be that extrapolating those people's preferences would lead, not to them rejecting their beliefs, but to them wishing to bring their god into existence.

Either people have fundamentally different, irreconcilable, values or they don't. If they do, then the argument I made is valid. If they don't, then CEV(any random person) will give exactly the same result as CEV(humanity).

That means that either calculating CEV(humanity) is an unnecessary inefficiency, or CEV(humanity) will do nothing at all, or CEV(humanity) would lead to a world that is intolerable for at least some minority of people. I actually doubt that any of the people from the SI would disagree with that (remember the torture vs. dust specks argument).

That may be considered a reasonable tradeoff by the developers of an "F"AI, but it gives those minority groups, to whom the post-AI world would be inimical, equally rational reasons to oppose such a development.

Comment author: TimS 18 May 2012 04:11:29PM *  3 points [-]

As someone who does not believe in moral realism, I agree that CEV over all humans who ever lived (excluding sociopaths and such) will not output anything.

But I think that a moral realist should believe that CEV will output some value system, and that the produced value system will be right.

In short, I think one's belief about whether CEV will output something is isomorphic to one's belief in [moral realism] (plato.stanford.edu/entries/moral-realism/).

Edit: link didn't work, so separated it out.

Comment author: army1987 18 May 2012 07:17:59PM *  1 point [-]

Edit: link didn't work, so separated it out.

Have you tried putting <http://> in front of the URL?

(Edit: the backtick thing to show verbatim code isn't working properly for some reason, but you know what I mean.)

Comment author: TimS 18 May 2012 07:22:08PM *  1 point [-]

moral realism.

Edit: Apparently that was the problem. Thanks.

Edit2: It appears that copying and pasting from some places includes "http" even when my browser address doesn't. But I did something wrong when copying from the philosophy dictionary.

Comment author: [deleted] 18 May 2012 04:22:41PM -1 points [-]

I agree -- assuming that CEV didn't impose a majority view on a minority. My understanding of the SI's arguments (and it's only my understanding) is that they believe it will impose a majority view on a minority, but that they think that would be the right thing to do -- that if the choice were between 3^^^3 people getting a dustspeck in the eye or one person getting tortured for fifty years, the FAI would always make a choice, and that choice would be for the torture rather than the dustspecks.

Now, this may well be, overall, the rational choice to make as far as humanity as a whole goes, but it would most definitely not be the rational choice for the person who was getting tortured to support it.

And since, as far as I can see, most people only value a very small subset of humanity who identify as belonging to the same groups as them, I strongly suspect that in the utilitarian calculations of a "friendly" AI programmed with CEV, they would end up in the getting-tortured group, rather than the avoiding-dustspecks one.

Comment author: TheOtherDave 18 May 2012 04:33:15PM 1 point [-]

but it would most definitely not be the rational choice for the person who was getting tortured to support it.

This is not clear.

Comment author: TimS 18 May 2012 04:39:56PM *  0 points [-]

that if the choice were between 3^^^3 people getting a dustspeck in the eye or one person getting tortured for fifty years, the FAI would always make a choice, and that choice would be for the torture rather than the dustspecks

That is an entirely separate issue. If CEV(everyone) outputted a moral theory that held utility was additive, then the AI implementing it would choose torture over specks. In other words, utilitarians are committed to believing that specks is the wrong choice.

But there is no guarantee that CEV will output a utilitarian theory, even if you believe it will output something. SI (Eliezer, at least) believes CEV will output a utilitarian theory because SI believes utilitarian theories are right. But everyone agrees that "whether CEV will output something" is a different issue than "what CEV will output."

Personally, I suspect CEV(everyone in the United States) would output something deontological - and might even output something that would pick specks. Again, assuming it outputs anything.
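
To make the contrast concrete, here is a minimal sketch of the arithmetic (my own toy model, not SI's or Eliezer's actual formalism; the harm numbers, and the use of 3^27 as a stand-in for the vastly larger 3^^^3, are assumptions): an additive theory picks torture once there are enough specks, while a lexical rule that treats torture as categorically worse picks specks no matter the count.

```python
# Toy comparison of an additive utility rule vs. a lexical ("deontological-ish")
# rule on torture vs. dust specks. All numbers are illustrative assumptions.

SPECK_HARM = 1e-6      # assumed disutility of one dust speck in one eye
TORTURE_HARM = 1e6     # assumed disutility of fifty years of torture
N = 3 ** 3 ** 3        # 3^27, a drastic understatement of 3^^^3

def additive_choice(n_specked):
    # Sum harms across people; pick whichever option minimizes total harm.
    return "torture" if n_specked * SPECK_HARM > TORTURE_HARM else "specks"

def lexical_choice(n_specked):
    # Treat any torture as worse than any number of trivial harms.
    return "specks"

print(additive_choice(N))  # -> torture
print(lexical_choice(N))   # -> specks
```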

Comment author: thomblake 18 May 2012 04:45:20PM 0 points [-]

Either people have fundamentally different, irreconcilable, values or they don't. If they do, then the argument I made is valid. If they don't, then CEV(any random person) will give exactly the same result as CEV(humanity).

This is a false dilemma. If people have some values that are the same or reconcilable, and others that are not, then CEV(humanity) can output the reconcilable part - which is neither nothing nor the same as the output of CEV(any random person).

And note that an actual move by virtue ethicists is to exclude sociopaths from "humanity".

Comment author: DanArmak 19 May 2012 04:14:17PM 1 point [-]

TimS claimed that in a situation like that, CEV would do nothing, rather than impose the extrapolated will of the majority.

I agree with you in general, and want to further point out that there is no such thing as "doing nothing". If doing nothing tends to allow your friends to continue living (because they have the power to defend themselves in the status quo), that is favoring their values. If doing nothing tends to allow your friends to be killed (because they are a powerless, persecuted minority in the status quo) that is favoring the other people's values.

Comment author: TheOtherDave 18 May 2012 03:40:04PM 0 points [-]

Of course, a lot depends on what we're willing to consider a minority as opposed to something outside the set of things being considered at all.

E.g., I'm in a discussion elsethread with someone who I think would argue that if we ran CEV on the set of things capable of moral judgments, it would not include psychopaths in the first place, because psychopaths are incapable of moral judgments.

I disagree with this on several levels, but my point is simply that there's an implicit assumption in your argument that terms like "person" have shared referents in this context, and I'm not sure they do.

Comment author: [deleted] 18 May 2012 03:59:04PM -1 points [-]

In which case we wouldn't be talking about CEV(humanity) but CEV(that subset of humanity which already share our values), where "our values" in this case includes excluding a load of people from humanity before you start. Psychopaths may or may not be capable of moral judgements, but they certainly have preferences, and would certainly find living in a world where all their preferences are discounted as intolerable as the rest of us would find living in a world where only their preferences counted.

Comment author: TheOtherDave 18 May 2012 04:15:31PM *  2 points [-]

I agree that psychopaths have preferences, and would find living in a world that anti-implemented their preferences intolerable.

In which case we wouldn't be talking about CEV(humanity) but CEV(that subset of humanity which already share our values),

If you mean to suggest that the fact that the former phrase gets used in place of the latter is compelling evidence that we all agree about who to include, I disagree.

If you mean to suggest that it would be more accurate to use the latter phrase when that's what we mean, I agree.

Ditto "CEV(that set of preference-havers which value X, Y, and Z)".

Comment author: [deleted] 18 May 2012 04:25:16PM -1 points [-]

I definitely meant the second interpretation of that phrase.

Comment author: TimS 18 May 2012 04:45:05PM 2 points [-]

I hope that everyone who discusses CEV understands that a very hard part of building a CEV function would be defining the criteria for inclusion in the subset of people whose values are considered. It's almost circular, because figuring out who to exclude as "insufficiently moral" almost inherently requires the output of a CEV-like function to process.
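
To illustrate the circularity (a toy sketch of my own; the "moral score" numbers, the threshold margin, and the starting set are invented assumptions, not a proposal for how to actually do it): whose values you include depends on a CEV-like output, which in turn depends on whose values you include, so at best you can look for a fixed point.

```python
# Toy fixed-point search for "who gets included": the inclusion standard is
# computed from the currently included people, which is the circularity.

def toy_cev(included, population):
    # Stand-in for extrapolation: just the mean "moral score" of the included.
    return sum(population[name] for name in included) / len(included)

def stable_inclusion(population, margin=0.3):
    included = set(population)                 # start by including everyone
    while True:
        standard = toy_cev(included, population)
        new = {n for n, score in population.items() if score >= standard - margin}
        if new == included:                    # reached a fixed point
            return included
        included = new

population = {"alice": 0.9, "bob": 0.8, "carol": 0.7, "mallory": 0.1}
print(stable_inclusion(population))  # -> alice, bob, carol
```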

Comment author: tut 25 May 2012 04:56:35PM *  0 points [-]

'Coherent' in CEV means that it makes up a coherent value system for all of humanity. By definition that means that there will be no value conflicts in CEV. But it does not mean that you will necessarily like it.

Comment author: dlthomas 18 May 2012 05:19:07PM 2 points [-]

Um, if you would object to your friends being killed (even if you knew more, thought faster, and grew up further with others), then it wouldn't be coherent to value killing them.

Comment author: [deleted] 18 May 2012 05:24:06PM 2 points [-]

Just because I wouldn't value that, doesn't mean that the majority of the world wouldn't. Which is my whole point.

Comment author: dlthomas 18 May 2012 05:28:58PM 2 points [-]

My understanding is that CEV is based on consensus, in which case the majority is meaningless.

Comment author: steven0461 18 May 2012 08:39:01PM *  6 points [-]

Some quotes from the CEV document:

Coherence is not a simple question of a majority vote. Coherence will reflect the balance, concentration, and strength of individual volitions. A minor, muddled preference of 60% of humanity might be countered by a strong, unmuddled preference of 10% of humanity. The variables are quantitative, not qualitative.

(...)

It should be easier to counter coherence than to create coherence.

(...)

In qualitative terms, our unimaginably alien, powerful, and humane future selves should have a strong ability to say "Wait! Stop! You're going to predictably regret that!", but we should require much higher standards of predictability and coherence before we trust the extrapolation that says "Do this specific positive thing, even if you can't comprehend why."

Though it's not clear to me how the document would deal with Wei Dai's point in the sibling comment. In the absence of coherence on the question of whether to protect, persecute, or ignore unpopular minority groups, does CEV default to protecting them or ignoring them? You might say that as written, it would obviously not protect them, because there was no coherence in favor of doing so; but what if protection of minority groups is a side effect of other measures CEV was taking anyway?

(For what it's worth, I suspect that extrapolation would in fact create enough coherence for this particular scenario not to be a problem.)
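
Here is one toy way to operationalize that passage (my own sketch; the document gives no formula, so the strength-times-clarity weighting, the counter_weight, and the threshold are all invented assumptions): a weak, muddled 60% can be outweighed by a strong, clear 10%, and opposing an action counts for more than establishing it.

```python
# Toy "coherence" score: weight each volition by strength and clarity, and
# make opposition count more than support (countering is easier than creating).

from dataclasses import dataclass

@dataclass
class Volition:
    direction: int    # +1 supports the action, -1 opposes it
    strength: float   # how strongly it is held, 0..1
    clarity: float    # 1 = unmuddled, 0 = fully muddled

def coherent_support(volitions, counter_weight=2.0, threshold=0.2):
    score = 0.0
    for v in volitions:
        w = v.strength * v.clarity
        score += w if v.direction > 0 else -counter_weight * w
    return score / len(volitions) > threshold

# 60% hold a weak, muddled preference for the action; 10% hold a strong,
# unmuddled preference against it; 30% don't care.
crowd = ([Volition(+1, 0.3, 0.3)] * 60
         + [Volition(-1, 0.9, 0.9)] * 10
         + [Volition(+1, 0.0, 0.0)] * 30)
print(coherent_support(crowd))  # -> False: the clear minority counters the muddled majority
```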

Comment author: dlthomas 18 May 2012 08:56:29PM 0 points [-]

Thank you. So, not quite consensus but similarly biased in favor of inaction.

Comment author: Wei_Dai 18 May 2012 06:40:56PM 4 points [-]

My understanding is that CEV is based on consensus, in which case the majority is meaningless.

If CEV doesn't positively value some minority group not being killed (i.e., if it's just indifferent due to not having a consensus), then the majority would be free to try to kill that group. So we really do need CEV to say something about this, instead of nothing.

Comment author: dlthomas 18 May 2012 06:42:45PM 0 points [-]

Assuming we have no other checks on behavior, yes. I'm not sure, pending more reflection, whether that's a fair assumption or not...

Comment author: DanArmak 19 May 2012 04:00:28PM 2 points [-]

There is absolutely no reason to think that the values of all humans, extrapolated in some way, will arrive at a consensus.

Comment author: 4hodmt 18 May 2012 05:41:55PM 1 point [-]

Why do we need a single CEV value system? A FAI can calculate as many value systems as it needs and keep incompatible humans separate. Group size is just another parameter to optimize. Religious fundamentalists can live in their own simulated universe, liberals in another.

Comment author: TheOtherDave 18 May 2012 06:21:01PM 9 points [-]

Upvoting back to zero because I think this is an important question to address.

If I prefer that people not be tortured, and that's more important to me than anything else, then I ought not prefer a system that puts all the torturers in their own part of the world where I don't have to interact with them over a system that prevents them from torturing.

More generally, this strategy only works if there's nothing I prefer/antiprefer to exist, but merely things that I prefer/antiprefer to be aware of.

Comment author: dlthomas 18 May 2012 06:26:43PM 0 points [-]

It's a potential outcome, I suppose, in that

[T]here's nothing I prefer/antiprefer to exist, but merely things that I prefer/antiprefer to be aware of.

is a conceivable extrapolation from a starting point where you antiprefer something's existence (in the extreme, with MWI you may not have much say what does/doesn't exist, just how much of it in which branches).

It's also possible that you hold both preferences (prefer X not exist, prefer not to be aware of X) and the existence preference gets dropped for being incompatible with other values held by other people while the awareness preference does not.

Comment author: TimS 18 May 2012 05:56:17PM 7 points [-]

The child molester cluster (where they grow children simply to molest them, then kill them) doesn't bother you, even if you never interact with it?

Because I'm fairly certain I wouldn't like what CEV(child molester) would output and wouldn't want an AI to implement it.

Comment author: 4hodmt 18 May 2012 06:25:57PM 1 point [-]

Assuming 100% isolation it would be indistinguishable from living in a universe where the Many Worlds Interpretation is true, but it still seems wrong. The FAI could consider avoiding groups whose even theoretical existence could cause offence, but I don't see any good way to assign weight to this optimization pressure.

Even so, I think splitting humanity into multiple groups is likely to be a better outcome than a single group. I don't consider the "failed utopia" described in http://lesswrong.com/lw/xu/failed_utopia_42/ to be particularly bad.

Comment author: Sewing-Machine 18 May 2012 06:32:29PM 3 points [-]

Assuming 100% isolation it would be indistinguishable from living in a universe where the Many Worlds Interpretation is true

Well, not if "child-molesters" and "non-child-molesters" are competing for limited resources.

Comment author: TimS 18 May 2012 06:42:55PM 0 points [-]

The failed utopia is better than our current world, certainly. But the genie isn't Friendly.

In principle, I could interact with the immoral cluster. The AI's interference is not relevant to the morality of the situation, because I was part of the creation of the AI. Otherwise, I would be morally justified in ignoring the suffering in some distant part of the world because it will have no practical impact on my life. By contrast, I simply cannot interact with other branches under the MWI - it's a baked-in property of the universe that I never had any input into.

Comment author: Sewing-Machine 18 May 2012 06:00:50PM 4 points [-]

What if space travel turns out to be impossible, and the superintelligence has to allocate the limited computational resources of the solar system?

Comment author: drnickbone 18 May 2012 05:12:35PM *  1 point [-]

Wouldn't CEV need to extract consensus values under a Rawlsian "veil of ignorance"?

It strikes me as very unlikely that there would be a consensus (or even majority) vote for killing gays or denying full rights to women under such a veil, because of the significant probability of ending up gay, and the more than 50% probability of being a woman. Prisons would be a lot better as well. The only reason illiberal values persist is because those who hold them know (or are confident) that they're not personally going to be victims of them.

So CEV is either going to end up very liberal, or if done without the veil of ignorance, is not going to end up coherent at all. Sorry if that's politics, the mind-killer.
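
To make the decision-theoretic point explicit (a toy sketch of my own; the group shares and welfare numbers are invented), a chooser behind the veil who weighs each position by the chance of occupying it, or who applies Rawls's own maximin rule, prefers the egalitarian arrangement:

```python
# Toy veil-of-ignorance comparison: expected welfare and Rawlsian maximin
# for an egalitarian vs. a discriminatory arrangement. Numbers are assumptions.

shares = {"group_a": 0.45, "group_b": 0.50, "group_c": 0.05}  # chance of being born into each group

egalitarian    = {"group_a": 7, "group_b": 7, "group_c": 7}
discriminatory = {"group_a": 9, "group_b": 3, "group_c": 0}   # group_a gains, b and c lose

def expected_welfare(policy):
    return sum(shares[g] * w for g, w in policy.items())

def maximin(policy):
    # Rawls's criterion: judge an arrangement by its worst-off position.
    return min(policy.values())

print(expected_welfare(egalitarian), maximin(egalitarian))        # ~7.0, 7
print(expected_welfare(discriminatory), maximin(discriminatory))  # ~5.55, 0
```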

Comment author: TheOtherDave 18 May 2012 05:40:20PM *  6 points [-]

Note that there's nothing physically impossible about altering the probability of being born gay, straight, bi, male, female, asexual, etc.

Comment author: drnickbone 18 May 2012 06:19:53PM 1 point [-]

True, and this could create some interesting choices for Rawlsians with very conservative values. Would they create a world with no gays, or no women? Would they do both???

Comment author: TheOtherDave 18 May 2012 06:22:49PM 6 points [-]

I don't know how to reply to this without violating the site's proscription on discussions of politics, which I prefer not to do.

Comment author: drnickbone 18 May 2012 09:25:25PM 2 points [-]

OK - the comment was pretty flippant anyway. Consider it withdrawn.

Comment author: TimS 18 May 2012 07:11:18PM *  4 points [-]

Heinlein's "Starship Troopers" discusses the death penalty imposed on a violent child rapist/murder. The narrator says there are two possibilities:

1) The killer was so deranged he didn't know right from wrong. In that case, killing him (or imprisoning him) is the only safe solution for the rest. Or,
2) The killer knew right from wrong, but couldn't stop himself. Wouldn't killing (or stopping) him be a favor, something he would want?

Why can't that type of reasoning exist behind the veil of ignorance? Doesn't it completely justify certain kinds of oppression? That said, there's also an empirical question whether the argument applies to the particular group being oppressed.

Comment author: gwern 18 May 2012 08:25:12PM *  9 points [-]

Not dealing with your point, but that sort of analysis is why I find Heinlein so distasteful - the awful philosophy. For example in #1, 5 seconds of thought suffices to think of counterexamples like temporary derangements (drug use, treatable disease, particularly stressful circumstances, blows to the head), and more effort would likely turn up powerful empirical evidence, such as the observation that most murderers do not murder again even after release (and obviously not after execution).

Comment author: TimS 18 May 2012 08:42:12PM *  1 point [-]

Absolutely. What finally made me realize that Heinlein was not the bestest moral philosopher ever was noticing that all his books contained superheroes - Stranger in a Strange Land is the best example. I'm not talking about the telekinetic powers, but the mental discipline. His moral theory might work for human-like creatures with perfect mental discipline, but for ordinary humans . . . not so much.

Comment author: fubarobfusco 19 May 2012 02:19:52AM 1 point [-]

This was pretty common in sf of the early 20th century, actually — the trope of a special group of people with unusual mental disciplines giving them super powers and special moral status. See A. E. van Vogt (the Null-A books) or Doc Smith (the Lensman books) for other examples. There's a reason Dianetics had so much success in the sf community of that era, I suspect — fans were primed for it.

Comment author: NancyLebovitz 20 May 2012 06:05:22AM *  1 point [-]

Is that true of all of Heinlein's books? I would say that most of them (including Starship Troopers) don't have superheroes.

Comment author: Nornagest 20 May 2012 06:28:50AM 1 point [-]

Well, I'm not exactly a Heinlein scholar, but I'd say it shows up mainly in his late-period work, post Stranger in a Strange Land. Time Enough for Love and its sequels definitely qualify, but some of the stuff he's most famous for -- The Moon is a Harsh Mistress, Have Space Suit - Will Travel, et cetera -- don't seem to. Unfortunately, Heinlein's reputation is based mainly on that later stuff.

Comment author: TimS 20 May 2012 09:00:02PM 0 points [-]

The revolution in "Moon is a Harsh Mistress" cannot succeed without the aid of the supercomputer. That makes any moral philosophy implicit in that revolution questionable to the extent one asserts that the moral philosophy is true of humanity now.

To a lesser extent, "Starship Troopers" asserts that military service is a reliable way of screening for the kinds of moral qualities (like mental discipline) that make one trustworthy enough to be a high government official (or even to vote, if I recall correctly). In reality, those moral qualities are thin on the ground, being much less common than the book suggests. If the appropriate moral qualities were really that frequent, the sanity line would already be much higher than it is.

Comment author: dlthomas 18 May 2012 09:57:59PM 1 point [-]

As long as we're using sci-fi to inform our thinking on criminality and corrections, The Demolished Man is an interesting read.

Comment author: drnickbone 18 May 2012 09:35:48PM *  0 points [-]

What would a Rawlsian decider do? Institute a prison and psychiatric system, and some method of deciding between case 1 (psychiatric imprisonment to try to treat, or at least prevent, further harm) and case 2 (criminal imprisonment to deter like-minded people and prevent further harm from the killer/rapist). Also set up institutions for detecting and encouraging early treatment of child sex offenders before they escalate to murder.

They would not want the death penalty in either case, nor would they want the prison/psychiatric system to be so appalling that they might prefer to be dead.

The Rawlsian would need to weigh the risk of being the raped/murdered child (or their parent) against the risk of being born with psychopathic or paedophile tendencies. If there was genuinely a significant deterrent from the death penalty, then the Rawlsian might accept it. But that looks unlikely in such cases.

Comment author: DanArmak 19 May 2012 04:05:49PM 2 points [-]

Just because some despised minorities exist today, doesn't mean they will continue to exist in the future under CEV. If a big enough majority clearly wishes that "no members of that group continue to exist" (e.g. kill existing gays AND no new ones ever to be born), then the CEV may implement that, and the veil of ignorance won't change this, because you can't be ignorant about being a minority member in a future where no-one is.

Comment author: Zack_M_Davis 18 May 2012 06:25:42PM 2 points [-]

The only reason illiberal values persist is because those who hold them know (or are confident) that they're not personally going to be victims of them.

You might be right, but I'm less sure of this.

Someone with more historical or anthropological knowledge than I is welcome to correct me, but I'm given to understand that many of those whom we would consider victims of an oppressive social system actually support the system. (E.g., while women's suffrage seems obvious now, there were many female anti-suffragists at the time.) It's likely that such sentiments would be nullified by a "knew more, thought faster, &c." extrapolation, but I don't want to be too confident about the output of an algorithm that is as yet entirely hypothetical.

Furthermore, the veil of ignorance has its own problems: what does it mean for someone to have possibly been someone else? To illustrate the problem, consider an argument that might be made by (our standard counterexample) a hypothetical agent who wants only to maximize the number of paperclips in the universe:

The only reason non-paperclip-maximizing values persist is because those who hold them know (or are confident) that they're not personally going to be victims of them (because they already know that they happened to have been born as humans rather than paperclip-maximizers).

---which does not seem convincing. Of course, humans in oppressed groups and humans in privileged groups are inexpressibly more similar to each other than humans are to paperclip-maximizers, but I still think this thought experiment highlights a methodological issue that proponents of a veil of ignorance would do well to address.

Comment author: drnickbone 18 May 2012 06:40:23PM *  3 points [-]

Someone with more historical or anthropological knowledge than I is welcome to correct me, but I'm given to understand that many of those whom we would consider victims of an oppressive social system actually support the system.

Isn't the main evidence that victims of oppressive social systems want to escape from them at every opportunity? There are reasons for refugees, and reasons that the flows are in consistent directions.

And if anti-suffragism had been truly popular, then having got the vote, women would have immediately voted to take it away again. Does this make sense?

Some other points:

  1. CEV is about human values, and human choices, rather than paper-clippers. I doubt we'd get a CEV across wildly-different utility functions in the first place.

  2. I'm happy to admit that CEV might not exist in the veil of ignorance case either, but it seems more likely to.

  3. I'm getting a few down-votes here. Is the general consensus that this is too close to politics, and that it is a taboo subject (as it is a mind-killer)? Or is the "veil of ignorance" idea not an important part of CEV?

Comment author: Nornagest 18 May 2012 05:51:52PM 3 points [-]

Most of those who propose illiberal values do not do so under the presumption that they thereby harm the affected groups. A paternalistic attitude is much more common, and is not automatically inconsistent with preferences beyond a Rawlsian veil of ignorance.

An Omelasian attitude also seems consistent, for that matter, though even less likely.

Comment author: drnickbone 18 May 2012 06:11:21PM 2 points [-]

As a matter of empirical fact, I think this is wrong. Men in sexist societies are really glad they're not women (and even thank God they are not in some cases). They are likely to run in horror from the Rawlsian veil when they see the implications.

And anyway, isn't that paternalism itself inconsistent with Rawlsian ignorance? Who would voluntarily accept a more than 50% chance of being treated like a patronized child (and a second-class citizen) for life?

And how is killing gays in the slightest bit a paternalistic attitude?

I'd never heard of Omelas, or anything like it, so I doubt this will be part of CEV. Again, who would voluntarily accept the risk of being such a scapegoat, if it were an avoidable risk? (If it is not avoidable for some reason, then that is a fact that CEV would have to take into account, as would the Rawlsian choosers).

Comment author: Nornagest 18 May 2012 07:09:34PM *  8 points [-]

Who would voluntarily accept a more than 50% chance of being treated like a patronized child (and a second-class citizen) for life?

Someone believing that this sort of paternalism is essential to gender and unable or unwilling to accept a society without it. Someone convinced that this was part of God's plan or otherwise metaphysically necessary. Someone not very fond of making independent decisions. I don't think any of these categories are strikingly rare.

That's about as specific as I'd like to get; anything more so would incur an unacceptable risk of political entanglements. In general, though, I think it's important to distinguish fears and hatreds arising against groups which happen to be on the wrong side of some social line (and therefore identity) from the processes that led to that line being drawn in the first place: it's possible, and IMO quite likely, for people to coherently support most traditional values concerning social dichotomies without coherently endorsing malice across them. This might not end up being stable, human psychology being what it is, but it doesn't seem internally inconsistent.

The way people's values intersect with the various consequences of their identities is quite complicated and I'm not sure I completely understand it, but I wouldn't describe either as a subset of the other.

(Incidentally, around 51% of human births are male; more living humans are female but that's because women live longer. This has absolutely no bearing on the argument, but it was bugging me.)

Comment author: drnickbone 18 May 2012 09:57:14PM *  1 point [-]

Thanks for the reply here, that was helpful.

What you've described here is a person who would put adherence to an ideological system (or set of values derived from that system) above their own probable welfare. They would reason to themselves: yes, my own personal welfare would probably be higher in an egalitarian society (or the risk of low personal welfare would be lower); but stuff that, I'm going to implement my current value system anyway, even if it comes back to shoot me in the foot.

I agree that's possible, but my impression is that very few humans would really want to do that. The tendency to put personal welfare first is enormous, and I really do believe that most of us would do that if choosing behind a Rawlsian veil.

What's odd is that it is a classical conservative insight that human beings are mostly self-interested, and rather risk-averse, and that society needs to be constructed to take that into account. It's an insight I agree with, by the way, and yet it is precisely this insight that leads to Rawlsian liberalism. Whereas to choose a different (conservative) value system, the choosers have to sacrifice their self-interest to that value system.

Comment author: Nornagest 18 May 2012 10:13:35PM *  5 points [-]

What you've described here is a person who would put adherence to an ideological system (or set of values derived from that system) above their own probable welfare.

Self-assessed welfare isn't cleanly separable from ideology. People aren't strict happiness maximizers; we value all sorts of abstract things, many of which are linked to the social systems and identity groups in which we're embedded. Sometimes this ends up looking pretty irrational from the outside view, but from the inside giving them up would look unattractive for more or less the same reason that wireheading is unattractive to (most of) us.

Now, this does drift over time, both on a sort of random walk and in response to environmental pressures, which is what allows things like sexual revolutions to happen. During phase changes in this space, the affected social dichotomies are valued primarily in terms of avoiding social costs; that's the usual time when they're a salient issue instead of just part of the cultural background, and so it's easy to imagine that that's always what drives them. But I don't think that's the case; I think there's a large region of value space where they really are treated as intrinsic to welfare, or as first-order consequences of intrinsic values.

Comment author: drnickbone 19 May 2012 05:28:46PM 0 points [-]

Thanks again. I'm still not sure of the exact point you are making here, though.

Let's take gender-based discrimination and unequal rights as a sample case. Are you arguing that someone wedded to an existing gender-biased value system would deliberately select a discriminatory society (over an equal rights one) even if they were choosing on the basis of self-interest? That they would fully understand that they have roughly 50% chance of getting the raw end of the deal, but still think that this deal would maximise their welfare overall?

I get the point that a committed ideologue could consciously decide here against self-interest. I'm less clear how someone could decide that way while still thinking it was in their self-interest. The only way I can make sense of such a decision is if were made on the basis of faulty understanding (i.e. they really can't empathize very well, and think it would not be so bad after all to get born female in such a society).

In a separate post, I suggested a way that an AI could make the Rawlsian thought experiment real, by creating a simulated society to the deciders' specifications, and then beaming them into roles in the simulation at random (via virtual reality/total immersion/direct neural interface or whatever). One variant to correct for faulty understanding might be to do it on an experimental basis. Once the choosers think they have made their minds up, they get beamed into a few randomly-selected folks in the sim, maybe for a few days or weeks (or years) at a time. After the experience of living in their chosen world for a while, in different places, times, roles etc. they are then asked if they want to change their mind. The AI will repeat until there is a stable preference, and then beam in permanently.

Returning to the root of the thread, the original objection to CEV was that most people alive today believe in unequal rights for women and essentially no rights for gays. The key question is therefore whether most people would really choose such a world in the Rawlsian set-up. And then, would most people continue to so-choose even after living in that world for a while in different roles?

If the answers are "no" then the Rawlsian veil of ignorance can remove this particular objection to CEV. If they are "yes" then it cannot. Agreed?

Comment author: Nornagest 19 May 2012 10:02:28PM *  2 points [-]

Are you arguing that someone wedded to an existing gender-biased value system would deliberately select a discriminatory society (over an equal rights one) even if they were choosing on the basis of self-interest? That they would fully understand that they have roughly 50% chance of getting the raw end of the deal, but still think that this deal would maximise their welfare overall?

Yes and no. Someone who'd internalized a discriminatory value system -- who really believed in it, not just belief-in-belief, to use LW terminology -- would interpret their self-interest and therefore their welfare in terms of that value system. They would be conscious of what we would view as unequal rights, but would see these as neutral or positive on both sides, not as one "getting the raw end of the deal" -- though they'd likely object to some of their operational consequences. This implies, of course, a certain essentialism, and only applies to certain forms of discrimination: recent top-down imposition of values isn't stable in this way.

As a toy example, read 1 Corinthians 11, and try to think of the mentality implied by taking that as the literal word of God -- not just advice from some vague authority, but an independent axiom of a value system backed by the most potent proofs imaginable. Applied to an egalitarian society, what would such a value system say about the (value-subjective) welfare of the women -- or for that matter the men -- in it?

the original objection to CEV was that most people alive today believe in unequal rights for women and essentially no rights for gays. The key question is therefore whether most people would really choose such a world in the Rawlsian set-up. And then, would most people continue to so-choose even after living in that world for a while in different roles?

This, on the other hand, is essentially an anthropology question. The answer depends on the extent of discriminatory traditional cultures, on the strength of belief in them, and on the commonalities between them: "unequal rights" isn't a value, it's a judgment call over a value system, and the specific unequal values that we object to may be quite different between cultures. I'm not an anthropologist, so I can't really answer that question -- but if I had to, I'd doubt that a reflectively stable consensus exists for egalitarianism or for any particular form of discrimination, with or without the Rawlsian wrinkle.

Comment author: drnickbone 19 May 2012 10:32:39PM *  0 points [-]

So this would be like the "separate but equal" argument? To paraphrase in a gender context: "Men and women are very different, and need to be treated differently under the law - both human and divine law. But it's not like the female side is really worse off because of this different treatment".

That - I think - would count as a rather basic factual misunderstanding of how discrimination really works. It ought to be correctable pretty damn fast by a trip into the simulator.

(Incidentally, I grew up in a fundamentalist church until my teens, and one of the things I remember clearly was the women and teen girls being very upset about being told that they had to shut up in church, or wear hats or long hair, or that they couldn't be elders, or whatever. They also really hated having St Paul and the Corinthians thrown at them; the ones who believed in Bible inerrancy were sure the original Greek said something different, and that the sacred text was being misinterpreted and spun against them. Since it is an absolute precondition for an inerrantist position that correct interpretations are difficult, and up for grabs, this was no more unreasonable than the version spouted by the all-male elders.)

Comment author: evand 19 May 2012 07:34:15PM 2 points [-]

The obvious (paternalistic) answer is that they believe that, conditional on being born female, their self-interest is better served by paternalistic treatment of all women than by equality.

In order to convince them otherwise, you would (at a minimum) have to run multiple world sims, not just multiple placements in one world. You would also have to forcibly give them sufficiently rational thought processes that they could interpret the evidence you forced upon them. I'm not sure that forcibly messing with people's thought processes is ethical, or that you could really claim it was a coherent extrapolation after you had performed that much involuntary mind surgery on them.

Comment author: drnickbone 19 May 2012 08:16:20PM *  0 points [-]

In order to convince them otherwise, you would (at a minimum) have to run multiple world sims, not just multiple placements in one world. You would also have to forcibly give them sufficiently rational thought processes that they could interpret the evidence you forced upon them

Disagree. A simple classroom lesson is often sufficient to get the point across:

http://www.uen.org/Lessonplan/preview.cgi?LPid=536

Discrimination REALLY sucks.

Comment author: NancyLebovitz 20 May 2012 06:18:19AM 2 points [-]

That they would fully understand that they have roughly 50% chance of getting the raw end of the deal, but still think that this deal would maximise their welfare overall?

A lot of oppression of women seems to be justified by claims that if women aren't second-class citizens, they won't choose to have children, or at least not enough children for replacement. This makes women's rights into an existential risk.

Comment author: DanArmak 22 May 2012 01:37:27PM *  0 points [-]

This argument also implies that societies and smaller groups where women have lower status and more children will out-breed and so eventually outcompete societies where women have equal rights. So people can also defend the lower status of women as a nationalistic or cultural self-defense impulse.

Comment author: JoachimSchipper 19 May 2012 03:30:32PM 3 points [-]

I've met women who honestly and persistently profess that women should not be allowed to vote. In at least one case, even in private, to a person they really want to like them and who very clearly disagrees with them.

Comment author: drnickbone 19 May 2012 05:59:17PM *  1 point [-]

That doesn't surprise me... I've had the same experience once or twice, in mixed company, and with strong feminists in the room. The subsequent conversations along the lines of "But women chained themselves to railings, and threw themselves under horses to get the vote; how can you betray them like that?" were quite amusing. Especially when followed by the retort "Well I've got a right to my own opinion just as much as anyone else - surely you respect that as a feminist!"

I've also met quite a few people who think that no-one should vote. ("If it did any good, it would have been abolished years ago" a position I have a lot more sympathy for these days than I ever used to).

My preferred society (in a Rawlsian setting) might not actually have much voting at all, except on key constitutional issues. State and national political offices (parliaments, presidents etc) would be filled at random (in an analogue to jury service) and for a limited time period. After the victims had passed a few laws and a budget, they would be allowed to go home again. No-one would give a damn about gaffes, going off message, or the odd sex scandal, because it would happen all the time, and have very limited impact. I think there would also need to be mandatory citizen service on boring committees, local government roles, planning permission and drainage enquiries etc to stop professional civil servants, lobbyists or wonks ruling the roost: the necessary tedium would be considered part of everyone's civic duty. This - in my opinion - is probably the biggest problem with politics. Much of it is so dull, or soul-destroying, that no-one with any sense wants to do it, so it is left to those without any sense.

Comment author: DanArmak 19 May 2012 04:08:05PM 2 points [-]

And how is killing gays in the slightest bit a paternalistic attitude?

Kill their bodies, save their souls.

Comment author: TimS 18 May 2012 05:45:40PM *  1 point [-]

Isn't there substantial disagreement over whether the veil of ignorance is sufficient or necessary to justify a moral theory?

Edit: Or just read what Nornagest said

Comment author: drnickbone 18 May 2012 06:16:10PM 0 points [-]

Perhaps, but I think my point stands. CEV will use a veil of ignorance, or it won't be coherent. It may be incoherent with the veil as well, but I doubt it. Real human beings look after number one much more than they'd ever care to admit, and won't take stupid risks when choosing under the veil.

One very intriguing thought about an AI is that it could make the Rawlsian choice a real one. Create a simulated society to the choosers' preferences, and then beam them in at random...

Comment author: NancyLebovitz 20 May 2012 06:07:30AM *  0 points [-]

Even with a veil of ignorance, people won't make the same choices-- people fall in different places on the risk aversion/reward-seeking spectrum.

Comment author: TheOtherDave 18 May 2012 01:33:02AM 4 points [-]

Thanks for tying these together.

I would love to hear someone who believes in the in-principle viability of performing a bottom-up extrapolation of human values into a coherent whole - one that can be implemented by a system vastly different from a human in a way I ought to endorse - make a case for that viability that addresses these concerns specifically. While I don't fully agree with everything said here, it captures much of my own skepticism about that viability much more coherently than I've been able to express it myself.

Comment author: TimS 18 May 2012 01:02:50PM *  0 points [-]

that can be implemented by a system vastly different from a human in a way I ought to endorse

Why do you think this part is difficult? If there are any coherent human value systems, then it seems very plausible (if difficult to build) for any agent to implement the value system, even if the agent isn't human.

Put slightly differently, my objection to the possibility of a friendly-to-Catholicism AI is that Catholicism (like basically all human value systems) is not coherent. If it were proved coherent, I would agree that it was possible to build an AI that followed it (obviously, I'd personally oppose building such an AI - it would be an act of violence against me).

Comment author: TheOtherDave 18 May 2012 02:21:08PM 0 points [-]

I don't mean to imply that, given that we've performed a bottom-up extrapolation of human values into a coherent whole, implementing that whole in a system vastly different from a human is necessarily difficult.

Indeed, by comparison to the first part, it's almost undoubtedly trivial, as you suggest.

Rather, I mean that what is at issue is extrapolating the currently instantiated value systems into "a coherent whole that can be implemented by a system vastly different from a human".

That said, I do think it's worthwhile to distinguish between "Catholicism" and "the result of extrapolating Catholicism into a coherent whole." The latter, supposing it existed, might not qualify as an example of the former. The same is true of "human value".

Comment author: TimS 18 May 2012 01:35:48AM 5 points [-]

the majority of humans today think that women should not have full rights, homosexuals should be killed or at least severely persecuted, and nerds should be given wedgies. These are not incompletely-extrapolated values that will change with more information; they are values. Opponents of gay marriage make it clear that they do not object to gay marriage based on a long-range utilitarian calculation; they directly value not allowing gays to marry.

Without endorsing the remainder of your argument, I agree that these observations must be adequately explained, and rejection of the conclusions well justified - or the concept of provably Friendly AI must be considered impossible.

Comment author: jacobt 18 May 2012 07:28:54AM 3 points [-]

The human problem: This argues that the qualia and values we have now are only the beginning of those that could evolve in the universe, and that ensuring that we maximize human values - or any existing value set - from now on, will stop this process in its tracks, and prevent anything better from ever evolving. This is the most-important objection of all.

If you can convince people that something is better than present human values, then CEV will implement these new values. I mean, if you just took CEV(PhilGoetz), and you have the desire to see the universe adopt "evolved" values, then CEV will extrapolate this desire. The only issue is that other people might not share this desire, even when extrapolated. In that case insisting that values "evolve" is imposing minority desires on everyone, mostly people who could never be convinced that these values are good. Which might be a good thing, but it can be handled in CEV by taking CEV(some "progressive" subset of humans).

Comment author: cousin_it 18 May 2012 08:19:37AM *  9 points [-]

This seems a nice place to link to Marcello's objection to CEV, which says you might be able to convince people of pretty much anything, depending on the order of arguments.

Comment author: torekp 28 May 2012 12:04:15AM 0 points [-]

I think Marcello's objection dissolves when the subject becomes aware of the order-of-arguments effects. After all, those effects are part of the factual information that the subject considers in refining its values. Most people don't like to have values that change depending on the order in which arguments are presented, so they will reflect further until they each find a stable value set. At least, that would be my hypothesis.
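
One toy way to picture that hypothesis (my own sketch; the update rule is an invented assumption, not Marcello's or anyone's actual model): a naive updater who applies arguments one at a time ends up somewhere different for each ordering, while a reflective updater who weighs all the arguments together is order-independent by construction.

```python
# Toy contrast between order-dependent and order-independent value updating.

from itertools import permutations

def naive_update(value, arguments):
    # Each argument pulls the current value partway toward its conclusion,
    # so earlier arguments get diluted by later ones.
    for target, pull in arguments:
        value += pull * (target - value)
    return value

def reflective_update(value, arguments):
    # Weigh every argument at once; the result cannot depend on order.
    total_pull = sum(pull for _, pull in arguments)
    weighted_target = sum(pull * target for target, pull in arguments) / total_pull
    return value + min(total_pull, 1.0) * (weighted_target - value)

arguments = [(1.0, 0.5), (-1.0, 0.5), (0.5, 0.5)]
print({round(naive_update(0.0, p), 3) for p in permutations(arguments)})       # six different endpoints
print({round(reflective_update(0.0, p), 3) for p in permutations(arguments)})  # one endpoint
```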

Comment author: gRR 18 May 2012 01:17:35PM -1 points [-]

I think it would be impossible to convince people (assuming suitably extrapolated intelligence and knowledge) that the total obliteration of all life on Earth is a good thing, no matter the order of arguments. And this is a very good value for a FAI. If it optimizes this (saving life) and otherwise interferes the least, it will already have done excellently.

Comment author: thomblake 18 May 2012 01:18:46PM 3 points [-]

There are nihilists who at least claim that position.

Comment author: DanArmak 19 May 2012 03:31:09PM 0 points [-]

Lots of people honestly wish for the literal end of the universe to come, because they believe in an afterlife/prophecy/etc.

You might say they would change their minds given better or more knowledge (e.g. that there is no afterlife and the prophecy was false/fake/wrong). But such people are often exposed to such arguments and reject them; and they make great efforts to preserve their current beliefs in the face of evidence. And they say these beliefs are very important to them.

There may well be methods of "converting" them anyway, but how are these methods ethically or practically different from "forcibly changing their minds" or their values? And if you're OK with forcibly changing their minds, why do you think that's ethically better than just ignoring them and building a partial-CEV that only extrapolates your own wishes and those of people similar to yourself?

Comment author: gRR 19 May 2012 04:29:22PM 1 point [-]

how are these methods ethically or practically different from "forcibly changing their minds" or their values?

I (and CEV) do not propose changing their minds or their values. What happens is that their current values (as modeled within FAI) get corrected in the presence of truer knowledge and lots of intelligence, and these corrected values are used for guiding the FAI.

If someone's mind & values are so closed as to be unextrapolateable - completely incompatible with truth - then I'm ok with ignoring these particular persons. But I don't believe there are actually any such people.

Comment author: DanArmak 19 May 2012 05:03:57PM 0 points [-]

I (and CEV) do not propose changing their minds or their values. What happens is that their current values (as modeled within FAI) get corrected in the presence of truer knowledge and lots of intelligence, and these corrected values are used for guiding the FAI.

So the future is built to optimize different values. And their original values aren't changed. Wouldn't they suffer living in such a future?

Comment author: thomblake 18 May 2012 01:44:31PM 0 points [-]

If it optimizes this (saving life) and otherwise interferes the least, it will already have done excellently.

I think the standard sort of response for this is The Hidden Complexity of Wishes. Just off the top of my (non-superintelligent) head, the AI could notice a method for near-perfect continuation of life by preserving some bacteria at the cost of all other life forms.

Comment author: gRR 18 May 2012 02:19:13PM 0 points [-]

I did not mean the comment that literally. Dropped too many steps for brevity, thought they were clear, I apologize.

It would be just as impossible (or even more impossible) to convince people that total obliteration of people is a good thing. On the other hand, people don't care much about bacteria, even whole species of them, and as long as a few specimens remain in laboratories, people will be ok about the rest being obliterated.

Comment author: thomblake 18 May 2012 02:30:26PM 4 points [-]

It would be just as impossible (or even more impossible) to convince people that total obliteration of people is a good thing.

There are lots of people who do think that's a good thing, and I don't think those people are trolling or particularly insane. There are entire communities where people have sterilized themselves as part of a mission to end humanity (for the sake of Nature, or whatever).

Comment author: gRR 18 May 2012 02:43:51PM *  0 points [-]

I think those people do have insufficient knowledge and intelligence. For example, the Skoptsy sect, who believed they followed God's will, were, presumably, factually wrong. And people who want to end humanity for the sake of Nature want that instrumentally - because they believe that otherwise Nature will be destroyed. Assuming FAI is created, this belief is also probably wrong.

You're right in there being people who would place "all non-intelligent life" before "all people", if there was such a choice. But that does not mean they would choose "non-intelligent life" before "non-intelligent life + people".

Comment author: TheOtherDave 18 May 2012 03:14:11PM 3 points [-]

people who want to end humanity for the sake of Nature want that instrumentally - because they believe that otherwise Nature will be destroyed. Assuming FAI is created, this belief is also probably wrong.

That depends a lot on what I understand Nature to be.
If Nature is something incompatible with artificial structuring, then as soon as a superhuman optimizing system structures my environment, Nature has been destroyed... no matter how many trees and flowers and so forth are left.

Personally, I think caring about Nature as something independent of "trees and flowers and so forth" is kind of goofy, but there do seem to be people who care about that sort of thing.

Comment author: [deleted] 18 May 2012 10:31:52PM 3 points [-]

What if particular arrangements of flowers, trees and so forth are complex and interconnected, in ways that can be undone to the net detriment of said flowers, trees and so forth? Thinking here of attempts at scientifically "managing" forest resources in Germany with the goal of making them as accessible and productive as possible. The resulting tree farms were far less resistant to disease, climatic aberration, and so on, and generally not very healthy, because it turns out that the illegible, sloppy factor that made forests seem less conveniently organized for human uses was a non-negligible portion of what allowed them to be so productive and robust in the first place.

No individual tree or flower is all that important, but the arrangement is, and you can easily destroy it without necessarily destroying any particular tree or flower. I'm not sure what to call this, and it's definitely not independent of the trees and flowers and so forth, but it can be destroyed to the concrete and demonstrable detriment of what's left.

Comment author: Nornagest 18 May 2012 10:47:25PM *  2 points [-]

That's an interesting question, actually.

I don't know forestry from my elbow, but I used to read a blog by someone who was pretty into saltwater fish tanks. Now, one property of these tanks is that they're really sensitive to a bunch of feedback loops that can most easily be stabilized by approximating a wild reef environment; if you get the lighting or the chemical balance of the water wrong, or if you don't get a well-balanced polyculture of fish and corals and random invertebrates going, the whole system has a tendency to go out of whack and die.

This can be managed to some extent with active modification of the tank, and the health of your tank can be described in terms of how often you need to tweak it. Supposing you get the balance just right, so that you only need to provide the right energy inputs and your tank will live forever: is that Nature? It certainly seems to have the factors that your ersatz German forest lacks, but it's still basically two hundred enclosed gallons of salt water hooked up to an aeration system.

Comment author: NancyLebovitz 20 May 2012 07:32:53AM 1 point [-]

That's something like my objection to CEV-- I currently believe that some fraction of important knowledge is gained by blundering around and (or?) that the universe is very much more complex than any possible theory about it.

This means that you can't fully know what your improved (by what standard?) self is going to be like.

Comment author: TheOtherDave 18 May 2012 11:39:19PM 0 points [-]

I'm not quite sure what you mean to ask by the question. If maintaining a particular arrangement of flowers, trees and so forth significantly helps preserve their health relative to other things I might do, and I value their health, then I ought to maintain that arrangement.

Comment author: [deleted] 18 May 2012 10:24:23PM 1 point [-]

Assuming FAI is created, this belief is also probably wrong.

Not that I'm a proponent of voluntary human extinction, but that's an awfully big conditional.

Comment author: Dolores1984 18 May 2012 10:39:18PM 5 points [-]

It's not even strictly true. It's entirely conceivable that FAI will lead to the Sol system being converted into a big block of computronium to run human brain simulations. Even if those simulations have trees and animals in them, I think that still counts as the destruction of nature.

Comment author: gRR 19 May 2012 12:27:54AM *  2 points [-]

But if FAI is based on CEV, then this will only happen if it is the extrapolated wish of everybody. Assuming the existence of people who truly (even after extrapolation) value Nature in its original form, such computronium won't be forcibly built.

Comment author: gRR 19 May 2012 12:24:01AM 0 points [-]

But it's the only relevant one, when we're talking about CEV. CEV is only useful if FAI is created, so we can take it for granted.

Comment author: Cyan 20 May 2012 01:19:57AM 1 point [-]

I did not mean the comment that literally. I dropped too many steps for brevity and thought they were clear; I apologize.

Ah, the FAI problem in a nutshell.

Comment author: Nick_Beckstead 18 May 2012 02:26:07AM *  3 points [-]

This link

Values vs. parameters: Eliezer has suggested using...

is broken.

Comment author: PhilGoetz 18 May 2012 03:14:14AM 0 points [-]

Fixed; thanks.

Comment author: timtyler 19 May 2012 12:30:25AM 2 points [-]

I wanted to write about my opinion that human values can't be divided into final values and instrumental values, the way discussion of FAI presumes they can. This is an idea that comes from mathematics, symbolic logic, and classical AI. A symbolic approach would probably make proving safety easier. But human brains don't work that way. You can and do change your values over time, because you don't really have terminal values.

You may have wanted to - but AFAICS, you didn't - apart from this paragraph. It seems to me that it fails to make its case. The split applies to any goal-directed agent, irrespective of implementation details.

Comment author: Nornagest 18 May 2012 05:44:48PM 1 point [-]

The link to your group selection update seems broken. Looks like it's got an extra lesswrong.com/ in it.

Comment author: PhilGoetz 20 May 2012 02:34:09AM 0 points [-]

Thanks; fixed.

Comment author: Mitchell_Porter 18 May 2012 03:55:54AM 1 point [-]

Do you think an AI reasoning about ethics would be capable of coming to your conclusions? And what "superintelligence policy" do you think it would recommend?

Comment author: PhilGoetz 18 May 2012 04:33:49AM *  1 point [-]

I'm pretty sure that FAI+CEV is supposed to prevent exactly this scenario, in which an AI is allowed to come to its own, non-foreordained conclusions.

Comment author: thomblake 18 May 2012 01:08:07PM 2 points [-]

FAI is supposed to come to whatever conclusions we would like it to come to (if we knew better etc.). It's not supposed to specify the whole of human value ahead of time, it's supposed to ensure that the FAI extrapolates the right stuff.

Comment author: Benvolio 02 July 2012 05:07:32AM 0 points [-]

I'm not sure if this is appropriate, but like the original author I am unsure whether a CEV is something that can be expressed in formal logic, even if the brain were fully mapped into a virtual environment. A lot of how we craft our values is based on complex environmental factors that are not easily modeled. Please read Schnall's "Disgust as Embodied Moral Judgment" or J. Greene's fMRI investigation of emotional engagement in moral judgment. Our values are fluid and non-hierarchical. Developing values that have a strict hierarchy, as the OP says, can lead to systems which cannot change.

Comment author: Monkeymind 30 May 2012 05:11:29PM 0 points [-]

If the evolutionary process results in either convergence, divergence or extinction, and most often results in extinction, what reason(s) do I have to think that this 23rd emerging complex homo will not go the way of extinction also? Are we throwing all our hope towards superintelligence as our salvation?

Comment author: private_messaging 27 May 2012 03:22:09PM *  0 points [-]

The much stronger issue he raised is that, outside imagination and fiction, there may be no monolithic 'intelligence' thing. 'Benevolent ruler of the earth' software would then be more dangerous than, e.g., software that uses search and hill climbing to design better microchips or cures for diseases, without being 'intelligent' in the science-fictional sense and while lacking any form of real-world volition. The "benevolent ruler of the earth" software would then also fail to provide any superior technical solutions to our problems, as this "intelligence" brings no important advantage over the algorithms normally used for problem solving.

The chip improver would spit out the blueprints, the cure designer would spit out the projected molecular images and DNA sequences, etc - no oracle crap with the 'utility' of making people understand something, which appears both near-impossible and entirely unnecessary.

Comment author: CuSithBell 27 May 2012 03:41:52PM 1 point [-]

Outside of mystic circles, it is fairly uncontroversial that it is in principle possible to construct out of matter an object capable of general intelligence. Proof is left to the reader.

Comment author: Monkeymind 24 May 2012 12:42:00PM *  0 points [-]

Humans have a values hierarchy. Trouble is, most do not even know what it is (or what they are). IOW, for me honesty is one of the most important values to have. Also, sanctity of (and protection of) life is very high on the list. I would lie in a second to save my son's life. Some choices like that are no-brainers; however, few people know all the values that they live by, let alone the hierarchy. Often humans only discover what these values are as they find themselves in various situations.

Just wondering... has anyone compiled a list of these values, morals, ethics... and applied them to various real-life situations to study the possible 'choices' an AI has and the potential outcomes with differing hierarchies?

ADDED: Sometimes humans know the right thing but choose to do something else. Isn't that because of emotion? If so, what part does emotion play in superintelligence?

Comment author: FinalState 23 May 2012 03:31:45PM *  0 points [-]

EDIT: To simplify my thoughts: getting a General Intelligence Algorithm (GIA) instance to do anything requires masterful manipulation of parameters, with full knowledge of how it will generally behave as a result. That amounts to an understanding of the psychology of all intelligent (and sub-intelligent) behavior. It is not feasible that someone would accidentally program something that would become an evil mastermind. GIA instances could easily be made to behave in a passive manner even when given affordances and outputs, kind of like a person who is happy to assist in any way possible because they are generally warm, or high, or something.

You can define the most important elements of human values for a GIA instance, because most of human values are a direct logical consequence of something that cannot be separated from the GIA. I.e., if a general motivation X accidentally drove intelligence (see: Orthogonality Thesis) and it also drove positive human values, then positive human values would be unavoidable. It is true that the specifics of body and environment drive some specific human values, but those are just side effects of X in that environment and X in different environments only changes so much and in predictable ways.

You can directly implant knowledge/reasoning into a GIA instance. The easiest way to do this is to train one under very controlled circumstances, and then copy the pattern. This reasoning would then condition the GIA instance's interpretation of future input. However, under conditions which directly disprove the value of that reasoning in obtaining X the GIA instance would un-integrate that pattern and reintegrate a new one. This can be influenced with parameter weights.

I suppose this could be a concern regarding the potential generation of an anger instinct. That HEAVILY depends on all the parameters, however, and on any outputs given to the GIA instance. Also, robots and computers do not have to eat, and have no instincts associated with killing things in order to do so... nor do they have reproductive instincts.

Comment author: thomblake 23 May 2012 05:13:55PM 1 point [-]

It is true that the specifics of body and environment drive some specific human values, but those are just side effects of X in that environment and X in different environments only changes so much and in predictable ways.

When you say "predictable", do you mean in principle or actually predictable?

That is, are you claiming that you can predict what any human values, given their environment, and furthermore that the environment can be easily and compactly specified?

Can you give an example?

Comment author: FinalState 24 May 2012 11:53:09AM 0 points [-]

Mathematically predictable, but somewhat intractable without a faster-running version of the instance with the same frequency of input. Or predictable within the ranges of some general rule.

Or just generally predictable with the level of understanding afforded to someone capable of making one in the first place, who could, for instance, describe the cause of just about any human psychological "disorder".

Comment author: TimS 23 May 2012 03:56:33PM 1 point [-]

Name three values all agents must have, and explain why they must have them.

Comment author: thomblake 18 May 2012 01:05:20PM 0 points [-]

This argues that the qualia and values we have now are only the beginning of those that could evolve in the universe, and that ensuring that we maximize human values - or any existing value set - from now on, will stop this process in its tracks, and prevent anything better from ever evolving.

This is unhelpfully circular. While it's not logically impossible for us to value values that we don't have, it's surely counterintuitive. What makes future values better?

Comment author: PhilGoetz 19 May 2012 12:36:03AM *  2 points [-]

I look at the past, and see that the dominant life forms have grown more complex and more interesting, and I expect this trend to continue. The best guide I have to what future life-forms will be like compared to me, if allowed to evolve naturally, is to consider what I am like compared to a fruit fly, or to bacteria.

If you object that of course I will value myself more highly than I value a bacterium, and that I fail to adequately respect bacterial values, I can compare an algae to an oak tree. The algae is more closely-related to me; yet I still consider the oak tree a grander life form, and would rather see a world with algae and oak trees than one with only algae.

(It's also possible that life does not naturally progress indefinitely, but that developing intelligence and societies inevitably leads to collapse and extinction. That would be an argument in favor of FAI, but it's a little farther down the road from where our thoughts are so far, I think.)

If you like, I can say that I value complexity, and then build an FAI that maximizes some complexity measure. That's what I meant when I said that I object less to FAI if you go meta. I know that some people in SIAI give this response, that I am not going meta enough if I'm not happy with FAI; but in their writings and discussions other than when dealing with that particular argument, they don't usually go that meta. Seriously adopting that view would result in discussions of what our high-level values really are, which I have not seen.

My attitude is: the universe was doing amazingly well before I got here; instead of trusting myself to do some incredibly complex philosophical work error-free, I should try to help it keep on doing what it's been doing, and just help it avoid getting trapped in a local maximum. Whereas the entire purpose of FAI is to trap the universe in a local maximum.

Comment author: Wei_Dai 19 May 2012 01:32:43AM 5 points [-]

Would it be fair to say that your philosophy is similar to davidad's? Both of you seem to ultimately value some hard-to-define measure of complexity. He thinks the best way to maximize complexity is to develop technology, whereas you think the best way is to preserve evolution.

I think that evolution will lead to a local maximum of complexity, which we can't "help" it avoid. The reason is that the universe contains many environmental niches that are essentially duplicates of each other, leading to convergent evolution. For example Earth contains lots of species that are similar to each other, and within each species there's huge amounts of redundancy. Evolution creates complexity, but not remotely close to maximum complexity. Imagine if each individual plant/animal had a radically different design, which would be possible if they weren't constrained by "survival of the fittest".

Whereas the entire purpose of FAI is to trap the universe in a local maximum.

Huh? The purpose of FAI is to achieve the global maximum of whatever utility function we give it. If that utility function contains a term for "complexity", which seems plausible given people like you and davidad (and even I'd probably prefer greater complexity to less, all else being equal), then it ought to at least get somewhat close to the global complexity maximum (since the constraint of simultaneously trying to maximize other values doesn't seem too burdensome, unless there are people who actively disvalue complexity).
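To make the aggregation point concrete, here is a minimal sketch with entirely made-up futures, scores, and per-person weights (none of this comes from CEV or from anyone in this thread): as long as nobody's weights actively penalize complexity, the winner of a simple weighted-sum aggregation tends to land near the top of the complexity scale without trampling the other values.

```python
# Toy sketch with entirely made-up futures, scores, and per-person weights;
# nothing here comes from CEV or from this thread. Each future gets a score
# in [0, 1] on three value dimensions: (complexity, happiness, nature).
futures = {
    "computronium_only":   (0.95, 0.90, 0.00),
    "wilderness_preserve": (0.80, 0.85, 0.95),
    "status_quo":          (0.50, 0.50, 0.60),
    "grey_goo":            (0.05, 0.00, 0.00),
}

# Hypothetical per-person weights over (complexity, happiness, nature).
# Note that nobody here actively disvalues complexity (no negative weight).
people = [
    (0.2, 0.6, 0.2),   # mostly cares about happiness
    (0.1, 0.3, 0.6),   # cares a lot about nature
    (0.5, 0.4, 0.1),   # a complexity enthusiast
]

def aggregate_utility(scores):
    # simple weighted-sum aggregation across everyone's weights
    return sum(sum(w * s for w, s in zip(weights, scores)) for weights in people)

best = max(futures, key=lambda name: aggregate_utility(futures[name]))
print(best)  # "wilderness_preserve" with these numbers: high complexity, other values kept
```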

Comment author: [deleted] 24 May 2012 05:47:45AM *  2 points [-]

The reason is that the universe contains many environmental niches that are essentially duplicates of each other, leading to convergent evolution. For example Earth contains lots of species that are similar to each other, and within each species there's huge amounts of redundancy.

There's often a deceptive amount of difference, some of it very fundamental, hiding inside those convergent similarities, and that's because "convergent evolution" is in the eye of the beholder, and mostly restricted to surface-level analogies between some basic functions.

Consider pangolins and echidnas. Pretty much the same, right? Oh sure, one's built on a placental framework and the other a monotreme one, but they've developed the same basic tools: long tongues, powerful digging claws, keratinous spines/sharp plates... not much scope for variance there, at least not of a sort that'd interest a lay person, surely.

Well, actually they're quite different. It's not just that echidnas lay eggs and pangolins birth live young, or that pangolins tend to climb trees and echidnas tend to burrow. Echidnas have more going on upstairs, so to speak -- their brains are about 50% neocortex (compare 30% for a human) and they are notoriously clever. Among people who work with wild populations they're known for being basically impossible to trap, even when appropriate bait can be set up. In at least one case a researcher who'd captured several (you essentially have to grab them when you find them) left them in a cage they couldn't dig out of, only to find in the morning they'd stacked up their water dishes and climbed out the top. There is evidence that they communicate infrasonically in a manner similar to elephants, and they are known to be sensitive to electricity.

My point here isn't "Echidnas are awesome!", my point is that the richness of behavior and intelligence that they display is not mirrored in pangolins, who share the same niche and many convergent adaptations. To a person with no more than a passing familiarity, they'd be hard to distinguish on a functional level since their most obvious, surface-visible traits are very similar and the differences seem minor. If you get an in-depth look at them, they're quite different, and the significance of those "convergent" traits diminishes in the face of much more salient differences between the two groups of animals.

Short version: superficial similarities are very often only that, especially in the world of biology. Often they do have some inferential value, but there are limits on that.

Comment author: PhilGoetz 20 May 2012 02:45:53AM *  0 points [-]

Evolution creates complexity, but not remotely close to maximum complexity. Imagine if each individual plant/animal had a radically different design, which would be possible if they weren't constrained by "survival of the fittest".

This is true; but I favor systems that can evolve, because they are evolutionarily stable. Systems that aren't are likely to be unstable and vulnerable to collapse, and typically have the ethically undesirable property of punishing "virtuous behavior" within that system.

Huh? The purpose of FAI is to achieve the global maximum of whatever utility function we give it.

True. I spoke imprecisely. Life is increasing in complexity, in a meaningful way that is not the same as the negative of entropy, and which I feel comfortable calling "progress" despite Stephen Jay Gould's strident imposition of his sociological agenda onto biology. This is the thing I'm talking about maximizing. Whatever utility function an FAI is given, it's only going to involve concepts that we already have, which represent a small fraction of possible concepts; and so it's not going to keep increasing as much in that way.

Comment author: DanArmak 19 May 2012 03:25:15PM *  3 points [-]

The best guide I have to what future life-forms will be like compared to me, if allowed to evolve naturally, is to consider what I am like compared to a fruit fly, or to bacteria.

This is true but not relevant. It suggests that future life forms will be much more complex, intelligent, powerful in changing the physical universe on many scales, good at out-competing (or predating on) other species to the point of driving them to extinction. You might also add differences between yourself and flies (and bacteria) like "future life forms will be a lot bigger and longer-lived", or you might consider those incidental because you don't value them as much.

But none of that implies anything about the future life-forms' values, except that they will be selfish to the exclusion of other species which are not useful or beautiful to them, so that old-style humans will be endangered. It doesn't imply anything that would cause me to expect to value these future species more than I value today's nonhuman species, let alone today's humans.

If you object that of course I will value myself more highly than I value a bacterium, and that I fail to adequately respect bacterial values, I can compare an algae to an oak tree. The algae is more closely-related to me; yet I still consider the oak tree a grander life form, and would rather see a world with algae and oak trees than one with only algae.

So you value other life-forms proportionally to how similar they are to you, and an important component of that is some measure of complexity, plus your sense of aesthetics (grandeur). You don't value evolutionary relatedness highly. I feel the same way (I value a cat much more than a bat (edit: or rat)), but so what? I don't see how this logically implies that new lifeforms that will exist in the future, and their new values, are more likely than not to be valued by us (if we live long enough to see them).

It's also possible that life does not naturally progress indefinitely

Life may keep changing indefinitely, barring a total extinction. But that constant change isn't "progress" by any fixed set of values because evolution has no long term goal.

Apart from the nonexistence of humans, who are unique in their intelligence/self-consciousness/tool-use/etc., life on Earth was apparently just as diverse and grand and beautiful hundreds of millions of years ago as it is today. There's been a lot of change, but no progress in terms of complexity before the very quick evolution of humans. If I were to choose between this world, and a world with humans but otherwise the species of 10, 100, or 300 million years ago, I don't feel that today's biosphere is somehow better. So I don't feel a hypothetical biosphere of 300 million years in the future would likely be better than today's on my existing values. And I don't understand why you do.

If you like, I can say that I value complexity

Do you really value complexity for its own sake? Or do you value it for the sake of the outcomes (such as intelligence) which it helps produce?

If you are offered prosthetic arms that look and feel just like human ones but work much better in many respects, you might accept them or not, but I doubt the ground for your objection would be that the biological version is much more complex.

build an FAI that maximizes some complexity measure.

Could you explain what kind of complexity measure you have in mind? For instance, info-theoretical complexity (~ entropy) is maximized by a black hole, and is greatly increased just by a good random number generator. Surely that's not what you mean.
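To illustrate why entropy-like measures reward the wrong thing, here is a rough sketch (a toy of mine, not a proposed complexity measure) using compressed size as a crude stand-in for information content: pure noise comes out as the "most complex" input even though it is the least interesting one.

```python
# Crude illustration: compressed size as a rough proxy for information content.
# Pure noise is essentially incompressible, so an entropy-style measure ranks
# it above anything with structure. This is a toy, not a proposed measure.
import os
import random
import zlib

random.seed(0)

repetitive  = b"AB" * 50_000                              # near-zero entropy
low_entropy = bytes(random.choices(b"ACGT", k=100_000))   # ~2 bits per byte
pure_noise  = os.urandom(100_000)                         # ~8 bits per byte

for name, data in [("repetitive", repetitive),
                   ("4-letter random", low_entropy),
                   ("pure noise", pure_noise)]:
    ratio = len(zlib.compress(data, 9)) / len(data)
    print(f"{name:>16}: compressed to {ratio:.1%} of original size")

# Roughly: the repetitive data shrinks to well under 1%, the 4-letter data to
# around a quarter to a third, and the noise stays at about 100%.
```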

Comment author: army1987 19 May 2012 03:41:45PM *  3 points [-]

(I value a cat much more than a bat)

<nitpick level="extreme">Bats are no longer thought to be that closely related to us. In particular, cats and bats are both Laurasiathera, whereas we are Euarchontoglires. On the other hand, mice are Euarchontoglires too.</nitpick>

Apart from the nonexistence of humans, who are unique in their intelligence/self-consciousness/tool-use/etc., life on Earth was apparently just as diverse and grand and beautiful hundreds of millions of years ago as it is today.

<nitpick level="even more extreme">You might want to reduce that number by an order of magnitude. See http://en.wikipedia.org/wiki/Timeline_of_evolutionary_history_of_life </nitpick>

Comment author: DanArmak 19 May 2012 03:54:38PM 0 points [-]

Bats are no longer thought to be that closely related to us.

Thanks! I appreciate this updating of my trivial knowledge.

Will change to: I value a cat much more than a rat.

You might want to reduce that number by an order of magnitude.

I meant times as old as, say, 200-300 Mya. The End-Permian extinction sits rather unfortunately right in the middle of that, but I think both before it and after sufficient recovery (say 200 Mya) there was plenty of diversity of beauty around.

No cats, though.

Comment author: army1987 19 May 2012 05:14:18PM 0 points [-]

Will change to: I value a cat much more than a rat.

Yeah, it hadn't occurred to me to try and preserve the rhyme! :-)

Comment author: DanArmak 19 May 2012 05:22:17PM 1 point [-]

Is there a blog or other net news source you'd recommend for learning about changes like "we're no longer closely related to bats, we're really something-something-glires"? They seem to be coming more and more frequently lately.

Comment author: army1987 19 May 2012 09:43:15PM *  1 point [-]

I just browse aimlessly around Wikipedia when I'm bored, and a couple months ago I ended up reading about the taxonomy of pretty much any major vertebrate group. (I've also stumbled upon http://3lbmonkeybrain.blogspot.it/, but it doesn't seem to be updated terribly often these days.)

Comment author: PhilGoetz 20 May 2012 02:51:01AM *  0 points [-]

I don't think you're getting what I'm saying. Let me state it in FAI-type terms:

I have already figured out my values precisely enough to implement my own preferred FAI: I want evolution to continue. If we put that value into an FAI, then, okay.

But the lines that people always try to think along are instead to enumerate values like "happiness", "love", "physical pleasure", and so forth.

Building an FAI to maximize values defined at that level of abstraction would be a disaster. Building an FAI to maximize values at the higher level of abstraction would be kind of pointless, since the universe is already doing that anyway, and our FAI is more likely to screw it up than to save it.

Could you explain what kind of complexity measure you have in mind? For instance, info-theoretical complexity (~ entropy) is maximized by a black hole, and is greatly increased just by a good random number generator. Surely that's not what you mean.

People have dealt with this enough that I don't think you're really objecting that what I'm saying is unclear; you're objecting that I don't have a mathematical definition of it. True. But pointing to evolution as an example suffices to show that I'm talking about something sensible and real. Evolution increases some measure of complexity, and not randomness.

Comment author: TheOtherDave 20 May 2012 04:35:11AM 2 points [-]

I have already figured out my values precisely enough to implement my own preferred FAI: I want evolution to continue. If we put that value into an FAI, then, okay.

So, I kind of infer from what you've said elsewhere that you don't endorse all possible evolutions equally. That is, when you say "evolution continues" you mean something rather more specific than that... continuing in a particular direction, leading to greater and greater amounts of whatever-it-is-that-evolution-currently-optimizes-for (this "complexity measure" cited above), rather than greater and greater amounts of anything else.

And I kind of infer that the reason you prefer that is because it has historically done better at producing results you endorse than any human-engineered process has or could reasonably be expected to have, and you see no reason to expect that state to change; therefore you expect that for the foreseeable future the process of evolution will continue to produce results that you endorse, or at least that you would endorse, or at the very least that you ought to endorse.

Did I get that right?

Are you actually saying that simpler systems don't ever evolve from more complex ones? Or merely that when that happens, the evolutionary process that led to it isn't the kind of evolutionary process you're endorsing here? Or something else?

Comment author: PhilGoetz 22 May 2012 03:35:27AM 0 points [-]

So, I kind of infer from what you've said elsewhere that you don't endorse all possible evolutions equally. That is, when you say "evolution continues" you mean something rather more specific than that... continuing in a particular direction, leading to greater and greater amounts of whatever-it-is-that-evolution-currently-optimizes-for (this "complexity measure" cited above), rather than greater and greater amounts of anything else.

I don't understand your distinction between "all possible evolutions" and "whatever-it-is-that-evolution-currently-optimizes-for". There are possible courses of evolution that I don't think I would like, such as universes in which intelligence were eliminated. When thinking about how to optimize the future, I think of probability distributions.

And I kind of infer that the reason you prefer that is because it has historically done better at producing results you endorse than any human-engineered process has or could reasonably be expected to have, and you see no reason to expect that state to change; therefore you expect that for the foreseeable future the process of evolution will continue to produce results that you endorse, or at least that you would endorse, or at the very least that you ought to endorse.

Yes! Though I would say, "it has historically done better at producing results I endorse, starting from point X, than any process engineered by organisms existing at point X could reasonably be expected to have."

Are you actually saying that simpler systems don't ever evolve from more complex ones?

No. It happens all the time. The simplest systems, viruses and mycoplasmas, can exist only when embedded in more complex systems - although maybe they don't count as systems for that reason. OTOH, there must have been life forms even simpler at one time, and we see no evidence of them now. For some reason the lower bound on possible life complexity has increased over time - possibly just once, a long time ago.

Or merely that when that happens, the evolutionary process that led to it isn't the kind of evolutionary process you're endorsing here? Or something else?

Two "something else" options are (A) merely widening the distribution, without increasing average complexity, would be more interesting to me, and (B) simple organisms appear to be necessary parts of a complex ecosystem, perhaps like simple components are necessary parts of a complex machine.

Comment author: TheOtherDave 22 May 2012 03:48:34AM *  1 point [-]

I think I see... so it's not the complexity of individual organisms that you value, necessarily, but rather the overall complexity of the biosphere? That is, if system A grows simpler over time and system B grows more complex, it's not that you value the process that leads to B but not the process that leads to A, but rather that you value the process that leads to (A and B). Yes?

Edit: er, I got my As and Bs reversed. Fixed.

Comment author: Ghatanathoah 30 May 2012 04:05:46PM -1 points [-]

I have a few more objections I didn't cover in my last comment because I hadn't thoroughly thought them out yet.

Those of you who are operating under the assumption that we are maximizing a utility function with evolved terminal goals, should I think admit these terminal goals all involve either ourselves, or our genes.

No, these terminal goals can also involve other people and the state of the world, even if they are evolved. There are several reasons human consciousnesses might have evolved goals that do not involve themselves or their genes. The most obvious one is that an entity that values only itself and its genes is far less trustworthy than one that values other people as ends in themselves, and hence would have difficulty getting other entities to engage in positive-sum games with it. Evolving to value other people makes it possible for other people who might prove useful trading partners to trust the agent in question, since they know it won't betray them the instant they have outlived their usefulness.

Another obvious one is kin selection. Evolution metaphorically "wants" us to value our relatives since they share some of our genes. But rather than waste energy developing some complex adaptation to determine how many genes you share with someone, it took a simpler route and just made us value people we grew up around.

And no, the fact that I know my altruism and love for others evolved for game theoretic reasons does not make it any less wonderful and any less morally right.
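A toy partner-choice simulation of that game-theoretic point (my own construction, with made-up payoffs and a hypothetical "fake rate"): agents whose disposition is genuinely other-regarding are the ones others consent to deal with, so they collect the positive-sum surplus, while purely selfish agents mostly sit out and only occasionally profit by passing as caring.

```python
# Toy partner-choice model (made-up payoffs, not from this thread): deals are
# positive-sum but require mutual consent, and consent goes only to partners
# whose disposition looks genuinely other-regarding.
import random

random.seed(0)

DEAL, TEMPTATION, SUCKER = 3, 5, -2   # mutual-deal payoff vs. one-shot betrayal payoffs
FAKE_RATE = 0.05                      # chance a selfish agent passes as caring

def play(a, b, scores):
    def looks_caring(x):
        return x["caring"] or random.random() < FAKE_RATE
    # a deal needs mutual consent; each side only consents if the other side
    # looks like it genuinely values its partners
    if not (looks_caring(a) and looks_caring(b)):
        return
    for me, other in ((a, b), (b, a)):
        if me["caring"] and other["caring"]:
            scores[me["name"]] += DEAL        # genuine positive-sum trade
        elif me["caring"]:
            scores[me["name"]] += SUCKER      # trusted a successful faker
        else:
            scores[me["name"]] += TEMPTATION  # exploited a consenting partner

agents = ([{"name": f"caring_{i}", "caring": True} for i in range(10)]
          + [{"name": f"selfish_{i}", "caring": False} for i in range(10)])
scores = {a["name"]: 0 for a in agents}

for _ in range(2000):
    random.shuffle(agents)
    for i in range(0, len(agents), 2):
        play(agents[i], agents[i + 1], scores)

caring = [scores[a["name"]] for a in agents if a["caring"]]
selfish = [scores[a["name"]] for a in agents if not a["caring"]]
print("average caring payoff: ", sum(caring) / len(caring))
print("average selfish payoff:", sum(selfish) / len(selfish))
```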

If they involve our genes, then they are goals that our bodies are pursuing, that we call errors, not goals, when we the conscious agent inside our bodies evaluate them.

Again, it is quite possible for a conscious agent to value things other than itself, but not value the goals of evolution or its genes. There are many errors that our bodies make that occur because they involve our genes, not our real goals. But valuing other people and the future is not one of them, it is an intrinsic part of the makeup of the conscious agent part.

Averaging value systems is worse than choosing one.... The point is that the CEV plan of "averaging together" human values will result in a set of values that is worse (more self-contradictory) than any of the value systems it was derived from.

Alan Carter, who is rapidly becoming my favorite living philosopher, explains here how it is quite possible to have a pluralistic metaethics without being incoherent. His main argument is that as long as you hold values to be incremental rather than absolute, it is possible to trade them off against one another without being incoherent.
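A toy rendering of that incremental-versus-absolute contrast (my own sketch with hypothetical options and weights, not Carter's formalism): treating every value as an absolute rule can leave no admissible option in a conflict, while graded values still yield a coherent ranking.

```python
# Hypothetical options scored per value in [0, 1]; 1.0 means fully respected.
options = {
    "tell the truth":        {"honesty": 1.0, "protect_life": 0.1},
    "lie to save the child": {"honesty": 0.2, "protect_life": 1.0},
}

def absolute_choice(opts, threshold=0.9):
    # every value treated as an inviolable rule: an option is admissible only
    # if it (near-)fully respects all of them
    return [name for name, scores in opts.items()
            if all(s >= threshold for s in scores.values())]

def incremental_choice(opts, weights):
    # each value contributes by degree, so conflicting options still get a
    # coherent ranking instead of all being ruled out
    return max(opts, key=lambda name: sum(weights[v] * s
                                          for v, s in opts[name].items()))

print(absolute_choice(options))    # prints []: no option satisfies every absolute value
print(incremental_choice(options, {"honesty": 0.4, "protect_life": 0.6}))
```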