CCC comments on Wanting to Want - Less Wrong

16 Post author: Alicorn 16 May 2009 03:08AM


Comments (185)


Comment author: CCC 29 October 2012 10:14:23AM *  1 point

That seems like a reasonable definition; my point is that not everyone uses the same equation.

That's true; the question is how often this is because people have totally different values, and how often it is that they have extremely similar "ideal equations" but different "approximations" of what they think that equation is. I think for sociopaths and other people with harmful ego-syntonic mental disorders it's probably the former, but it's more often the latter for normal people.

I'd say sometimes A, and sometimes B. But I think that's true even in the absence of mental disorders; I don't think that the "ideal equation" necessarily sits somewhere hidden in the human psyche.

It sounds to me like Fred and Marvin both care about achieving similar moral objectives, but have different ideas about how to go about it. I'd say that again, which moral code is better can only be determined by trying to figure out which one actually does a better job of achieving moral goals. "Moral progress" can be regarded as finding better and better heuristics to achieve those moral goals, and finding a closer representation of the ideal equation.

That is valid, as long as both systems have the same goals. Marvin's system includes the explicit goal "stay alive", more heavily weighted than the goal "keep a stranger alive"; Fred's system explicitly entirely excludes the goal "stay alive".

If two moral systems agree both on the goals to be achieved, and the weightings to give those goals, then they will be the same moral system, yes. But two people's moral systems need not agree on the underlying goals.
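As a rough illustration of the point above (a sketch only; the goal names, weights, and the `score` helper are made-up assumptions, not anything taken from this thread), a moral system can be modelled as a mapping from goals to weights, with an action scored as a weighted sum. Two systems are then the same system exactly when both their goals and their weights match:

```python
# Hypothetical sketch: a "moral system" as a mapping from goals to weights.
# Goal names and numbers are illustrative assumptions only.

def score(system, outcome):
    """Weighted sum of how much an outcome advances each goal the system holds."""
    return sum(weight * outcome.get(goal, 0.0) for goal, weight in system.items())

# Marvin weights "stay_alive" more heavily than "keep_stranger_alive".
marvin = {"stay_alive": 2.0, "keep_stranger_alive": 1.0}
# Fred's system excludes "stay_alive" entirely.
fred = {"keep_stranger_alive": 1.0}

# An action that saves a stranger at the cost of the actor's own life.
sacrifice = {"stay_alive": -1.0, "keep_stranger_alive": 1.0}

marvin_score = score(marvin, sacrifice)  # 2.0 * -1.0 + 1.0 * 1.0
fred_score = score(fred, sacrifice)      # 1.0 * 1.0

# Same goals and same weights would mean the same system; here they differ.
same_system = marvin == fred
```

Under these assumed numbers the two systems rank the same action differently, which is the sense in which they disagree about goals rather than about how to reach them.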

Again, I think I agree with Eliezer that a truly alien code of behavior, like that exhibited by sociopaths, and really inhuman aliens like the Pebblesorters or paperclippers, should maybe be referred to by some word other than morality. This is because the word "morality" usually refers to doing things like making the world a happier place and increasing the positive things in life.

Well, to be fair, in a Paperclipper's mind, paperclips are the positive things in life, and they certainly make the paperclipper happier. I realise that's probably not what you intended, but the phrasing may need work.

Which really feeds into the question of what goals a moral system should have. To the Babyeaters, a moral system should have the goal of eating babies, and they can provide a lot of argument to support that point - in terms of improved evolutionary fitness, for example.

I think that we can agree that a moral system's goals should be the good things in life. I'm less certain that we can agree on what those good things necessarily are, or on how they should be combined relative to each other. (I expect that if we really go to the point of thoroughly dissecting what we consider to be the good things in life, then we'll agree more than we disagree; I expect we'll be over 95% in agreement, but not quite 100%. This is what I generally expect for any stranger).

For example, we might disagree on whether it is more important to be independent in our actions, or to follow the legitimate instructions of a suitably legitimate authority.

Sorry, the only Cherryh I've read is "The Scapegoat." I thought it gave a good impression of how alien values would look to humans, but wish it had given some more ideas about what it was that made elves think so differently.

Hmmm. I haven't read that one.

Comment author: Ghatanathoah 30 October 2012 01:14:39AM *  0 points

I'd say sometimes A, and sometimes B. But I think that's true even in the absence of mental disorders; I don't think that the "ideal equation" necessarily sits somewhere hidden in the human psyche.

It's not that I think there's literally a math equation locked in the human psyche that encodes morality. It's more like there are multiple (sometimes conflicting) moral values and methods for resolving conflicts between them and that the sum of these can be modeled as a large and complicated equation.

That is valid, as long as both systems have the same goals. Marvin's system includes the explicit goal "stay alive", more heavily weighted than the goal "keep a stranger alive"; Fred's system explicitly entirely excludes the goal "stay alive".

You gave me the impression that Marvin valued "staying alive" less as an end in itself, and more as a means to achieve the end of improving the world, in particular when you said this:

Marvin's moral system considers the total benefit to the world of every action; but he tends to weight actions in favour of himself, because he knows that in the future, he will always choose to do the right thing (by his morality) and thus deserves ties broken in his favour.

This is actually something that bothers me in fiction when a character who is superhumanly good and powerful (e.g. Superman, the Doctor) risks their life to save a relatively small number of people. It seems short-sighted of them to do that since they regularly save much larger groups of people and anticipate continuing to do so in the future, so it seems like they should preserve their lives for those people's sakes.

Well, to be fair, in a Paperclipper's mind, paperclips are the positive things in life, and they certainly make the paperclipper happier.

I get the impression that the paperclipper doesn't feel happiness, just a raw motivation to increase the number of paperclips.

I think that we can agree that a moral system's goals should be the good things in life. I'm less certain that we can agree on what those good things necessarily are, or on how they should be combined relative to each other.

If you define "the good things in life" as "whatever an entity wants the most," then you can agree that whatever someone wants is "good," be it paperclips or eudaemonia. On the other hand, I'm not sure we should do this; there are some hypothetical entities I can imagine where I can't see it as ever being good that they get what they want. For instance, I can imagine a Human-Torture-Maximizer that wants to do nothing but torture human beings. It seems to me that even if there were a trillion Human-Torture-Maximizers and one human in the universe, it would be bad for them to get what they want.

For more neutral, but still alien preferences, I'm less sure. It seems to me that I have a right to stop Human-Torture-Maximizers from getting what they want. But would I have the right to stop paperclippers? Making the same paperclip over and over again seems like a pointless activity to me, but if the paperclippers are willing to share part of the universe with existing humans do I have a right to stop them? I don't know, and I don't think Eliezer does either.

(I expect that if we really go to the point of thoroughly dissecting what we consider to be the good things in life, then we'll agree more than we disagree; I expect we'll be over 95% in agreement, but not quite 100%. This is what I generally expect for any stranger).

I think that we, and most humans, have the same basic desires; where we differ is the object of those desires, and the priority of those desires.

For instance, most people desire romantic love. But those desires usually have different objects: I desire romantic love with my girlfriend, while other people desire it with their significant others. Similarly, most people desire to consume stories, but the object of that desire differs: some people like Transformers, others The Notebook.

Similarly, people often desire the same things but differ in their priorities, that is, in how much of those things they want. Most people desire both socializing and quiet solitude, but extroverts want lots of the former and less of the latter, while introverts are the opposite.

In the case of the paperclippers, my first instinct is to regard opposing paperclipping as no different from the many ways humans have persecuted each other for wanting different things in the past. But then it occurred to me that paperclip-maximizing might be different because most persecutions in the past involve persecuting people who have different objects and priorities, not people who actually have different desires. For instance, homosexuality is the same kind of desire as heterosexuality, just with a different object (same sex instead of opposite).

Does this mean it isn't bad to oppose paperclipping? I don't know, maybe, but maybe not. Maybe we should just try to avoid creating paperclippers or similar creatures so we don't have to deal with it.

For example, we might disagree on whether it is more important to be independent in our actions, or to follow the legitimate instructions of a suitably legitimate authority.

This seems like a difference in priority, rather than desire, as most people would prefer differing proportions of both. It's still a legitimate disagreement, but I think it's more about finding a compromise between conflicting priorities, rather than totally different values.

Compounding this problem is the fact that people value diversity to some extent. We don't value all types of diversity, obviously; I think we'd all like to live in a world where people held unanimous views on the unacceptability of torturing innocent people. But we would like other people to be different from us in some ways. Most people, I think, would rather live in a world full of different people with different personalities than a world consisting entirely of exact duplicates (in both personality and memory) of one person. So it might be impossible to reach full agreement on those other values without screwing up the achievement of the Value of Diversity.

Comment author: CCC 31 October 2012 08:29:39AM *  0 points

It's not that I think there's literally a math equation locked in the human psyche that encodes morality. It's more like there are multiple (sometimes conflicting) moral values and methods for resolving conflicts between them and that the sum of these can be modeled as a large and complicated equation.

I'm sorry, there's an ambiguity there - when you say "the sum of these", are you summing across the moral values and imperatives of a single person, or of humanity as a whole?

You gave me the impression that Marvin valued "staying alive" less as an end in itself, and more as a means to achieve the end of improving the world, in particular when you said this:

You are quite correct. I apologise; I changed that example several times from where I started, and it seems that one of my last-minute changes actually made it a worse example (my aim was to try to show how the explicit aim of self-preservation could be a reasonable moral aim, but in the process I made it not a moral aim at all). I should watch out for that in the future.

This is actually something that bothers me in fiction when a character who is superhumanly good and powerful (e.g. Superman, the Doctor) risks their life to save a relatively small number of people. It seems short-sighted of them to do that since they regularly save much larger groups of people and anticipate continuing to do so in the future, so it seems like they should preserve their lives for those people's sakes.

I've always felt that was because one of the effects of great power is that it's so very easy to let everyone die. With great power, as Spider-Man is told, comes great responsibility; one way to ensure that you're not letting your own power go to your head is by refusing to not-rescue anyone. After all, if the average science hero lets everyone he thinks is an idiot die, then who would be left?

Sometimes there's a different reason, though; Sherlock Holmes would ignore a straightforward and safe case to catch a serial killer in order to concentrate on a tricky and deadly case involving a stolen diamond; he wasn't in the detective business to help people, he was in the detective business in order to be challenged, and he would regularly refuse to take cases that did not challenge him.

(That's probably a fair example as well, actually; for Holmes, only the challenge, the mental stimulation of a worthy foe, is important; for Superman, what is important is the saving of lives, whether from a mindless tsunami or Lex Luthor's latest plot).

I think that we, and most humans, have the same basic desires; where we differ is the object of those desires, and the priority of those desires.

Hmmm. If you're willing to accept zero, or near-zero, as a priority, then that statement can apply to any two sets of desires. Consider Sherlock Holmes and a paperclipper; Holmes' desire for mental stimulation is high-priority, his desire for paperclips is zero-priority, while the paperclipper's desire for paperclips is high-priority, and its desire for mental stimulation is zero-priority. (Some desires may have negative priority, which can then be interpreted as a priority to avoid that outcome - for example, my desire to immerse my hand in acid is negative, but a masochist may have a positive priority for that desire.)

This implies that, in order to meaningfully differentiate the above statement from "some people have different desires", I may have to designate some very low priority, below which the desire is considered absent (I may, of course, place that line at exactly zero priority). Some desires, however, may have no priority on their own, but inherit priority from another desire that they feed into; for example, a paperclipper has zero desire for self-preservation on its own, but it will desire self-preservation so that it can better create more paperclips.
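To make the cutoff and the inherited priority concrete, here is a minimal sketch. Everything in it is a hypothetical illustration: the `THRESHOLD` value, the `priority` and `serves` fields, and the numbers are assumptions chosen for the example, not a claim about how desires actually work.

```python
# Hypothetical sketch: desires with direct priorities, a cutoff below which a
# desire counts as absent, and priority inherited from a desire it feeds into.

THRESHOLD = 0.0  # assumption: a priority at or below this means "desire absent"

def effective_priority(agent, desire, seen=frozenset()):
    """Direct priority, or the highest priority inherited from a desire it serves."""
    direct = agent["priority"].get(desire, 0.0)
    if desire in seen:  # guard against circular "serves" links
        return direct
    inherited = [
        effective_priority(agent, target, seen | {desire})
        for target in agent["serves"].get(desire, [])
    ]
    return max([direct] + inherited)

def has_desire(agent, desire):
    """A desire counts as present only if its effective priority clears the cutoff."""
    return effective_priority(agent, desire) > THRESHOLD

# The paperclipper has no direct desire for self-preservation, but
# self-preservation serves paperclip production and so inherits its priority.
paperclipper = {
    "priority": {"paperclips": 1.0},
    "serves": {"self_preservation": ["paperclips"]},
}
holmes = {"priority": {"mental_stimulation": 1.0}, "serves": {}}

paperclipper_wants_to_live = has_desire(paperclipper, "self_preservation")
holmes_wants_paperclips = has_desire(holmes, "paperclips")
```

With the line drawn at exactly zero, Holmes simply lacks the desire for paperclips, while the paperclipper's self-preservation shows up only through the goal it feeds into.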

Now, given a pool of potential goals, most people will pick out several desires from that pool, and there will be a large overlap between any two people (for example, most humans desire to eat - most but not all, there are certain eating disorders that can mess with that), and it is possible to pick out a set of desires that most people will have high priorities for.

It's even probably possible to pick out a (smaller) set of desires such that those who do not have those desires at some positive priority are considered psychologically unhealthy. But such people nonetheless do exist.

Does this mean it isn't bad to oppose paperclipping? I don't know, maybe, but maybe not.

In my personal view, it is neutral to paperclip or to oppose paperclipping. It becomes bad to paperclip only when the paperclipping takes resources away from something more important.

And there are circumstances (somewhat forced circumstances) where it could be good to paperclip.

For example, we might disagree on whether it is more important to be independent in our actions, or to follow the legitimate instructions of a suitably legitimate authority.

This seems like a difference in priority, rather than desire, as most people would prefer differing proportions of both. It's still a legitimate disagreement, but I think it's more about finding a compromise between conflicting priorities, rather than totally different values.

There exist people who would place negative value on the idea of following the instructions of any legitimate authority. (They tend to remain a small and marginal group, because they cannot in turn form an authority for followers to follow without rampant hypocrisy).

Compounding this problem is the fact that people value diversity to some extent. We don't value all types of diversity, obviously; I think we'd all like to live in a world where people held unanimous views on the unacceptability of torturing innocent people. But we would like other people to be different from us in some ways. Most people, I think, would rather live in a world full of different people with different personalities than a world consisting entirely of exact duplicates (in both personality and memory) of one person. So it might be impossible to reach full agreement on those other values without screwing up the achievement of the Value of Diversity.

Yes, diversity has many benefits. The second-biggest benefit of diversity is that some people will be more correct than others, and this can be seen in the results they get; then everyone can re-diversify around the most correct group (a slow process, taking generations, as the most successful group slowly outcompetes the rest and thus passes their memes to a greater and/or more powerful proportion of the next generation). By a similar token, it means that when something happens that destroys one type of person, it doesn't destroy everyone (bananas have a definite problem there, being a bit of a monoculture).

The biggest benefit is that it leads to social interaction. A completely non-diverse society would have to be a hive mind (or different experiences would slowly begin to introduce diversity), and it would be a very lonely hive mind, with no-one to talk to.

Comment author: Ghatanathoah 31 October 2012 09:26:11AM *  -1 points

I'm sorry, there's an ambiguity there - when you say "the sum of these", are you summing across the moral values and imperatives of a single person, or of humanity as a whole?

Nearly all of humanity as a whole. There are obviously some humans who don't really value morality, we call them sociopaths, but I think most humans care about very similar moral concepts. The fact that people have somewhat different personal preferences and desires at first might seem to challenge this idea, but I don't really think it does. It just means that there are some desires that generate the same "value" of "good" when fed into the "equation." In fact, if diversity is a good, as we discussed previously, then people having different personal preferences might in fact be morally desirable.

Hmmm. If you're willing to accept zero, or near-zero, as a priority, then that statement can apply to any two sets of desires... This implies that, in order to meaningfully differentiate the above statement from "some people have different desires", I may have to designate some very low priority, below which the desire is considered absent

That's a good point. I was considering using the word "proportionality" instead of "priority" to better delineate that I don't accept zero as a priority, but rejected it because it sounded clunky. Maybe I shouldn't have.

In my personal view, it is neutral to paperclip or to oppose paperclipping. It becomes bad to paperclip only when the paperclipping takes resources away from something more important.

I agree with that. What I'm wondering is, would I have a moral duty to share resources with a paperclipper if it existed, or would pretty much any of the things I'd spend the resources on if I kept them for myself (i.e. eudaemonic things) count as "something more important"?

There exist people who would place negative value on the idea of following the instructions of any legitimate authority.

I think there might actually be lots of people like this, but most appear normal because they place even greater negative value on doing something stupid because they ignored good advice just because it came from an authority. In other words, following authority is a negative terminal value, but an extremely positive instrumental value.

The biggest benefit is that it leads to social interaction. A completely non-diverse society would have to be a hive mind (or different experiences would slowly begin to introduce diversity), and it would be a very lonely hive mind, with no-one to talk to.

Exactly. I would still want the world to be full of a diverse variety of people, even if I had a nonsentient AI that was right about everything and could serve my every bodily need.

Comment author: CCC 31 October 2012 10:08:17AM 0 points

I'm sorry, there's an ambiguity there - when you say "the sum of these", are you summing across the moral values and imperatives of a single person, or of humanity as a whole?

Nearly all of humanity as a whole. There are obviously some humans who don't really value morality, we call them sociopaths, but I think most humans care about very similar moral concepts.

Okay then, next question; how do you decide which people to exclude? You say that you are excluding sociopaths, and I think that they should be excluded; but on exactly what basis? If you're excluding them simply because they fail to have the same moral imperatives as the ones that you think are important, then that sounds very much like a No True Scotsman argument to me. (I exclude them mainly on an argument of appeal to authority, myself, but that also has logic problems; in either case, it's a matter of first sketching out what the moral imperative should be, then throwing out the people who don't match).

And for a follow-up question; is it necessary to limit it to humanity? Let us assume that, ten years from now, a flying saucer lands in the middle of Durban, and we meet a sentient alien form of life. Would it be necessary to include their moral preferences in the equation as well?

Even if they are Pebblesorters?

In fact, if diversity is a good, as we discussed previously, then people having different personal preferences might in fact be morally desirable.

It may be, but only within a limited range. A serial killer is well outside that range, even if he believes that he is doing good by only killing "evil" people (for some definition of "evil").

What I'm wondering is, would I have a moral duty to share resources with a paperclipper if it existed, or would pretty much any of the things I'd spend the resources on if I kept them for myself (i.e. eudaemonic things) count as "something more important"?

Hmmm. I think I'd put "buying a packet of paperclips for the paperclipper" as on the same moral footing, more or less, as "buying an ice cream for a small child". It's nice for the person (or paperclipper) receiving the gift, and that makes it a minor moral positive by increasing happiness by a tiny fraction. But if you could otherwise spend that money on something that would save a life, then that clearly takes priority.

I think there might actually be lots of people like this, but most appear normal because they place even greater negative value on doing something stupid because they ignored good advice just because it came from an authority. In other words, following authority is a negative terminal value, but an extremely positive instrumental value.

Hmmm. Good point; that is quite possible. (Given how many people seem to follow any reasonably persuasive authority, though, I suspect that most people have a positive priority for this goal - this is probably because, for a lot of human history, peasants who disagreed with the aristocracy tended to have fewer descendants unless they all disagreed and wiped out said aristocracy).

Exactly. I would still want the world to be full of a diverse variety of people, even if I had a nonsentient AI that was right about everything and could serve my every bodily need.

Here's a tricky question - what exactly are the limits of "nonsentient"? Can a nonsentient AI fake it by, with clever use of holograms and/or humanoid robots, causing you to think that you are surrounded by a diverse variety of people even when you are not (thus supplying the non-bodily need of social interaction)? The robots would all be philosophical zombies, of course; but is there any way to tell?

Comment author: Ghatanathoah 31 October 2012 11:02:54AM *  1 point

Okay then, next question; how do you decide which people to exclude?

I don't think I'm coming across right. I'm not saying that morality is some sort of collective agreement of people in regards to their various preferences. I'm saying that morality is a series of concepts such as fairness, happiness, freedom etc., that these concepts are objective in the sense that it can be objectively determined how much fairness, freedom, happiness etc. there is in the world, and that the sum of these concepts can be expressed as a large equation.

People vary in their preference for morality. Most people care about fairness, freedom, happiness, etc. to some extent, but there are some people who don't care about morality at all, such as sociopaths.

Morality isn't a preference. It isn't the part of a person's brain that says "This society is fair and free and happy, therefore I prefer it." Morality is those disembodied concepts of freedom, fairness, happiness, etc. So if a person doesn't care about those things, it doesn't mean that freedom, fairness, happiness, etc. aren't part of their morality. It means that person doesn't care about morality; they care about something else.

To use the Pebblesorter analogy again, the fact that you and I don't care about sorting pebbles into prime-numbered heaps isn't because we have our own concept of "primeness" that doesn't include 2, 3, 5 and 7. It just means we don't care about primeness.

To make another analogy, if most people preferred wearing wool clothes but one person preferred cotton, that wouldn't mean that that person had their own version of wool, which was cotton. It means that that person doesn't prefer wool.

Look inward, and consider why you think most people should be included. Presumably it's because you really care a lot about being fair. But that necessarily means that you cared about fairness before you even considered what other people might think. Otherwise it wouldn't have even occurred to you to think about what they preferred in the first place.

The fact that most humans care, to some extent, about the various facets of morality is a very lucky thing; a planet full of sociopaths would be most unpleasant. But it isn't relevant to the truth of morality. You'd still think torturing people was bad if all the non-sociopaths on Earth except you were killed, wouldn't you? If, in that devastated world, you came across a sociopath torturing another sociopath or an animal, and could stop them at no risk to yourself, you'd do it, wouldn't you?

You say that you are excluding sociopaths, and I think that they should be excluded; but on exactly what basis?

I suspect that your intuition comes from the fact that a central part of morality is fairness, and sociopaths don't care about fairness. Obviously being fair to the unfair is as unwise as tolerating the intolerant.

And for a follow-up question; is it necessary to limit it to humanity? Let us assume that, ten years from now, a flying saucer lands in the middle of Durban, and we meet a sentient alien form of life. Would it be necessary to include their moral preferences in the equation as well?

Again, I want to emphasize that morality isn't the "preference" part, it's the "concepts" part. But the question of the moral significance of aliens is relevant; I think it would depend on how many of the concepts that make up morality they cared about. I think that at a bare minimum they'd need fairness and sympathy.

So if the Pebblesorters that came out of that ship were horrified that we didn't care about primality, but were willing to be fair and share the universe with us, they'd be a morally worthwhile species. But if they had no preference for fairness or any sympathy at all, and would gladly kill a billion humans to sort a few more pebbles, that would be a different story. In that case we should probably, after satisfying ourselves that all Pebblesorters were psychologically similar, start prepping a Relativistic Kill Vehicle to point at their planet if they try something.

Here's a tricky question - what exactly are the limits of "nonsentient"? Can a nonsentient AI fake it by, with clever use of holograms and/or humanoid robots, causing you to think that you are surrounded by a diverse variety of people even when you are not (thus supplying the non-bodily need of social interaction)? The robots would all be philosophical zombies, of course; but is there any way to tell?

I don't know if I could tell, but I'd very much prefer that the AI not do that, and would consider myself to have been massively harmed if it did, even if I never found out. My preference is to actually interact with a diverse variety of people, not to merely have a series of experiences that seem like I'm doing it.

Comment author: CCC 01 November 2012 07:33:30AM *  0 points

I don't think I'm coming across right. I'm not saying that morality is some sort of collective agreement of people in regards to their various preferences. I'm saying that morality is a series of concepts such as fairness, happiness, freedom etc., that these concepts are objective in the sense that it can be objectively determined how much fairness, freedom, happiness etc. there is in the world, and that the sum of these concepts can be expressed as a large equation.

Ah, I think I see your point. What you're saying - and correct me if I'm wrong - is that there is some objective True Morality, some complex equation that, if applied to any possible situation, will tell you how moral a given act is.

This is probably true.

This equation isn't written into the human psyche; it exists independently of what people think about morality. It just is. And even if we don't know exactly what the equation is, even if we can't work out the morality of a given act down to the tenth decimal place, we can still apply basic heuristics and arrive at a usable estimate in most situations.

My question is, then - assuming the above is true, how do we find that equation? Does there exist some objective method whereby you, I, a Pebblesorter, and a Paperclipper can all independently arrive at the same definition for what is moral (given that the Pebblesorter and Paperclipper will almost certainly promptly ignore the result)?

(I had thought that you were proposing that we find that equation by summing across the moral values and imperatives of humanity as a whole - excluding the psychopaths. This is why I asked about the exclusion, because it sounded a lot like writing down what you wanted at the end of the page and then going back and discarding the steps that wouldn't lead there; that is also why I asked about the aliens).

I don't know if I could tell, but I'd very much prefer that the AI not do that, and would consider myself to have been massively harmed if it did, even if I never found out. My preference is to actually interact with a diverse variety of people, not to merely have a series of experiences that seem like I'm doing it.

Yes, I think we're in agreement on that. (Though this does suggest that 'sentient' may need a proper definition at some point).

Comment author: nshepperd 01 November 2012 09:35:41AM 1 point

What you're saying - and correct me if I'm wrong - is that there is some objective True Morality, some complex equation that, if applied to any possible situation, will tell you how moral a given act is.

In the same way as there exists a True Set of Prime Numbers, and True Measure of How Many Paperclips There Are...

Comment author: Ghatanathoah 01 November 2012 08:56:18AM *  -1 points

My question is, then - assuming the above is true, how do we find that equation?

Even though the equation exists independently of our thoughts (the same way primality exists independently from Pebblesorter thoughts), the fact that we are capable of caring about the results given by the equation means we must have some parts of it "written" in our heads, the same way Pebblesorters must have some concept of primality "written" in their heads. Otherwise, how would we be capable of caring about its results?

I think that probably evolution metaphorically "wrote" a desire to care about the equation in our heads because if humans care about what is good and right it makes it easier for them to cooperate and trust each other, which has obvious fitness advantages. Of course, the fact that evolution did a good thing by causing us to care about morality doesn't mean that evolution is always good, or that evolutionary fitness is a moral justification for anything. Evolution is an amoral force that causes many horrible things to happen. It just happened that in this particular instance, evolution's amoral metaphorical "desires" coincided with what was morally good. That coincidence is far from the norm; in fact, evolution probably deleted morality from the brains of sociopaths because double-crossing morally good people also sometimes confers a fitness advantage.

So how do we learn more about this moral equation that we care about? One common form of attempting to get approximations of it in philosophy is called reflective equilibrium, where you take your moral imperatives and heuristics and attempt to find the commonalities and consistencies they have with each other. It's far from perfect, but I think that this method has produced useful results in the past.

Eliezer has proposed what is essentially a souped up version of reflective equilibrium called Coherent Extrapolated Volition. He has argued, however, that the primary use of CEV is in designing AIs that won't want to kill us, and that attempting to extrapolate other people's volition is open to corruption, as we could easily fall to the temptation to extrapolate it to something that personally benefits us.

Does there exist some objective method whereby you, I, a Pebblesorter, and a Paperclipper can all independently arrive at the same definition for what is moral (given that the Pebblesorter and Paperclipper will almost certainly promptly ignore the result)?

Again, we could probably get closer through reflective equilibrium, and by critiquing the methods and results of each other's reflections. If you somehow managed to get a Pebblesorter or a Paperclipper to do it too, they might generate similar results, although since they don't intrinsically care about the equation you would probably have to give them some basic instructions before they started working on the problem.

I had thought that you were proposing that we find that equation by summing across the moral values and imperatives of humanity as a whole - excluding the psychopaths.

If we assume that most humans care about acting morally, doing research about what people's moral imperatives are might be somewhat helpful, since it would allow us to harvest the fruits of other people's moral reflections and compare them with our own. We can exclude sociopaths because there is ample evidence that they care nothing for morality.

Although I suppose that a super-genius sociopath who had the basic concept explained to them might be able to do some useful work, in the same fashion that a Pebblesorter or Paperclipper might be able to. Of course, the genius sociopath wouldn't care about the results, and probably would have to be paid a large sum to even agree to work on the problem.

Comment author: CCC 01 November 2012 02:14:17PM 0 points [-]

I think that probably evolution metaphorically "wrote" a desire to care about the equation in our heads because if humans care about what is good and right it makes it easier for them to cooperate and trust each other, which has obvious fitness advantages.

Hmmm. That which evolution has "written" into the human psyche could, in theory, and given sufficient research, be read out again (and will almost certainly not be constant across most of humanity, but will rather exist with variations). But I doubt that morality is all in our genetic nature; I suspect that most of it is learned, from our parents, aunts, uncles, grandparents and other older relatives; I think, in short, that morality is memetic rather than genetic. Though evolution still happens in memetic systems just as well as in genetic systems.

So how do we learn more about this moral equation that we care about? One common form of attempting to get approximations of it in philosophy is called reflective equilibrium, where you take your moral imperatives and heuristics and attempt to find the commonalities and consistencies they have with each other. It's far from perfect, but I think that this method has produced useful results in the past.

Hmmm. Looking at the Wikipedia article, I can expect reflective equilibrium to produce a consistent moral framework. I also expect a correct moral framework to be consistent; but not all consistent moral frameworks are correct. (A paperclipper does not have what I'd consider a correct moral framework, but it does have a consistent one.)

If you start out close to a correct moral framework, then reflective equilibrium can move you closer, but it doesn't necessarily do so.

Eliezer has proposed what is essentially a souped up version of reflective equilibrium called Coherent Extrapolated Volition. He has argued, however, that the primary use of CEV is in designing AIs that won't want to kill us, and that attempting to extrapolate other people's volition is open to corruption, as we could easily fall to the temptation to extrapolate it to something that personally benefits us.

Hmmm. The primary use of trying to find the True Morality Equation, to my mind, is to work it into a future AI. If we can find such an equation, prove it correct, and make an AI that maximises its output value, then that would be an optimally moral AI. This may or may not count as Friendly, but it's certainly a potential contender for the title of Friendly.

Again, we could probably get closer through reflective equilibrium, and by critiquing the methods and results of each other's reflections. If you somehow managed to get a Pebblesorter or a Paperclipper to do it too, they might generate similar results, although since they don't intrinsically care about the equation you would probably have to give them some basic instructions before they started working on the problem.

Carrying through this method to completion could give us - or anyone else - an equation. But is there any way to be sure that it necessarily gives us the correct equation? (A pebblesorter may actually be a very good help in resolving this question; he does not care about morality, and therefore does not have any emotional investment in the research).

The first thought that comes to my mind is to have a very large group of researchers, divide them into N groups, and have each of these groups attempt, independently, to find an equation; if all of the groups find the same equation, this would be evidence that the equation found is correct (with stronger evidence at larger values of N). However, I anticipate that the acquired results would be N subtly different, but similar, equations.

Comment author: Ghatanathoah 01 November 2012 02:36:52PM -1 points [-]

But I doubt that morality is all in our genetic nature; I suspect that most of it is learned, from our parents, aunts, uncles, grandparents and other older relatives; I think, in short, that morality is memetic rather than genetic.

That's possible. But memetics can't build morality out of nothing. At the very least, evolved genetics has to provide a "foundation," a part of the brain that moral memes can latch onto. Sociopaths lack that foundation, although the research is inconclusive as to what extent this is caused by genetics, and to what extent it is caused by later developmental factors (it appears to be a mix of some sort).

Hmmm. Looking at the Wikipedia article, I can expect reflective equilibrium to produce a consistent moral framework. I also expect a correct moral framework to be consistent; but not all consistent moral frameworks are correct.

Yes, that's why I consider reflective equilibrium to be far from perfect. Depending on how many errors you latch onto, it might worsen your moral state.

Carrying through this method to completion could give us - or anyone else - an equation. But is there any way to be sure that it necessarily gives us the correct equation?

Considering how morally messed up the world is now, even an imperfect equation would likely be better (closer to being correct) than our current slapdash moral heuristics. At this point we haven't even achieved "good enough," so I don't think we should worry too much about being "perfect."

However, I anticipate that the acquired results would be N subtly different, but similar, equations.

That's not inconceivable. But I think that each of the subtly different equations would likely be morally better than pretty much every approximation we currently have.

Comment author: TheOtherDave 31 October 2012 03:12:46PM 0 points [-]

So, OK. Suppose, on this account, that you and I both care about morality to the same degree... that is, you don't care about morality more than I do, and I don't care about morality more than you do. (I'm not sure how we could ever know that this was the case, but just suppose hypothetically that it's true.)

Suppose we're faced with a situation in which there are two choices we can make. Choice A causes a system to be more fair, but less free. Choice B leaves that system unchanged. Suppose, for simplicity's sake, that those are the only two choices available, and we both have all relevant information about the system.

On your account, will we necessarily agree on which choice to make? Or is it possible, in that situation, that you might choose A and I choose B, or vice-versa?

Comment author: Ghatanathoah 31 October 2012 09:12:44PM *  0 points [-]

I think it depends on the degree of the change. If the change is very lopsided (e.g. -100 freedom, +1 fairness) I think we'd both choose B.

If we assume that the degree of change is about the same (e.g. +1 fairness, -1 freedom) it would depend on how much freedom and fairness already exist. If the system is very fair, but very unfree, we'd both choose B, but if it's very free and very unfair we'd both choose A.

However, if we are to assume that the gain in fairness and the loss in freedom are of approximately equivalent size and the current system has fairly large amounts of both freedom and fairness (which I think is what you meant) then it might be possible that we'd have a disagreement that couldn't be resolved with pure reasoning.

This is called moral pluralism, the idea that there might be multiple moral values (such as freedom, fairness, and happiness) which are objectively correct, imperfectly commensurable with each other, and can be combined in different proportions that are of approximately equivalent objective moral value. If this is the case then your preference for one set of proportions over the other might be determined by arbitrary factors of your personality.

This is not the same as moral relativism, as these moral values are all objectively good, and any society that severely lacks one of them is objectively bad. It's just that there are certain combinations with different proportions of values that might both be "equally good," and personal preferences might be the "tiebreaker." To put it in more concrete terms, a social democracy with low economic regulation and a small welfare state might be "just as good" as a social democracy with slightly higher economic regulation and a slightly larger welfare state, and people might honestly and irresolvably disagree over which one is better. However, both of those societies would definitely be objectively better than Cambodia under the Khmer Rouge, and any rational, fully informed person who cares about morality would be able to see that.

Of course, if we were both highly rational and moral, and disagreed about A vs. B, we'd both agree that fighting over them excessively would be morally worse than choosing either of them, and we'd find some way to resolve our disagreement, even if it meant flipping a coin.

Comment author: TheOtherDave 31 October 2012 10:14:48PM 0 points [-]

I agree with you that in sufficiently extreme cases, we would both make the same choice. Call that set of cases S1.

I think you're saying that if the case is not that extreme, we might not make the same choice, even though we both care equally about the thing you're using "morality" to refer to. I agree with that as well. Call that set of cases S2.

I also agree that even in S2, there's a vast class of options that we'd both agree are worse than either of our choices (as you illustrate with the Khmer Rouge), and a vast class of options that we'd both agree are better than either of our choices, supposing that we are as you suggest rational informed people who care about the thing you're using "morality" to refer to.

If I'm understanding you, you're saying in S2 we are making different decisions, but our decisions are equally good. Further, you're saying that we might not know that our decisions are equally good. I might make choice A and think choice B is wrong, and you might make choice B and think choice A is wrong. Being rational and well-informed people we'd agree that both A and B are better than the Khmer Rouge, and we might even agree that they're both better than fighting over which one to adopt, but it might still remain true that I think B is wrong and you think A is wrong, even though neither of us thinks the other choice is as wrong as the Khmer Rouge, or fighting about it, or setting fire to the building, or various other wrong things we might choose to evaluate.

Have I followed your position so far?

Comment author: Ghatanathoah 01 November 2012 04:23:51AM -1 points [-]

Yes, I think so.