TheAncientGeek comments on The genie knows, but doesn't care - Less Wrong
Mark was complaining he would not get "his" morality, not that he wouldn't get all his preferences satisfied.
Individual moralities make no sense to me, any more than private languages or personal currencies.
It is obvious to me that any morality will require concessions: AI-imposed morality is not special in that regard.
I don't understand your comment, and I no longer understand your grandparent comment either. Are you using a meaning of "morality" that is distinct from "preferences"? If yes, can you describe your assumptions in more detail? It's not just for my benefit, but for many others on LW who use "morality" and "preferences" interchangeably.
Do that many people really use them interchangeably? Would these people understand the question "Do you prefer chocolate or vanilla ice-cream?" as completely identical in meaning to "Do you consider chocolate or vanilla as the morally superior flavor for ice-cream?"
I don't care about colloquial usage, sorry. Eliezer has a convincing explanation of why wishes are intertwined with morality ("there is no safe wish smaller than an entire human morality"). IMO the only sane reaction to that argument is to unify the concepts of "wishes" and "morality" into a single concept, which you could call "preference" or "morality" or "utility function", and just switch to using it exclusively, at least for AI purposes. I made that switch so long ago that I've forgotten how to think otherwise.
I recommend you re-learn how to think otherwise so you can fool humans into thinking you're one of them ;-).
"Intertwined with" does not mean "the same as".
I am not convinced by the explanation. It also applies to non-moral preferences. If I have a lower priority non-moral preference to eat tasty food, and a higher priority preference to stay slim, I need to consider my higher priority preference when wishing for yummy ice cream.
To be sure, an agent capable of acting morally will have morality among their higher priority preferences -- it has to be among the higher order preferences, because it has to override other preferences for the agent to act morally. Therefore, when they scan their higher priority preferences, they will happen to encounter their moral preferences. But that does not mean any preference is necessarily a moral preference. And their moral preferences override other preferences which are therefore non-moral, or at least less moral.
Therefore morality is a subset of preferences, as common sense maintained all along.
IMO, it is better to keep one's options open.
I don't experience the emotions of moral outrage and moral approval whenever any of my preferences are hindered/satisfied -- so it seems evident that my moral circuitry isn't identical to my preference circuitry. It may overlap in parts, it may have fuzzy boundaries, but it's not identical.
My own view is that morality is the brain's attempt to extrapolate preferences about behaviours as they would be if you had no personal stakes/preferences about a situation.
So people don't get morally outraged at other people eating chocolate icecreams, even when they personally don't like chocolate icecreams, because they can understand that's a strictly personal preference. If they believe it to be more than personal preference and make it into e.g. "divine commandment" or "natural law", then moral outrage can occur.
That morality is a subjective attempt at objectivity explains many of the confusions people have about it.
The ice cream example is bad because the consequences are purely internal to the person consuming the ice cream. What if the chocolate ice cream was made with slave labour? Many people would then object to you buying it on moral grounds.
Eliezer has produced an argument I find convincing that morality is the back propagation of preference to the options of an intermediate choice. That is to say, it is "bad" to eat chocolate ice cream because it economically supports slavers, and I prefer a world without slavery. But if I didn't know about the slave-labour ice cream factory, my preference would be that all-things-being-equal you get to make your own choices about what you eat, and therefore I prefer that you choose (and receive) the one you want, which is your determination to make, not mine.
Do you agree with EY's essay on the nature of right-ness which I linked to?
That doesn't seem to be required for Eliezer's argument...
I guess the relevant question is, do you think FAI will need to treat morality differently from other preferences?
I would prefer an AI that followed my extrapolated preferences to an AI that followed my morality. But an AI that followed my morality would be morally superior to an AI that followed my extrapolated preferences.
If you don't understand the distinction I'm making above, consider a case of the AI having to decide whether to save my own child vs saving a thousand random other children. I'd prefer the former, but I believe the latter would be the morally superior choice.
Is that idea really so hard to understand? Would you dismiss the distinction I'm making as merely colloquial language?
Wow, there is so much wrapped up in this little consideration. The heart of the issue is that we (by which I mean you, but I share your dilemma) have truly conflicting preferences.
Honestly I think you should not be afraid to say that saving your own child is the moral thing to do. And you don't have to give excuses either - it's not that “if everyone saved their own child, then everyone's child will be looked after” or anything like that. No, the desire to save your own child is firmly rooted in our basic drives and preferences, enough so that we can go quite far in calling it a basic foundational moral axiom. It's not actually axiomatic, but we can safely treat it as such.
At the same time we have a basic preference to seek social acceptance and find commonality with the people we let into our lives. This drives us to want outcomes that are universally or at least most-widely acceptable, and seek moral frameworks like utilitarianism which lead to these outcomes. Usually this drive is secondary to self-serving preferences for most people, and that is OK.
For some reason you've called making decisions in favor of self-serving drives "preferences" and decisions in favor of social drives "morality." But the underlying mechanism is the same.
"But wait, if I choose self-serving drives over social conformity, doesn't that lead me to make the decision to save one life to the exclusion of 1000 others?" Yes, yes it does. This massive sub-thread started with me objecting to the idea that some "friendly" AI somewhere could derive morality experimentally from my preferences or the collective preferences of humankind, make it consistent, apply the result universally, and that I'd be OK with that outcome. But that cannot work because there is not, and cannot be, a universal morality that satisfies everyone: every one of those thousand other children has parents who want their kid to survive and would see your child dead if need be.
What do you mean by "should not"?
What do you mean by "OK"?
Show me the neurological studies that prove it.
Yes, and yet if none of the children were mine, and if I wasn't involved in the situation at all, I would say "save the 1000 children rather than the 1". And if someone else, also not personally involved, could make the choice and chose to flip a coin instead in order to decide, I'd be morally outraged at them.
You can now give me a bunch of reasons why this is just preference, while at the same time EVERYTHING about it (how I arrive at my judgment, how I feel about the judgment of others) makes it a whole distinct category of its own. I'm fine with abolishing useless categories when there's no meaningful distinction, but all you people should stop trying to abolish categories where there pretty damn obviously IS one.
I suspect that he means something like "Even though utilitarianism (on LW) and altruism (in general) are considered to be what morality is, you should not let that discourage you from asserting that selfishly saving your own child is the right thing to do". (Feel free to correct me if I'm wrong.)
I've explained to you twice now how the two underlying mechanisms are unified, and pointed to Eliezer's quite good explanation on the matter. I don't see the need to go through that again.
If you were offered a bunch of AIs with equivalent power, but following different mixtures of your moral and non-moral preferences, which one would you run? (I guess you're aware of the standard results saying a non-stupid AI must follow some one-dimensional utility function, etc.)
I guess whatever ratio of my moral and non-moral preferences best represents their effect on my volition.
My related but different thoughts here. In particular, I don't agree that emotions like moral outrage and approval are impersonal, though I agree that we often justify those emotions using impersonal language and beliefs.
I didn't say that moral outrage and approval are impersonal. Obviously nothing that a person does can truly be "impersonal". But it may be an attempt at impersonality.
The attempt itself provides a direction that significantly differentiates between moral preferences and non-moral preferences.
I didn't mean some idealized humanly-unrealizable notion of impersonality, I meant the thing we ordinarily use "impersonal" to mean when talking about what humans do.
Ditto.
Cousin Itt, 'tis a hairy topic, so you're uniquely "suited" to offer strands of insights:
For all the supposedly hard and confusing concepts out there, few have such an obvious answer as the supposed dichotomy between "morality" and "utility function". This in itself is troubling, as too-easy-to-come-by answers trigger the suspicion that I myself am subject to some sort of cognitive error.
Many people I deem to be quite smart would disagree with you and me, on a question whose answer is pretty much inherent in the definition of the term "utility function" encompassing preferences of any kind, leaving no space for some holier-than-thou universal (whether human-universal, or "optimal", or "to be aspired to", or "neurotypical", or whatever other tortured notions I've had to read) moral preferences which are somehow separate.
Why do you reckon that other (or otherwise?) smart people come to different conclusions on this?
I guess they have strong intuitions saying that objective morality must exist, and aren't used to solving or dismissing philosophical problems by asking "what would be useful for building FAI?" From most other perspectives, the question does look open.
Moral preferences don't have to be separate to be distinct; they can be a subset. "Morality is either all your preferences, or none of your preferences" is a false dichotomy.
Edit: Of course you can choose to call a subset of your preferences "moral", but why would that make them "special", or more worthy of consideration than any other "non-moral" preferences of comparative weight?
The "moral" subset of people's preferences has certain elements that differentiate it, e.g. an attempt at universalization.
Attempt at universalization, isn't that a euphemism for proselytizing?
Why would [an agent whose preferences do not much intersect with the "moral" preferences of some group of agents] consider such attempts at universalization any different from other attempts of other-optimizing, which is generally a hostile act to be defended against?
No, people attempt to 'proselytize' their non-moral preferences too. If I attempt to share my love of My Little Pony, that doesn't mean I consider it a need for you to also love it. Even if I preferred that you share my love of it, it would still not be a moral obligation on your part.
By universalization I didn't mean any action done after the adoption of the moral preference in question, I meant the criterion that serves to label it as a 'moral injunction' in the first place. If your brain doesn't register it as an instruction defensible by something other than your personal preferences, if it doesn't register it as a universal principle, it doesn't register as a moral instruction in the first place.
The key issue is that, whilst morality is not tautologously the same as preferences, a morally right action is, tautologously, what you should do.
So it is difficult to see on what grounds Mark can object to the FAI's wishes: if it tells him something is morally right, that is what he should do. And he can't have his own separate morality, because the idea is incoherent.
A distinction to be made: Mark can wish differently than the AI wishes, but Mark can't morally object to the AI's wishes (if the AI follows morality).
Exactly because morality is not the same as preferences.
You can call a subset of your preferences moral, that's fine. Say, eating chocolate icecream, or helping a starving child. Let's take a randomly chosen "morally right action" A.
That, given your second paragraph, would have to be a preference which, what, maximizes Mark's utility, regardless of what the rest of his utility function actually looks like?
It seems to be trivial to construct a utility function (given any such action A) such that doing A does not maximize said utility function. Give Mark such a utility function and you've got yourself a reductio ad absurdum.
So, if you define a subset of preferences named "morally right" such that any such action needs to maximize (edit: or even 'not minimize') an arbitrary utility function, then obviously that subset is empty.
If Mark is capable of acting morally, he would have a preference for moral action which is strong enough to override other preferences. However, that is not really the point. Even if he is too weak-willed to do what the FAI says, he has no grounds to object to the FAI.
I can't see how that amounts to more than the observation that not every agent is capable of acting morally. Ho hum.
I don't see why. An agent should want to do what is morally right, but that doesn't mean an agent would want to. Their utility function might not allow them. But how could they object to being told what is right? The fault, surely, lies in themselves.
I can't speak for cousin_it, natch, but for my own part I think it has to do with mutually exclusive preferences vs orthogonal/mutually reinforcing preferences. Using moral language is a way of framing a preference as mutually exclusive with other preferences.
That is... if you want A and I want B, and I believe the larger system allows (Kawoomba gets A AND Dave gets B), I'm more likely to talk about our individual preferences. If I don't think that's possible, I'm more likely to use universal language ("moral," "optimal," "right," etc.), in order to signal that there's a conflict to be resolved. (Well, assuming I'm being honest.)
For example, "You like chocolate, I like vanilla" does not signal a conflict; "Chocolate is wrong, vanilla is right" does.
Why stop at connotation and signalling? If there is a non-empty set of preferences whose satisfaction is inclined to lead to conflict, and a non-empty set of preferences that can be satisfied without conflict, then "morally relevant preference" can denote the members of the first set... which is not identical to the set of all preferences.
For any such preference, you can immediately provide a utility function such that the corresponding agent would be very unhappy about that preference, and would give its life to prevent it.
Or do you mean "a set of preferences the implementation of which would on balance benefit the largest amount of agents the most"? That would change as the set of agents changes, so does the "correct" morality change too, then?
Also, why should I or anyone else particularly care about such preferences (however you define them), especially as the "on average" doesn't benefit me? Is it because, evolutionarily speaking, that's what evolved? What our mirror neurons lead us towards? Wouldn't that just be a case of the naturalistic fallacy?
Sure. So what? Kids don't like teachers and criminals don't like the police... but they can't object to them, because "entity X is stopping me from doing bad things and making me do good things" is no (rational, adult) objection.
If being moral increases your utility, it increases your utility -- what other sense of "benefitting me" is there?
If utility is the satisfaction of preferences, and you can have preferences that don't benefit you (such as doing heroin), increasing your utility doesn't necessarily benefit you.
(Kids - teachers), (criminals - police), so is "morally correct" defined by the most powerful agents, then?
And if being moral (whatever it may mean) does not?
For my own part: denotationally, yes, I would understand "Do you prefer (that Dave eat) chocolate or vanilla ice cream?" and "Do you consider (Dave eating) chocolate ice cream or vanilla as the morally superior flavor for (Dave eating) ice cream?" as asking the same question.
Connotationally, of course, the latter has all kinds of (mostly ill-defined) baggage the former doesn't.