Kawoomba comments on The genie knows, but doesn't care - Less Wrong
Comments (515)
Ditto.
Cousin Itt, 'tis a hairy topic, so you're uniquely "suited" to offer strands of insights:
For all the supposedly hard and confusing concepts out there, few have such an obvious answer as the supposed dichotomy between "morality" and "utility function". This in itself is troubling, as too-easy-to-come-by answers trigger the suspicion that I myself am subject to some sort of cognitive error.
Many people I deem to be quite smart would disagree with you and me, on a question whose answer is pretty much inherent in the definition of the term "utility function" encompassing preferences of any kind, leaving no space for some holier-than-thou universal (whether human-universal, or "optimal", or "to be aspired to", or "neurotypical", or whatever other tortured notions I've had to read) moral preferences which are somehow separate.
Why do you reckon that other (or otherwise?) smart people come to different conclusions on this?
I guess they have strong intuitions saying that objective morality must exist, and aren't used to solving or dismissing philosophical problems by asking "what would be useful for building FAI?" From most other perspectives, the question does look open.
Moral preferences don't have to be separate to be distinct; they can be a subset. "Morality is either all your preferences, or none of your preferences" is a false dichotomy.
Edit: Of course you can choose to call a subset of your preferences "moral", but why would that make them "special", or more worthy of consideration than any other "non-moral" preferences of comparative weight?
The "moral" subset of people's preferences has certain elements that differentiate it, e.g. an attempt at universalization.
Attempt at universalization, isn't that a euphemism for proselytizing?
Why would [an agent whose preferences do not much intersect with the "moral" preferences of some group of agents] consider such attempts at universalization any different from other attempts of other-optimizing, which is generally a hostile act to be defended against?
No, people attempt to 'proselytize' their non-moral preferences too. If I attempt to share my love of My Little Pony, that doesn't mean I consider it a need for you to also love it. Even if I preferred that you share my love of it, it would still not be a moral obligation on your part.
By universalization I didn't mean any action done after the adoption of the moral preference in question, I meant the criterion that serves to label it as a 'moral injunction' in the first place. If your brain doesn't register it as an instruction defensible by something other than your personal preferences, if it doesn't register it as a universal principle, it doesn't register as a moral instruction in the first place.
What do you mean by "universal"? For any such "universally morally correct preference", what about the potentially infinite number of other agents not sharing it? Please explain.
I've already given an example above: In a choice between saving my own child and saving a thousand other children, let's say I prefer saving my child. "Save my child" is a personal preference, and my brain recognizes it as such. "Save the highest number of children" can be considered an impersonal/universal instruction.
If I wanted to follow my preferences but still nonetheless claim moral perfection, I could attempt to say that the rule is really "Every parent should seek to save their own child" -- and I might even convince myself to the same. But I wouldn't say that the moral principle is really "Everyone should first seek to save the child of Aris Katsaris", even if that's what I really really prefer.
EDIT TO ADD: Also, far from a recipe for war, it seems to me that morality is the opposite: an attempt at reconciling different preferences, so that people become hostile only towards those people that don't follow a much more limited set of instructions, rather than anything in the entire set of their preferences.
Why would you try to do away with your personal preferences, what makes them inferior (edit: speaking as one specific agent) to some blended average case of myriads of other humans? (Is it because of your mirror neurons? ;-)
Being you, you should strive towards that which you "really really prefer". If a particular "moral principle" (whatever you choose to label as such) is suboptimal for you (and you're not making choices for all of mankind, TDT or no), why would you endorse/glorify a suboptimal course of action?
That's called a compromise for mutual benefit, and it shifts as the group of agents changes throughout history. There's no need to elevate the current set of "mostly mutually beneficial actions" above anything but the fleeting accommodations and deals between roving tribes that they are. Best looked at through the prism of game theory.
Being me, I prefer what I "really really prefer". You've not indicated why I "should" strive towards that which I "really really prefer".
Asking whether I "would" do something is different from asking whether I "should" do something. Morality helps drive my volition, but it isn't the sole decider.
If you want to claim that that's the historical/evolutionary reason the moral instinct evolved, I agree.
If you want to argue that that's what morality is, then I disagree. Morality can drive someone to sacrifice their lives for others, so it's obviously NOT always a "compromise for mutual benefit".
If you have a preference for morality, being moral is not doing away with that preference: it is allowing your altruistic preferences to override your selfish ones.
You may be on the receiving end of someone else's self-sacrifice at some point.
Morality is somewhat like chess in this respect - morality:optimal play::satisfying your preferences:winning. To simplify their preferences a bit, chess players want to win, but no individual chess player would claim that all other chess players should play poorly so he can win.
The key issue is that, whilst morality is not tautologously the same as preferences, a morally right action is, tautologously, what you should do.
So it is difficult to see on what grounds Mark can object to the FAI's wishes: if it tells him something is morally right, that is what he should do. And he can't have his own separate morality, because the idea is incoherent.
A distinction to be made: Mark can wish differently than the AI wishes, Mark can't morally object to the AI's wishes (if the AI follows morality).
Exactly because morality is not the same as preferences.
You can call a subset of your preferences moral, that's fine. Say, eating chocolate icecream, or helping a starving child. Let's take a randomly chosen "morally right action" A.
That, given your second paragraph, would have to be a preference which, what, maximizes Mark's utility, regardless of what the rest of his utility function actually looks like?
It seems to be trivial to construct a utility function (given any such action A) such that doing A does not maximize said utility function. Give Mark such a utility function and you've got yourself a reductio ad absurdum.
So, if you define a subset of preferences named "morally right" such that any such action needs to maximize (edit: or even 'not minimize') an arbitrary utility function, then obviously that subset is empty.
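To make the construction above concrete, here is a minimal Python sketch. The names `counter_utility` and `best_actions`, and the three-item action list, are illustrative assumptions and not anything from the thread. It only shows that for any chosen action A, as long as at least one alternative exists, one can write down a utility function that A fails to maximize (and in fact minimizes).

```python
def counter_utility(action):
    """Build a utility function that scores `action` strictly below every alternative.

    Hypothetical helper: any other scoring with the same ordering would do.
    """
    def utility(candidate):
        return 0.0 if candidate == action else 1.0
    return utility


def best_actions(actions, utility):
    """Return every action that attains the maximum utility."""
    top = max(utility(a) for a in actions)
    return [a for a in actions if utility(a) == top]


# Illustrative action set; assumes at least one alternative to A exists.
actions = ["help a starving child", "eat chocolate ice cream", "do nothing"]
A = "help a starving child"

u = counter_utility(A)

# A is never among the maximizers of the utility function built against it.
assert A not in best_actions(actions, u)
```

The same construction works for any A, which is all the reductio needs: no single action can maximize every possible utility function.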
If Mark is capable of acting morally, he would have a preference for moral action which is strong enough to override other preferences. However, that is not really the point. Even if he is too weak-willed to do what the FAI says, he has no grounds to object to the FAI.
I can't see how that amounts to more than the observation that not every agent is capable of acting morally. Ho hum.
I don't see why. An agent should want to do what is morally right, but that doesn't mean an agent would want to. Their utility function might not allow them. But how could they object to being told what is right? The fault, surely, lies in themselves.
They can object because their preferences are defined by their utility function, full stop. That's it. They are not "at fault", or "in error", for not adopting some other preferences that some other agents deem to be "morally correct". They are following their programming, as you follow yours. Different groups of agents share different parts of their preferences, think Venn diagram.
If the oracle tells an agent "this action maximizes your own utility function, you cannot understand how", then yes, the agent should follow the advice.
If the oracle told an agent "do this, it is morally right", the non-confused agent would ask "do you mean it maximizes my own utility function?". If yes, "thanks, I'll do that", if no "go eff yourself!".
You can call an agent "incapable of acting morally" because you don't like what it's doing, it needn't care. It might just as well call you "incapable of acting morally" if your circles of supposedly "morally correct actions" don't intersect.
I can't speak for cousin_it, natch, but for my own part I think it has to do with mutually exclusive preferences vs orthogonal/mutually reinforcing preferences. Using moral language is a way of framing a preference as mutually exclusive with other preferences.
That is... if you want A and I want B, and I believe the larger system allows (Kawoomba gets A AND Dave gets B), I'm more likely to talk about our individual preferences. If I don't think that's possible, I'm more likely to use universal language ("moral," "optimal," "right," etc.), in order to signal that there's a conflict to be resolved. (Well, assuming I'm being honest.)
For example, "You like chocolate, I like vanilla" does not signal a conflict; "Chocolate is wrong, vanilla is right" does.
Why stop at connotation and signalling? If there is a non-empty set of preferences whose satisfaction is inclined to lead to conflict, and a non-empty set of preferences that can be satisfied without conflict, then "morally relevant preference" can denote the members of the first set... which is not identical to the set of all preferences.
For any such preference, you can immediately provide a utility function such that the corresponding agent would be very unhappy about that preference, and would give its life to prevent it.
Or do you mean "a set of preferences the implementation of which would on balance benefit the largest amount of agents the most"? That would change as the set of agents changes, so does the "correct" morality change too, then?
Also, why should I or anyone else particularly care about such preferences (however you define them), especially as the "on average" doesn't benefit me? Is it because, evolutionarily speaking, that's what evolved? What our mirror neurons lead us towards? Wouldn't that just be a case of the naturalistic fallacy?
Sure. So what? Kids don't like teachers and criminals don't like the police... but they can't object to them, because "entity X is stopping me from doing bad things and making me do good things" is no (rational, adult) objection.
If being moral increases your utility, it increases your utility -- what other sense of "benefitting me" is there?
If utility is the satisfaction of preferences, and you can have preferences that don't benefit you (such as doing heroin), increasing your utility doesn't necessarily benefit you.
If you can get utility out of paperclips, why can't you get it out of heroin? You're surely not saying that there is some sort of Objective utility that everyone ought to have in their UFs?
You can get utility out of heroin if you prefer to use it, which is an example of "benefiting me" and utility not being synonymous. I don't think there's any objective utility function for all conceivable agents, but as you get more specific in the kinds of agents you consider (i.e. humans), there are commonalities in their utility functions, due to human nature. Also, there are sometimes inconsistencies between (for lack of better terminology) what people prefer and what they really prefer - that is, people can act and have a preference to act in ways that, if they were to act differently, they would prefer the different act.
(Kids - teachers), (criminals - police), so is "morally correct" defined by the most powerful agents, then?
And if being moral (whatever it may mean) does not?
Adult, rational objections are objections that other agents might feel impelled to do something about, and so are not just based on "I don't like it". "I don't like it" is no objection to "you should do your homework", etc.
Then you would belong to the set of Immoral Agents, AKA Bad People.
"You should do your homework (... because it is in your own long-term best interest, you just can't see that yet)" is in the interest of the kid, cf. an FAI telling you to do an action because it is in your interest. "You should jump out that window (... because it amuses me / because I call that morally good)" is not in your interest, you should not do that. In such cases, "I don't like that" is the most pertinent objection and can stand all on its own.
Boo bad people! What if we encountered aliens with "immoral" preferences?