AnnaSalamon comments on The Urgent Meta-Ethics of Friendly Artificial Intelligence - Less Wrong

45 Post author: lukeprog 01 February 2011 02:15PM

Comments (249)

Comment author: AnnaSalamon 01 February 2011 10:04:23PM 3 points [-]

Folks who are anyhow heading into graduate school, and who have strengths and interests in social science, should perhaps consider focusing on moral psychology research.

But I'm not at all sure of that -- if someone is aiming at existential risk reduction, there are many other useful paths to consider, and a high opportunity cost to choosing one and not others.

Comment author: [deleted] 01 February 2011 10:21:10PM 3 points [-]

That's true -- I'm just trying to get a sense of what lukeprog is aiming at.

Just thinking out loud, for a moment: if AI really is an imminent possibility, AI strong enough that what it chooses to do is a serious issue for humanity's safety, and if we think that we can lessen the probability of disaster by defining and building moral machines, then it's very, very important to get our analysis right before anyone starts programming. (This is just my impression of what I've read from the site, please correct me if I misunderstood.) In which case, more moral psychology research (or research in other fields related to metaethics) is really important, unless you think that there's no further work to be done. Is it the best possible use of any one person's time? I'd say, probably not, except if you are already in an unusual position. There are not many top students or academics in these fields, and even fewer who have heard of existential risk; if you are one, and you want to, this doesn't seem like a terrible plan.

Comment author: lukeprog 01 February 2011 11:04:08PM 9 points [-]

I don't yet have much of an opinion on what the best way to do it is, I'm just saying it needs doing. We need more brains on the problem. Eliezer's meta-ethics is, I think, far from obviously correct. Moving toward normative ethics, CEV is also not obviously the correct solution for Friendly AI, though it is a good research proposal. The fate of the galaxy cannot rest on Eliezer's moral philosophy alone.

We need critically-minded people to say, "I don't think that's right, and here are four arguments why." And then Eliezer can argue back, or change his position. And then the others can argue back, or change their positions. This is standard procedure for solving difficult problems, but as of yet I haven't seen much published dialectic like this in trying to figure out the normative foundations for the Friendly AI project.

Let me give you an explicit example. CEV takes extrapolated human values as the source of an AI's eventually-constructed utility function. Is that the right way to go about things, or should we instead program an AI to figure out all the reasons for action that exist and account for them in its utility function, whether or not they happen to be reasons for action arising from the brains of a particular species of primate on planet Earth? What if there are 5 other intelligent species in the galaxy whose interests will not at all be served when our Friendly AI takes over the galaxy? Is that really the right thing to do? How would we go about answering questions like that?

Comment author: Eliezer_Yudkowsky 02 February 2011 05:48:02PM 8 points [-]

or should we instead program an AI to figure out all the reasons for action that exist and account for them in its utility function

...this sentence makes me think that we really aren't on the same page at all with respect to naturalistic metaethics. What is a reason for action? How would a computer program enumerate them all?

Comment author: lukeprog 02 February 2011 08:47:41PM *  7 points [-]

A 'reason for action' is the standard term in Anglophone philosophy for a source of normativity of any kind. For example, a desire is the source of normativity in a hypothetical imperative. Others have proposed that categorical imperatives exist, and provide reasons for action apart from desires. Some have proposed that divine commands exist, and are sources of normativity apart from desires. Others have proposed that certain objects or states of affairs can ground normativity intrinsically - i.e. that they have intrinsic value apart from being valued by an agent.

A source of normativity (a reason for action) is anything that grounds/justifies an 'ought' or 'should' statement. Why should I look both ways before crossing the street? Presumably, this 'should' is justified by reference to my desires, which could be gravely thwarted if I do not look both ways before crossing the street. If I strongly desired to be run over by cars, the 'should' statement might no longer be justified. Some people might say I should look both ways anyway, because God's command to always look before crossing a street provides me with reason for action to do that even if it doesn't help fulfill my desires. But I don't believe that proposed reason for action exists.

Comment author: ata 02 February 2011 10:39:22PM *  7 points [-]

A 'reason for action' is the standard term in Anglophone philosophy for a source of normativity of any kind. For example, a desire is the source of normativity in a hypothetical imperative. Others have proposed that categorical imperatives exist, and provide reasons for action apart from desires. Some have proposed that divine commands exist, and are sources of normativity apart from desires. Others have proposed that certain objects or states of affairs can ground normativity intrinsically - i.e. that they have intrinsic value apart from being valued by an agent.

Okay, but all of those (to the extent that they're coherent) are observations about human axiology. Beware of committing the mind projection fallacy with respect to compellingness — you find those to be plausible sources of normativity because your brain is that of "a particular species of primate on planet Earth". If your AI were looking for "reasons for action" that would compel all agents, it would find nothing, and if it were looking for all of the "reasons for action" that would compel each possible agent, it would spend an infinite amount of time enumerating stupid pointless motivations. It would eventually notice categorical imperatives, fairness, compassion, etc. but it would also notice drives based on the phase of the moon, based on the extrapolated desires of submarines (according to any number of possible submarine-volition-extrapolating dynamics), based on looking at how people would want to be treated and reversing that, based on the number of living cats in the world modulo 241, based on modeling people as potted plants and considering the direction their leaves are waving...

Comment author: Eliezer_Yudkowsky 03 February 2011 12:45:55AM *  6 points [-]

Okay, see, this is why I have trouble talking to philosophers in their quote standard language unquote.

I'll ask again: How would a computer program enumerate all reasons for action?

Comment author: lukeprog 03 February 2011 03:43:56AM 8 points [-]

Eliezer,

I think the reason you're having trouble with the standard philosophical category of "reasons for action" is because you have the admirable quality of being confused by that which is confused. I think the "reasons for action" category is confused. At least, the only action-guiding norm I can make sense of is desire/preference/motive (let's call it motive). I should eat the ice cream because I have a motive to eat the ice cream. I should exercise more because I have many motives that will be fulfilled if I exercise. And so on. All this stuff about categorical imperatives or divine commands or intrinsic value just confuses things.

How would a computer program enumerate all motives (which, according to me, is co-extensional with "all reasons for action")? It would have to roll up its sleeves and do science. As it expands across the galaxy, perhaps encountering other creatures, it could do some behavioral psychology and neuroscience on these creatures to decode their intentional action systems (as it had done already with us), and thereby enumerate all the motives it encounters in the universe, their strengths, the relations between them, and so on.

But really, I'm not yet proposing a solution. What I've described above doesn't even reflect my own meta-ethics. It's just an example. I'm merely raising questions that need to be considered very carefully.

And of course I'm not the only one to do so. Others have raised concerns about CEV and its underlying meta-ethical assumptions. Will Newsome raised some common worries about CEV and proposed computational axiology instead. Tarleton's 2010 paper compares CEV to an alternative proposed by Wallach & Collin.

The philosophical foundations of the Friendly AI project need more philosophical examination, I think. Perhaps you are very confident about your meta-ethical views and about CEV; I don't know. But I'm not confident about them. And as you say, we've only got one shot at this. We need to make sure we get it right. Right?

Comment author: Eliezer_Yudkowsky 03 February 2011 02:38:30PM *  13 points [-]

As it expands across the galaxy, perhaps encountering other creatures, it could do some behavioral psychology and neuroscience on these creatures to decode their intentional action systems

Now, it's just a wild guess here, but I'm guessing that a lot of philosophers who use the language "reasons for action" would disagree that "knowing the Baby-eaters evolved to eat babies" is a reason to eat babies. Am I wrong?

I'm merely raising questions that need to be considered very carefully.

I tend to be a bit gruff around people who merely raise questions; I tend to view the kind of philosophy I do as the track where you need some answers for a specific reason, figure them out, move on, and dance back for repairs if a new insight makes it necessary; and this being a separate track from people who raise lots of questions and are uncomfortable with the notion of settling on an answer. I don't expect those two tracks to meet much.

Comment author: lukeprog 03 February 2011 03:09:44PM 6 points [-]

I count myself among the philosophers who would say that "knowing the Baby-eaters want to eat babies" is not a reason (for me) to eat babies. Some philosophers don't even think that the Baby-eaters' desires to eat babies are reasons for them to eat babies, not even defeasible reasons.

I tend to be a bit gruff around people who merely raise questions

Interesting. I always assumed that raising a question was the first step toward answering it - especially if you don't want yourself to be the only person who tries to answer it. The point of a post like the one we're commenting on is that hopefully one or more people will say, "Huh, yeah, it's important that we get this issue right," and devote some brain energy to getting it right.

I'm sure the "figure it out and move on" track doesn't meet much with the "I'm uncomfortable settling on an answer" track, but what about the "pose important questions so we can work together to settle on an answer" track? I see myself on that third track, engaging in both the 'pose important questions' and the 'settle on an answer' projects.

Comment author: Eliezer_Yudkowsky 03 February 2011 04:27:19PM *  33 points [-]

Interesting. I always assumed that raising a question was the first step toward answering it

Only if you want an answer. There is no curiosity that does not want an answer. There are four very widespread failure modes around "raising questions" - the failure mode of paper-writers who regard unanswerable questions as a biscuit bag that never runs out of biscuits, the failure mode of the politically savvy who'd rather not offend people by disagreeing too strongly with any of them, the failure mode of the religious who don't want their questions to arrive at the obvious answer, the failure mode of technophobes who mean to spread fear by "raising questions" that are meant more to create anxiety by their raising than by being answered, and all of these easily sum up to an accustomed bad habit of thinking where nothing ever gets answered and true curiosity is dead.

So yes, if there's an interim solution on the table and someone says "Ah, but surely we must ask more questions" instead of "No, you idiot, can't you see that there's a better way" or "But it looks to me like the preponderance of evidence is actually pointing in this here other direction", alarms do go off inside my head. There's a failure mode of answering too prematurely, but when someone talks explicitly about the importance of raising questions - this being language that is mainly explicitly used within the failure-mode groups - alarms go off and I want to see it demonstrated that they can think in terms of definite answers and preponderances of evidence at all besides just raising questions; I want a demonstration that true curiosity, wanting an actual answer, isn't dead inside them, and that they have the mental capacity to do what's needed to that effect - namely, weigh evidence in the scales and arrive at a non-balanced answer, or propose alternative solutions that are supposed to be better.

I'm impressed with your blog, by the way, and generally consider you to be a more adept rationalist than the above paragraphs might imply - but when it comes to this particular matter of metaethics, I'm not quite sure that you strike me as aggressive enough that if you had twenty years to sort out the mess, I would come back twenty years later and find you with a sheet of paper with the correct answer written on it, as opposed to a paper full of questions that clearly need to be very carefully considered.

Comment author: endoself 12 February 2011 03:16:36AM 1 point [-]

I count myself among the philosophers who would say that "knowing the Baby-eaters want to eat babies" is not a reason (for me) to eat babies. Some philosophers don't even think that the Baby-eaters' desires to eat babies are reasons for them to eat babies, not even defeasible reasons.

Knowing the Baby-eaters want to eat babies is a reason for them to eat babies. It is not a reason for us to let them eat babies. My biggest problem with desirism in general is that it provides no reason for us to want to fulfill others' desires. Saying that they want to fulfill their desires is obvious. Whether we help or hinder them is based entirely on our own reasons for action.

Comment deleted 03 February 2011 03:46:22PM [-]
Comment author: lukeprog 24 February 2011 11:38:40PM 3 points [-]

For the record, I currently think CEV is the most promising path towards solving the Friendly AI problem, I'm just not very confident about any solutions yet, and am researching the possibilities as quickly as possible, using my outline for Ethics and Superintelligence as a guide to research. I have no idea what the conclusions in Ethics and Superintelligence will end up being.

Comment author: lukeprog 14 February 2011 06:48:00AM *  4 points [-]

Here's an interesting juxtaposition...

Eliezer-2011 writes:

I tend to be a bit gruff around people who merely raise questions; I tend to view the kind of philosophy I do as the track where you need some answers for a specific reason, figure them out, move on, and dance back for repairs if a new insight makes it necessary; and this being a separate track from people who raise lots of questions and are uncomfortable with the notion of settling on an answer. I don't expect those two tracks to meet much.

Eliezer-2007 quotes Robyn Dawes, saying that the below is "so true it's not even funny":

Norman R. F. Maier noted that when a group faces a problem, the natural tendency of its members is to propose possible solutions as they begin to discuss the problem. Consequently, the group interaction focuses on the merits and problems of the proposed solutions, people become emotionally attached to the ones they have suggested, and superior solutions are not suggested. Maier enacted an edict to enhance group problem solving: "Do not propose solutions until the problem has been discussed as thoroughly as possible without suggesting any."

...

I have often used this edict with groups I have led - particularly when they face a very tough problem, which is when group members are most apt to propose solutions immediately. While I have no objective criterion on which to judge the quality of the problem solving of the groups, Maier's edict appears to foster better solutions to problems.

Is this a change of attitude, or am I just not finding the synthesis?

Eliezer-2011 seems to want to propose solutions very quickly, move on, and come back for repairs if necessary. Eliezer-2007 advises that for difficult problems (one would think that FAI qualifies) we take our time to understand the relevant issues, questions, and problems before proposing solutions.

Comment author: Vladimir_Nesov 14 February 2011 11:44:57AM *  5 points [-]

There's a big difference between "not immediately" and "never". Don't propose a solution immediately, but do at least have a detailed working guess at a solution (which can be used to move to the next problem) in a year. Don't "merely" raise a question; make sure that finding an answer is also part of the agenda.

Comment author: wedrifid 14 February 2011 07:03:53AM *  3 points [-]

I suggest that he still holds both of those positions (at least, I know I do, so I do not see why he wouldn't) but that they apply to slightly different contexts. Eliezer's elaboration in the descendant comments from the first quote seemed to illustrate why fairly well. They also, if I recall, allowed that you do not fit into the 'actually answering is unsophisticated' crowd, which further narrows down just what he means.

Comment author: RichardKennaway 14 February 2011 08:40:53AM *  2 points [-]

It's a matter of the twelfth virtue of rationality, the intention to cut through to the answer, whatever the technique. The purpose of holding off on proposing solutions is to better find solutions, not to stop at asking the question.

Comment author: TheOtherDave 14 February 2011 04:43:40PM 1 point [-]

The impression I get is that EY-2011 believes that he has already taken the necessary time to understand the relevant issues, questions, and problems and that his proposed solution is therefore unlikely to be improved upon by further up-front thinking about the problem, rather than by working on implementing the solution he has in mind and seeing what difficulties come up.

Whether that's a change of attitude, IMHO, depends a lot on whether his initial standard for what counts as an adequate understanding of the relevant issues, questions, and problems was met, or whether it was lowered.

I'm not really sure what that initial standard was in the first place, so I have no idea which is the case. Nor am I sure it matters; presumably what matters more is whether the current standard is adequate.

Comment author: Desrtopa 15 February 2011 12:06:50AM *  0 points [-]

The point of the Dawes quote is to hold off on proposing solutions until you've thoroughly comprehended the issue, so that you get better solutions. It doesn't advocate discussing problems simply for the sake of discussing them. Between both quotes there's a consistent position that the point is to get the right answer, and discussing the question only has a point insofar as it leads to getting that answer. If you're discussing the question without proposing solutions ad infinitum, you're not accomplishing anything.

Comment author: Miller 14 February 2011 08:44:04AM 0 points [-]

Keep in mind that talking with regard to solutions is just so darn useful. Even if you propose an overly specific solution early, it then has a large surface area of features that can be attacked to prove it incompatible with the problem. You can often salvage and mutate what's left of the broken idea. There's not a lot of harm in that; rather, there is a natural give and take whereby dismissing a proposed solution requires identifying which of the problem's requirements it contradicts, and it may very well not have occurred to you to specify that requirement in the first place.

I believe it has been observed that experts almost always talk in terms of candidate solutions, and amateurs attempt to build up from a platform of the problem itself. Experts of course having objectively better performance. The algorithm for provably moral superintelligences might not have a lot of prior solutions to draw from, but you could, for instance, find some inspiration even from the outside view of how some human political systems have maintained generally moral dispositions.

There is a bias to associate your status with ideas you have vocalized in the past since they reflect on the quality of your thinking, but you can't throw the baby out with the bathwater.

The Maier quote comes off as way too strong for me. And what's with this conclusion:

While I have no objective criterion on which to judge the quality of the problem solving of the groups, Maier's edict appears to foster better solutions to problems.

Comment author: NancyLebovitz 14 February 2011 07:37:33AM 0 points [-]

I think there's a synthesis possible. There's a purpose of finding a solid answer, but finding it requires a period of exploration rather than getting extremely specific in the beginning of the search.

Comment author: Perplexed 03 February 2011 03:04:08PM *  2 points [-]

If you don't spend much time on the track where people just raise questions, how do you encounter the new insights that make it necessary to dance back for repairs on your track?

Just asking. :)

Though I do tend to admire your attitude of pragmatism and impatience with those who dither forever.

Comment author: benelliott 03 February 2011 05:02:00PM 2 points [-]

I presume you encounter them later on. Maybe while doing more ground-level thinking about how to actually implement your meta-ethics you realise that it isn't quite coherent.

I'm not sure if this flying-by-the-seat-of-your-pants approach is best, but as has been pointed out before, there are costs associated with taking too long as well as with not being careful enough; there must come a point where the remaining risk is too small, and the time it would take to fix it too long, to justify further delay.

Comment author: utilitymonster 03 February 2011 03:02:23PM *  1 point [-]

I can see that you might question the usefulness of the notion of a "reason for action" as something over and above the notion of "ought", but I don't see a better case for thinking that "reason for action" is confused.

The main worry here seems to have to do with categorical reasons for action. Diagnostic question: are these more troubling/confused than categorical "ought" statements? If so, why?

Perhaps I should note that philosophers talking this way make a distinction between "motivating reasons" and "normative reasons". A normative reason to do A is a good reason to do A, something that would help explain why you ought to do A, or something that counts in favor of doing A. A motivating reason just helps explain why someone did, in fact, do A. One of my motivating reasons for killing my mother might be to prevent her from being happy. By saying this, I do not suggest that this is a normative reason to kill my mother. It could also be that R would be a normative reason for me to A, but R does not motivate me to do A. (ata seems to assume otherwise, since ata is getting caught up with who these considerations would motivate. Whether reasons could work like this is a matter of philosophical controversy. Saying this more for others than you, Luke.)

Back to the main point, I am puzzled largely because the most natural ways of getting categorical oughts can get you categorical reasons. Example: simple total utilitarianism. On this view, R is a reason to do A if R is the fact that doing A would cause someone's well-being to increase. The strength of R is the extent to which that person's well-being increases. One weighs one's reasons by adding up all of their strengths. One then does the thing that one has most reason to do. (It's pretty clear in this case that the notion of a reason plays an inessential role in the theory. We can get by just fine with well-being, ought, causal notions, and addition.)
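
To make the weighing procedure in the paragraph above concrete, here is a minimal Python sketch. It assumes each reason can be written down as a (person, well-being gain) pair and, following the wording above, it only models increases in well-being. The names `Reason`, `total_strength`, `best_action` and the example data are hypothetical illustrations, not anything from the thread.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Reason:
    """The fact that an action would raise one person's well-being."""
    person: str
    wellbeing_gain: float  # the strength of the reason


def total_strength(reasons):
    """Weigh reasons the simple total-utilitarian way: add up their strengths."""
    return sum(r.wellbeing_gain for r in reasons)


def best_action(reasons_by_action):
    """Return the action one has most reason to do."""
    return max(reasons_by_action,
               key=lambda action: total_strength(reasons_by_action[action]))


# Hypothetical example data: two candidate actions and the reasons for each.
options = {
    "donate": [Reason("alice", 3.0), Reason("bob", 2.0)],
    "keep": [Reason("me", 4.0)],
}
```

With this data, `best_action(options)` picks `"donate"`, since 3.0 + 2.0 exceeds 4.0. As the comment notes, the `Reason` type is doing no essential work: the same answer falls out of summing well-being changes directly.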

Utilitarianism, as always, is a simple case. But it seems like many categorical oughts can be thought of as being determined by weighing factors that count in favor of and count against the course of action in question. In these cases, we should be able to do something like what we did for util (though sometimes that method of weighing the reasons will be different/more complicated; in some bad cases, this might make the detour through reasons pointless).

The reasons framework seems a bit more natural in non-consequentialist cases. Imagine I try to maximize aggregate well-being, but I hate lying to do it. I might count the fact that an action would involve lying as a reason not to do it, but not believe that my lying makes the world worse. To get oughts out of a utility function instead, you might model my utility function as the result of adding up aggregate well-being and subtracting a factor that scales with the number of lies I would have to tell if I took the action in question. Again, it's pretty clear that you don't HAVE to think about things this way, but it is far from clear that this is confused/incoherent.
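
One way to write down the utility function described in the paragraph above, as a sketch only: the symbols $W$, $L$ and $\lambda$ are labels introduced here for illustration and do not appear in the thread.

```latex
U(a) = W(a) - \lambda \, L(a), \qquad \lambda > 0
```

Here $W(a)$ stands for the aggregate well-being that results from taking action $a$, $L(a)$ for the number of lies the agent would have to tell in taking it, and $\lambda$ for an assumed weight expressing how strongly the agent counts lying as a reason against acting. The agent then ought to take whichever available action maximizes $U$.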

Perhaps the LW crowd is perplexed because people here take utility functions as primitive, whereas philosophers talking this way tend to take reasons as primitive and derive ought statements (and, on a very lucky day, utility functions) from them. This paper, which tries to help reasons folks and utility function folks understand/communicate with each other, might be helpful for anyone who cares much about this. My impression is that we clearly need utility functions, but don't necessarily need the reason talk. The main advantage to getting up on the reason talk would be trying to understand philosophers who talk that way, if that's important to you. (Much of the recent work in meta-ethics relies heavily on the notion of a normative reason, as I'm sure Luke knows.)

Comment author: lukeprog 03 February 2011 03:25:37PM *  1 point [-]

utilitymonster,

For the record, as a good old Humean I'm currently an internalist about reasons, which leaves me unable (I think) to endorse any form of utilitarianism, where utilitarianism is the view that we ought to maximize X. Why? Because internal reasons don't always, and perhaps rarely, support maximizing X, and I don't think external reasons for maximizing X exist. For example, I don't think X has intrinsic value (in Korsgaard's sense of "intrinsic value").

Thanks for the link to that paper on rational choice theories and decision theories!

Comment author: utilitymonster 03 February 2011 03:37:00PM 0 points [-]

So are categorical reasons any worse off than categorical oughts?

Comment author: Nick_Tarleton 03 February 2011 09:19:01AM 1 point [-]

Tarleton's 2010 paper compares CEV to an alternative proposed by Wallach & Collin.

Nitpick: Wallach & Collin are cited only for the term 'artificial moral agents' (and the paper is by myself and Roko Mijic). The comparison in the paper is mostly just to the idea of specifying object-level moral principles.

Comment author: lukeprog 03 February 2011 02:57:13PM 0 points [-]

Oops. Thanks for the correction.

Comment author: [deleted] 04 February 2011 02:18:35AM 7 points [-]

I wonder, since it's important to stay pragmatic, if it would be good to design a "toy example" for this sort of ethics.

It seems like the hard problem here is to infer reasons for action, from an individual's actions. People do all sorts of things; but how can you tell from those choices what they really value? Can you infer a utility function from people's choices, or are there sets of choices that don't necessarily follow any utility function?
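
On the last question in the paragraph above, one simple case can be checked mechanically. If each observed choice is read as a strict preference (chosen over rejected), then a cycle such as A over B, B over C, C over A cannot be reproduced by any utility function, because that would require u(A) > u(B) > u(C) > u(A). The sketch below tests for such cycles. Treating every choice as a context-free strict preference is an assumption made for illustration, and the function name is hypothetical.

```python
def has_preference_cycle(choices):
    """Return True if the strict preferences implied by the choices are cyclic.

    choices: iterable of (chosen, rejected) pairs.
    """
    # Build a directed graph with an edge from each chosen option
    # to the option it was chosen over.
    graph = {}
    for better, worse in choices:
        graph.setdefault(better, set()).add(worse)
        graph.setdefault(worse, set())

    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    colour = {node: WHITE for node in graph}

    def visit(node):
        colour[node] = GREY
        for nxt in graph[node]:
            # Reaching a node already on the current path closes a cycle.
            if colour[nxt] == GREY or (colour[nxt] == WHITE and visit(nxt)):
                return True
        colour[node] = BLACK
        return False

    return any(colour[node] == WHITE and visit(node) for node in graph)
```

For a finite set of options with no cycle, a utility function consistent with the choices does exist (any ordering that respects the edges will do), so under these assumptions the cycle test separates the two cases the comment asks about.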

The sorts of "toy" examples I'm thinking of here are situations where the agent has a finite number of choices. Let's say you have Pac-Man in a maze. His choices are his movements in four cardinal directions. You watch Pac-Man play many games; you see what he does when he's attacked by a ghost; you see what he does when he can find something tasty to eat; you see when he's willing to risk the danger to get the food.

From this, I imagine you could do some hidden Markov stuff to infer a model of Pac-Man's behavior -- perhaps an if-then tree.

Could you guess from this tree that Pac-Man likes fruit and dislikes dying, and goes away from fruit only when he needs to avoid dying? Yeah, you could (though I don't know how to systematize that more broadly.)

From this, could you do an "extrapolated" model of what Pac-Man would do if he knew when and where the ghosts were coming? Sure -- and that would be, if I've understood correctly, CEV for Pac-Man.

It seems to me that, more subtle philosophy aside, this is what we're trying to do. I haven't read the literature lukeprog has, but it seems to me that Pac-Man's "reasons for actions" are completely described by that if-then tree of his behavior. Why didn't he go left that time? Because there was a ghost there. Why does that matter? Because Pac-Man always goes away from ghosts. (You could say: Pac-Man desires to avoid ghosts.)

It also seems to me, not that I really know this line of work, that one incremental thing that can be done towards CEV (or some other sort of practical metaethics) is this kind of toy model. Yes, ultimately understanding human motivation is a huge psychology and neuroscience problem, but before we can assimilate those quantities of data we may want to make sure we know what to do in the simple cases.
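
As a very rough sketch of the toy model described in this comment, the code below builds a lookup table of "if this situation, then this action" rules from a log of observed play. It uses a plain majority vote per situation, which is a much simpler stand-in for the hidden Markov approach mentioned above, not an implementation of it. The situation labels, the observation log, and the function name are all hypothetical.

```python
from collections import Counter, defaultdict

# Hypothetical observation log: (situation, action) pairs from watching play.
OBSERVATIONS = [
    (("ghost_left", "fruit_left"), "right"),
    (("ghost_left", "fruit_left"), "right"),
    (("no_ghost", "fruit_left"), "left"),
    (("no_ghost", "fruit_left"), "left"),
    (("no_ghost", "fruit_right"), "right"),
    (("ghost_left", "fruit_right"), "right"),
    (("ghost_right", "fruit_right"), "left"),
    (("ghost_right", "fruit_left"), "left"),
]


def fit_rule_table(observations):
    """Map each observed situation to the action most often taken in it."""
    counts = defaultdict(Counter)
    for situation, action in observations:
        counts[situation][action] += 1
    return {situation: actions.most_common(1)[0][0]
            for situation, actions in counts.items()}


rules = fit_rule_table(OBSERVATIONS)
```

In this made-up log the fitted table sends Pac-Man toward the fruit when no ghost is present and away from the ghost otherwise, even when the fruit is on the ghost's side. That matches the reading in the comment: he likes fruit, and leaves it only to avoid dying.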

Comment author: whpearson 04 February 2011 02:46:25AM 4 points [-]

Could you guess from this tree that Pac-Man likes fruit and dislikes dying, and goes away from fruit only when he needs to avoid dying? Yeah, you could (though I don't know how to systematize that more broadly.)

Something like:

Run simulations of agents that can choose randomly out of the same actions as the agent has. Look for regularities in the world state that occur more or less frequently in the sensible agent compared to the random agent. Those things could be said to be what it likes and dislikes respectively.

To determine terminal vs instrumental values look at the decision tree and see which of the states gets chosen when a choice is forced.
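
A minimal sketch of the first step of this proposal (comparing the observed agent with a random baseline), assuming a made-up one-dimensional corridor with a ghost at one end and fruit at the other. The world layout, the stand-in `observed_policy`, and all names are hypothetical. The terminal-versus-instrumental test from the second paragraph is not covered.

```python
import random
from collections import Counter

ACTIONS = (-1, +1)                       # move left or right along a corridor
GHOST, START, FRUIT, STEPS = 0, 2, 4, 6  # hypothetical toy world
FEATURES = ("at_fruit", "at_ghost")


def run_episode(policy, rng):
    """Run one episode; return the set of world-state features that occurred."""
    pos, seen = START, set()
    for _ in range(STEPS):
        pos = max(GHOST, min(FRUIT, pos + policy(pos, rng)))  # clamp to corridor
        if pos == FRUIT:
            seen.add("at_fruit")
        if pos == GHOST:
            seen.add("at_ghost")
    return seen


def observed_policy(pos, rng):
    """Stand-in for the agent being watched: it always heads for the fruit."""
    return +1


def random_policy(pos, rng):
    """Baseline agent that picks uniformly among the same actions."""
    return rng.choice(ACTIONS)


def feature_rates(policy, episodes, seed):
    """Fraction of episodes in which each feature occurred at least once."""
    rng = random.Random(seed)
    counts = Counter()
    for _ in range(episodes):
        counts.update(run_episode(policy, rng))
    return {f: counts[f] / episodes for f in FEATURES}


def likes_and_dislikes(episodes=2000, seed=0):
    """Features that are more (likes) or less (dislikes) common than under random play."""
    agent = feature_rates(observed_policy, episodes, seed)
    baseline = feature_rates(random_policy, episodes, seed)
    likes = [f for f in FEATURES if agent[f] > baseline[f]]
    dislikes = [f for f in FEATURES if agent[f] < baseline[f]]
    return likes, dislikes
```

Here the watched agent always reaches the fruit and never meets the ghost, while the random agent does each only some of the time, so the comparison labels the fruit as liked and the ghost as disliked. A real application would need a principled choice of features and some test for whether a frequency difference is larger than sampling noise.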

Comment author: [deleted] 04 February 2011 02:48:17AM *  0 points [-]

Thanks. Come to think of it that's exactly the right answer.

Comment author: Nisan 04 February 2011 05:57:04PM 0 points [-]

Perhaps the next step would be to add to the model a notion of second-order desire, or analyze a Pac-Man whose apparent terminal values can change when they're exposed to certain experiences or moral arguments.

Comment author: bigjeff5 03 February 2011 12:50:56AM 5 points [-]

If you want to be run over by cars, you should still look both ways.

You might miss otherwise!

Comment author: Benquo 03 February 2011 02:51:03AM 0 points [-]

One way might be enough, in that case.

Comment author: bigjeff5 03 February 2011 05:31:48AM 0 points [-]

That depends entirely on the street, and the direction you choose to look. ;)

Comment author: Sniffnoy 03 February 2011 12:57:29AM 0 points [-]

Depends on how soon you insist it happen.

Comment author: lukeprog 02 February 2011 08:58:20PM 1 point [-]

Sorry... what I said above is not quite right. There are norms that are not reasons for action. For example, epistemological norms might be called 'reasons to believe.' 'Reasons for action' are the norms relevant to, for example, prudential normativity and moral normativity.

Comment author: jimrandomh 02 February 2011 10:21:41PM 3 points [-]

This is either horribly confusing, or horribly confused. I think that what's going on here is that you (or the sources you're getting this from) have taken a bundle of incompatible moral theories, identified a role that each of them has a part in playing, and generalized a term from one of those theories inappropriately.

The same thing can be a reason for action, a reason for inaction, a reason for belief and a reason for disbelief all at once, in different contexts depending on what consequences these things will have. This makes me think that "reason for action" does not carve reality, or morality, at the joints.

Comment author: utilitymonster 03 February 2011 02:31:22PM *  0 points [-]

I'm sort of surprised by how people are taking the notion of "reason for action". Isn't this a familiar process when making a decision?

  1. For all courses of action you're thinking of taking, identify the features (consequences, if that's how you think about things) that count in favor of taking that course of action and those that count against it.

  2. Consider how those considerations weigh against each other. (Do the pros outweigh the cons, by how much, etc.)

  3. Then choose the thing that does best in this weighing process.
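
The three steps above can be sketched in Python as follows. This is only an illustration: it assumes the considerations can be put on a single signed numeric scale (positive counts in favor, negative counts against), which the comment does not claim, and the option names and weights are made up.

```python
# Hypothetical options, each with signed weights for its considerations:
# positive values count in favor, negative values count against.
options = {
    "take_job": {"higher_pay": 3.0, "long_commute": -2.0},
    "stay_put": {"short_commute": 1.0, "lower_pay": -1.5},
}


def net_weight(considerations):
    # Step 2: weigh the pros against the cons.
    return sum(considerations.values())


def choose(candidates):
    # Step 3: pick the option that does best in the weighing.
    return max(candidates, key=lambda name: net_weight(candidates[name]))


best = choose(options)
print(best, net_weight(options[best]))
```

Nothing in the sketch requires a consideration to carry the same weight in every context; a different `options` table is a different context.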

The same thing can be a reason for action, a reason for inaction, a reason for belief and a reason for disbelief all at once, in different contexts depending on what consequences these things will have. This makes me think that "reason for action" does not carve reality, or morality, at the joints.

It is not a presupposition of the people talking this way that if R is a reason to do A in a context C, then R is a reason to do A in all contexts.

The people talking this way also understand that a single R might be both a reason to do A and a reason to believe X at the same time. You could also have R be a reason to believe X and a reason to cause yourself to not believe X. Why do you think these things make the discourse incoherent/non-perspicuous? This seems no more puzzling than the familiar fact that believing a certain thing could be epistemically irrational but prudentially rational to (cause yourself to) believe.

Comment author: ata 01 February 2011 11:39:44PM *  6 points [-]

or should we instead program an AI to figure out all the reasons for action that exist and account for them in its utility function, whether or not they happen to be reasons for action arising from the brains of a particular species of primate on planet Earth?

All the reasons for action that exist? Like, the preferences of all possible minds? I'm not sure that utility function would be computable...

Edit: Actually, if we suppose that all minds are computable, then there's only a countably infinite number of possible minds, and for any mind with a utility function U(x), there is a mind somewhere in that set with the utility function -U(x). So, depending on how you weight the various possible utility functions, it may be that they'd all cancel out.
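
A toy illustration of the cancellation point, in Python. It uses a small finite set of made-up utility functions closed under negation rather than the countably infinite set of computable minds, so it only shows the shape of the argument; the infinite case would need assumptions about how the weighted sum converges.

```python
outcomes = ["a", "b", "c"]

# Two hypothetical utility functions over three outcomes.
base_utilities = [
    {"a": 1.0, "b": 0.0, "c": -2.0},
    {"a": -0.5, "b": 3.0, "c": 1.0},
]

# Close the set under negation: every U gets a partner -U.
minds = base_utilities + [{x: -u[x] for x in outcomes} for u in base_utilities]


def aggregate(utilities, weights):
    # Weighted sum of all utility functions, outcome by outcome.
    return {x: sum(w * u[x] for w, u in zip(weights, utilities)) for x in outcomes}


# Each U and its negation get the same weight: everything cancels.
symmetric = aggregate(minds, [0.3, 0.2, 0.3, 0.2])

# The first U is weighted more heavily than its negation: no cancellation.
skewed = aggregate(minds, [0.4, 0.2, 0.2, 0.2])

print(symmetric)
print(skewed)
```

So whether the preferences cancel depends entirely on whether the weighting treats each U and its negation alike.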

What if there are 5 other intelligent species in the galaxy whose interests will not at all be served when our Friendly AI takes over the galaxy? Is that really the right thing to do? How would we go about answering questions like that?

Notice that you're a human but you care about that. If there weren't something in human axiology that could lead to sufficiently smart and reflective people concluding that nonhuman intelligent life is valuable, you wouldn't have even thought of that — and, indeed, it seems that in general as you look at smarter, more informed, and more thoughtful people, you see less provincialism and more universal views of ethics. And that's exactly the sort of thing that CEV is designed to take into account. Don't you think that there would be (at least) strong support for caring about the interests of other intelligent life, if all humans were far more intelligent, knowledgeable, rational, and consistent, and heard all the arguments for and against it?

And if we were all much smarter and still largely didn't think it was a good idea to care about the interests of other intelligent species... I really don't think that'll happen, but honestly, I'll have to defer to the judgment of our extrapolated selves. They're smarter and wiser than me, and they've heard more of the arguments and evidence than I have. :)

Comment author: Vladimir_Nesov 02 February 2011 12:06:12AM *  6 points [-]

Notice that you're a human but you care about that. If there weren't something in human axiology that could lead to sufficiently smart and reflective people concluding that nonhuman intelligent life is valuable, you wouldn't have even thought of that — and, indeed, it seems that in general as you look at smarter, more informed, and more thoughtful people, you see less provincialism and more universal views of ethics. And that's exactly the sort of thing that CEV is designed to take into account.

The same argument applies to just using one person as the template and saying that their preference already includes caring about all the other people.

The reason CEV might be preferable to starting from your own preference (I now begin to realize) is that the decision to privilege yourself vs. grant other people fair influence is also subject to morality, so to the extent you can be certain about this being more moral, it's what you should do. Fairness, also being merely a heuristic, is subject to further improvement, as can be inclusion of volition of aliens in the original definition.

Of course, you might want to fall back to a "reflective injunction" of not inventing overly elaborate plans, since you haven't had the capability of examining them well enough to rule them superior to more straightforward plans, such as using volition of a single human. But this is still a decision point, and the correct answer is not obvious.

Comment author: Nisan 04 February 2011 07:04:56PM 0 points [-]

The reason CEV might be preferable to starting from your own preference (I now begin to realize) is that the decision to privilege yourself vs. grant other people fair influence is also subject to morality, so to the extent you can be certain about this being more moral, it's what you should do.

This reminds me of the story of the people who encounter a cake, one of whom claims that what's "fair" is that they get all the cake for themself. It would be a mistake for us to come to a compromise with them on the meaning of "fair".

Does the argument for including everyone in CEV also argue for including everyone in a discussion of what fairness is?

Comment author: XiXiDu 02 February 2011 06:24:18PM *  3 points [-]

Don't you think that there would be (at least) strong support for caring about the interests of other intelligent life, if all humans were far more intelligent, knowledgeable, rational, and consistent, and heard all the arguments for and against it?

But making humans more intelligent, more rational would mean altering their volition. An FAI that would proactively make people become more educated would be similar to one that altered the desires of humans directly. If it told them that the holy Qur'an is not the word of God it would dramatically change their desires. But what if people actually don't want to learn that truth? In other words, any superhuman intelligence will have a very strong observer effect and will cause a subsequent feedback loop that will shape the future according to the original seed AI, or the influence of its creators. You can't expect to create a God and still be able to extrapolate the natural desires of human beings. Human desires are not just a fact about their evolutionary history but also a mixture of superstructural parts like environmental and cultural influences. If you have some AI God leading humans into the future then at some point you have altered all those structures and consequently changed human volition. The smallest bias in the original seed AI will be maximized over time by the feedback between the FAI and its human pets.

ETA You could argue that all that matters is the evolutionary template for the human brain. The best way to satisfy it maximally is what we want, what is right. But leaving aside the evolution of culture and the environment seems drastic. Why not go a step further and create a new better mind as well?

I also think it is a mistake to generalize from the people you currently know to be intelligent and reasonable as they might be outliers. Since I am a vegetarian I am used to people telling me that they understand what it means to eat meat but that they don't care. We should not rule out the possibility that the extrapolated volition of humanity is actually something that would appear horrible and selfish to us "freaks".

I really don't think that'll happen, but honestly, I'll have to defer to the judgment of our extrapolated selves. They're smarter and wiser than me, and they've heard more of the arguments and evidence than I have.

That is only reasonable if matters of taste are really subject to rational argumentation and judgement. If it really doesn't matter if we desire pleasure or pain then focusing on smarts might either lead to an infinite regress or nihilism.

Comment author: TheOtherDave 02 February 2011 12:23:39AM 3 points [-]

Judging from his posts and comments here, I conclude that EY is less interested in dialectic than in laying out his arguments so that other people can learn from them and build on them. So I wouldn't expect critically-minded people to necessarily trigger such a dialectic.

That said, perhaps that's an artifact of discussion happening with a self-selected crowd of Internet denizens... that can exhaust anybody. So perhaps a different result would emerge if a different group of critically-minded people, people EY sees as peers, got involved. The Hanson/Yudkowsky debate about FOOMing had more of a dialectic structure, for example.

With respect to your example, the discussion here might be a starting place for that discussion, btw. The discussions here and here and here might also be salient.

Incidentally: the anticipated relationship between what humans want, what various subsets of humans want, and what various supersets including humans want, is one of the first questions I asked when I encountered the CEV notion.

I haven't gotten an explicit answer, but it does seem (based on other posts/discussions) that on EY's view a nonhuman intelligent species valuing something isn't something that should motivate our behavior at all, one way or another. We might prefer to satisfy that species' preferences, or we might not, but either way what should be motivating our behavior on EY's view is our preferences, not theirs. What matters on this view is what matters to humans; what doesn't matter to humans doesn't matter.

I'm not sure if I buy that, but satisfying "all the reasons for action that exist" does seem to be a step in the wrong direction.

Comment author: lukeprog 02 February 2011 01:17:52AM 0 points [-]

TheOtherDave,

Thanks for the links! I don't know that "satisfying all the reasons for action that exist" is the solution, but I listed it as an example alternative to Eliezer's theory. Do you have a preferred solution?

Comment author: TheOtherDave 02 February 2011 02:42:56AM 1 point [-]

Not really.

Rolling back to fundamentals: reducing questions about right actions to questions about likely and preferred results seems reasonable. So does treating the likely results of an action as an empirical question. So does approaching an individual's interests empirically, and as distinct from their beliefs about their interests, assuming they have any. The latter also allows for taking into account the interests of non-sapient and non-sentient individuals, which seems like a worthwhile goal.

Extrapolating a group's collective interests from the individual interests of its members is still unpleasantly mysterious to me, except in the fortuitous special case where individual interests happen to align neatly. Treating this as an optimization problem with multiple weighted goals is the best approach I know of, but I'm not happy with it; it has lots of problems I don't know how to resolve.
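
For concreteness, here is a minimal Python sketch of the "multiple weighted goals" approach. The individuals, options, interest scores, and weights are all hypothetical, and a plain weighted sum is only one of many possible aggregation rules.

```python
def group_choice(interests, weights, options):
    # Score each option by the weighted sum of every individual's interest in it.
    def score(option):
        return sum(weights[who] * interests[who][option] for who in interests)

    return max(options, key=score)


options = ["park", "parking_lot"]

# Hypothetical interest scores for two individuals who are not peers.
interests = {
    "adult": {"park": 0.2, "parking_lot": 1.0},
    "child": {"park": 1.0, "parking_lot": 0.0},
}

equal = group_choice(interests, {"adult": 1.0, "child": 1.0}, options)
adult_heavy = group_choice(interests, {"adult": 3.0, "child": 1.0}, options)
print(equal, adult_heavy)
```

The same interests give different group answers under different weights, which is one of the problems with the approach: the method does not say where the weights should come from.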

Much to my chagrin, some method for doing this seems necessary if we are to account for individual interests in groups whose members aren't peers (e.g., children, infants, fetuses, animals, sufferers of various impairments, minority groups, etc., etc., etc.), which seems good to address.

It's also at least useful for addressing groups of peers whose interests don't neatly align... though I'm more sanguine about marketplace competition as an alternative way of addressing that.

Something like this may also turn out to be critical for fully accounting for even an individual human's interests, if it turns out that the interests of the various sub-agents of a typical human don't align neatly, which seems plausible.

Accounting for the probable interests of probable entities (e.g., aliens) I'm even more uncertain about. I don't discount them a priori, but without a clearer understanding of what such an accounting would actually look like I really don't know what to say about them. I guess if we have grounds for reliably estimating the probability of a particular interest being had by a particular entity, then it's just a subset of the general weighting problem, but... I dunno.

I reject accounting for the posited interests of counterfactual entities, although I can see where the line between that and probabilistic entities as above is hard to specify.

Does that answer your question?

Comment author: JGWeissman 01 February 2011 11:26:31PM 2 points [-]

To respond to your example (while agreeing that it is good to have more intelligent people evaluating things like CEV and the meta-ethics that motivates it):

I think the CEV approach is sufficiently meta that if we would conclude on meeting and learning about the aliens, and considering their moral significance, that the right thing to do involves giving weight to their preferences, then an FAI constructed from our current CEV would give weight to their preferences once it discovers them.

Comment author: Vladimir_Nesov 02 February 2011 01:06:10AM 2 points [-]

then an FAI constructed from our current CEV would give weight to their preferences once it discovers them.

If they are to be given weight at all, then this could as well be done in advance, so prior to observing aliens we give weight to preferences of all possible aliens, conditionally on future observations of which ones turn out to actually exist.

Comment author: JGWeissman 02 February 2011 01:46:21AM 0 points [-]

From a perspective of pure math, I think that is the same thing, but in considering practical computability, it does not seem like a good use of computing power to figure out what weight to give the preferences of a particular alien civilization, out of a vast space of possible civilizations, until observing that the particular civilization exists.
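
A small Python sketch of the distinction I have in mind. The feature list and the `weight` rule are invented purely for illustration; the point is only that a table computed in advance and a weight computed on observation agree, while their costs differ.

```python
from itertools import product

# Hypothetical yes/no features describing a possible civilization.
FEATURES = ["sentient", "cooperative", "expansionist"]


def weight(civilization):
    # Made-up rule for how much weight a civilization's preferences receive.
    sentient, cooperative, expansionist = civilization
    return (
        (1.0 if sentient else 0.0)
        + (0.5 if cooperative else 0.0)
        - (0.25 if expansionist else 0.0)
    )


# "In advance": a weight for every possible civilization, conditional on it existing.
all_civilizations = list(product([False, True], repeat=len(FEATURES)))
table = {civ: weight(civ) for civ in all_civilizations}

# "On discovery": a weight only for the civilization actually observed.
observed = (True, True, False)
lazy_weight = weight(observed)

print(len(table), table[observed], lazy_weight)
```

The table has one entry per combination of features, so it doubles with each feature added, while the on-discovery version does one evaluation per civilization actually observed.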

Comment author: Vladimir_Nesov 02 February 2011 01:54:30AM 1 point [-]

Such considerations could have some regularities even across all the diverse possibilities, which are easy to notice with a Saturn-sized mind.

Comment author: jimrandomh 02 February 2011 07:07:06PM 0 points [-]

One such regularity comes to mind: most aliens would rather be discovered by a superintelligence that was friendly to them than not be discovered, so spreading and searching would optimize their preferences.