kip1981 comments on A Less Wrong singularity article? - Less Wrong
I would be surprised if Eliezer would cite Joshua Greene's moral anti-realist view with approval.
Correct. I'm a moral cognitivist; "should" statements have truth-conditions. It's just that very few possible minds care whether should-statements are true or not; most possible minds care about whether alien statements (like "leads-to-maximum-paperclips") are true or not. They would agree with us on what should be done; they just wouldn't care, because they aren't built to do what they should. They would similarly agree with us that their morals are pointless, but would be concerned with whether their morals are justified-by-paperclip-production, not whether their morals are pointless. And under ordinary circumstances, of course, they would never formulate - let alone bother to compute - the function we name "should" (or the closely related functions "justifiable" or "arbitrary").
Am I right in interpreting the bulk of the thread following this comment (excepting perhaps the FAI derail) as a dispute on the definition of "should"?
Yes, we're disputing definitions, which tends to become pointless. However, we can't seem to get away from using the word "should", so we might as well get it pinned down to something we can agree upon.
I think you are right.
The dispute also serves as a signal of what some parts of the disputants' personal moralities probably include. This fits the practical purpose that the concept 'should' serves in general. Given what Eliezer has chosen as his mission, this kind of signalling is a matter of life and death in the same way that it would have been in our environment of evolutionary adaptedness. That is, if people had sufficient objection to Eliezer's values, they would kill him rather than let him complete an AI.
Here's a very short unraveling of "should":
"Should" means "is such so as to fulfill the desires in question." For example, "If you want to avoid being identified while robbing a convenience store, you should wear a mask."
In the context of morality, the desires in question are all desires that exist. "You shouldn't rob convenience stores" means, roughly, "People in general have many and strong reasons to ensure that individuals don't want to rob convenience stores."
For the long version, see http://atheistethicist.blogspot.com/2005/12/meaning-of-ought.html .
I'm a moral cognitivist too but I'm becoming quite puzzled as to what truth-conditions you think "should" statements have. Maybe it would help if you said which of these you think are true statements.
1) Eliezer Yudkowsky should not kill babies.
2) Babyeating aliens should not kill babies.
3) Sharks should not kill babies.
4) Volcanoes should not kill babies.
5) Should not kill babies. (sic)
The meaning of "should not" in 2 through 5 is intended to be the same as the common usage of the words in 1.
Technically, you would need to include a caveat in all of those like, "unless to do so would advance paperclip production" but I assume that's what you meant.
I don't think there is one common usage of the word "should".
(ETA: I asked the nearest three people if "volcanoes shouldn't kill people" is true, false, or neither, assuming that "people shouldn't kill people" is true or false so moral non-realism wasn't an issue. One said true, two said neither.)
They all sound true to me.
Interesting, what about either of the following:
A) If X should do A, then it is rational for X to do A.
B) If it is rational for X to do A, then X should do A.
From what I understand of Eliezer's position:
False
False.
(If this isn't the case then Eliezer's 'should' is even more annoying than how I now understand it.)
Yep, both false.
So, just to dwell on this for a moment, there exist X and A such that (1) it is rational for X to do A and (2) X should not do A.
How do you reconcile this with "rationalists should win"? (I think I know what your response will be, but I want to make sure.)
Here's my guess at one type of situation Eliezer might be thinking of when calling proposition B false: It is rational (let us stipulate) for a paperclip maximizer to turn all the matter in the solar system into computronium in order to compute ways to maximize paperclips, but "should" does not apply to paperclip maximizers.
Correct.
EDIT: If I were picking nits, I would say, "'Should' does apply to paperclip maximizers - it is rational for X to make paperclips but it should not do so - however, paperclip maximizers don't care, and so it is pointless to talk about what they should do." But the overall intent of the statement is correct - I disagree with its intent neither in anticipation nor in morals - and in such cases I usually just say "Correct". In this case I suppose that wasn't the best policy, but it is my usual policy.
What I think you mean is:
There is a function Should(human) (or Should(Eliezer)) which computes the human consensus (or Eliezer's opinion) on what the morally correct course of action is.
And some alien beings would have their own Should function, similar in form if not in content to our own. So a paperclip maximiser doesn't get a Should, as it simply follows a "figure out how to maximise paperclips - then do it" format. However, a complex alien society that has many values and feels it must kill everyone else for the artistic cohesion of the universe, but often fails to act on this feeling because of akrasia, will get a Should(Krikkit) function.
However, until such time as we meet this alien civilization, we should just use Should as a shorthand for Should(human).
Is my understanding correct?
There could be a word defined that way, but for purposes of staying unconfused about morality, I prefer to use "would-want" so that "should" is reserved specifically for things that, you know, actually ought to be done.
"would-want" - under what circumstances? Superficially, it seems like pointless jargon. Is there a description somewhere of what it is supposed to mean?
Hmm. I guess not.
Fair enough. But are you saying that there is an objective standard of ought, or do you just mean a shared subjective standard? Or maybe a single subjective standard?
The word "ought" means a particular thing, refers to a particular function, and once you realize that, ought-statements have truth-values. There's just nothing which says that other minds necessarily care about them. It is also possible that different humans care about different things, but there's enough overlap that it makes sense (I believe, Greene does not) to use words like "ought" in daily communication.
What would the universe look like if there were such a thing as an "objective standard"? If you can't tell me what the universe looks like in this case, then the statement "there is an objective morality" is not false - it's not that there's a closet which is supposed to contain an objective morality, and we looked inside it, and the closet is empty - but rather the statement fails to have a truth-condition. Sort of like opening a suitcase that actually does contain a million dollars, and you say "But I want an objective million dollars", and you can't say what the universe would look like if the million dollars were objective or not.
I should write a post at some point about how we should learn to be content with happiness instead of "true happiness", truth instead of "ultimate truth", purpose instead of "transcendental purpose", and morality instead of "objective morality". It's not that we can't obtain these other things and so must be satisfied with what we have, but rather that tacking on an impressive adjective results in an impressive phrase that fails to mean anything. It is not that there is no ultimate truth, but rather, that there is no closet which might contain or fail to contain "ultimate truth", it's just the word "truth" with the sonorous-sounding adjective "ultimate" tacked on in front. Truth is all there is or coherently could be.
When you put those together like that it occurs to me that they all share the feature of being provably final. I.e., when you have true happiness you can stop working on happiness; when you have ultimate truth you can stop looking for truth; when you know an objective morality you can stop thinking about morality. So humans are always striving to end striving.
(Of course whether they'd be happy if they actually ended striving is a different question, and one you've written eloquently about in the "fun theory" series.)
That's actually an excellent way of thinking about it - perhaps the terms are not as meaningless as I thought.
Just a minor thought: there is a great deal of overlap on human "ought"s, but not so much on formal philosphical "ought"s. Dealing with philosophers often, I prefer to see ought as a function, so I can talk of "ought(Kantian)" and "ought(utilitarian)".
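The "ought as a function" framing above can be sketched as a higher-order function in which the moral system is an explicit parameter. Everything below is a toy illustration: the two "systems" and their permitted actions are invented placeholders, not serious reconstructions of Kantian or utilitarian ethics.

```python
# A moral system is modeled as a predicate over (agent, action) pairs.
# These toy systems are illustrative placeholders, not serious ethics.
from typing import Callable

MoralSystem = Callable[[str, str], bool]

def kantian(agent: str, action: str) -> bool:
    # Toy rule: an action is permitted unless it fails to universalize.
    return action not in {"lie", "break_promise"}

def utilitarian(agent: str, action: str) -> bool:
    # Toy rule: an action is permitted if its (stipulated) net utility is positive.
    toy_utilities = {"lie": -1, "break_promise": -2, "donate": +3}
    return toy_utilities.get(action, 0) > 0

def ought(system: MoralSystem, agent: str, action: str) -> bool:
    """ought(Kantian), ought(utilitarian), ...: the moral system is a parameter."""
    return system(agent, action)

print(ought(kantian, "X", "donate"))   # True under the toy Kantian rule
print(ought(utilitarian, "X", "lie"))  # False under the toy utilitarian rule
```

The point of the sketch is only the type signature: "ought" takes a moral system as an argument, so "ought(Kantian)" and "ought(utilitarian)" can disagree without either being ill-formed.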
Maybe Greene has more encounters with formal philosophers than you, and thus cannot see much overlap?
Re: "The word "ought" means a particular thing, refers to a particular function, and once you realize that, ought-statements have truth-values."
A revealing and amazing comment - from my point of view. I had no idea you believed that.
What about alien "ought"s? Presumably you can hack the idea that aliens might see morality rather differently from us. So, presumably you are talking about ought<human> - glossing over our differences from one another.
There's a human morality in about the same sense as there's a human height.
There are no alien oughts, though there are alien desires and alien would-wants. They don't see morality differently from us; the criterion by which they choose is simply not that which we name morality.
This is a wonderful epigram, though it might be too optimistic. The far more pessimistic version would be "There's a human morality in about the same sense as there's a human language." (This is what Greene seems to believe and it's a dispute of fact.)
Eliezer, I think your proposed semantics of "ought" is confusing, and doesn't match up very well with ordinary usage. May I suggest the following alternative?
ought<X> refers to X's would-wants if X is an individual. If X is a group, then ought<X> is the overlap between the oughts of its members.
In ordinary conversation, when people use "ought" without an explicit subscript or possessive, the implicit X is the speaker plus the intended audience (not humanity as a whole).
ETA: The reason we use "ought" is to convince the audience to do or not do something, right? Why would we want to refer to ought<humanity>, when ought<speaker+audience> would work just fine for that purpose, and ought<speaker+audience> covers a lot more ground than ought<humanity>?
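The group case of this proposal can be sketched as set intersection: ought<X> for a group is the overlap of its members' oughts. The agents and their endorsed-action sets below are hypothetical illustrations.

```python
# Each agent's "ought" is modeled as the set of actions it endorses.
# Agent names and action sets are hypothetical illustrations.
def group_ought(member_oughts):
    """ought<X> for a group X: the overlap of its members' oughts."""
    sets = iter(member_oughts.values())
    result = set(next(sets))
    for s in sets:
        result &= set(s)  # keep only actions every member endorses
    return result

oughts = {
    "speaker":  {"keep_promises", "help_strangers", "tithe"},
    "audience": {"keep_promises", "help_strangers", "recycle"},
}

# ought<speaker+audience> covers what both endorse:
print(sorted(group_ought(oughts)))  # ['help_strangers', 'keep_promises']
```

This also makes Wei Dai's closing point concrete: a smaller group generally yields a larger intersection, so ought<speaker+audience> can "cover more ground" than ought<humanity>.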
That seems to hit close to the mark. Human language contains all sorts of features that are more or less universal to humans due to their hardware while also being significantly determined by cultural influences. It also shares the feature that certain types of language (and 'ought' systems) are more useful in different cultures or subcultures.
I'm not sure I follow this. Neither seem particularly pessimistic to me and I'm not sure how one could be worse than the other.
"There are no alien oughts" and "They don't see morality differently from us" - these seem like more bizarre-sounding views on the subject of morality - and it seems especially curious to hear them from the author of the "Baby-Eating Aliens" story.
Look, it's not very complicated: When you see Eliezer write "morality" or "oughts", read it as "human morality" and "human oughts".
Jumping recklessly in at the middle: even granting your premises regarding the scope of 'ought', it is not wholly clear that an alien "ought" is impossible. As timtyler pointed out, the Babyeaters in "Three Worlds Collide" probably had a would-want structure within the "ought" cluster in thingspace, and systems of behaviors have been observed in some nonhuman animals which resemble human morality.
I'm not saying it's likely, though, so this probably constitutes nitpicking.
Then we need a better way of distinguishing between what we're doing and what we would be doing if we were better at it.
You've written about the difference between rationality and believing that one's bad arguments are rational.
For the person who is in the latter state, something that might be called "true rationality" is unimaginable, but it exists.
Thanks, this has made your position clear. And - apart from tiny differences in vocabulary - it is exactly the same as mine.
So what about the Ultimate Showdown of Ultimate Destiny?
...sorry, couldn't resist.
But there is a truth-condition for whether a showdown is "ultimate" or not.
This sentence is much clearer than the sort of thing you usually say.
A single subjective standard. But he uses different terminology, with that difference having implications about how morality should (full Eliezer meaning) be thought about.
It can be superficially considered a shared subjective standard, both in that many other humans have morality that overlaps with his in some ways, and in that his morality includes (if I recall correctly) the preferences of others somewhere within it. I find it curious that the final result leaves language and positions reminiscent of those begotten by belief in an objective standard of ought, but without requiring totally insane beliefs like, say, theism, or predicting that a uFAI will learn 'compassion' and become an FAI just because 'should' is embedded in the universe as an inevitable force or something.
Still, if I am to translate the Eliezer word into the language of Stuart_Armstrong, it matches "a single subjective standard, but I'm really serious about it". (Part of me wonders if Eliezer's position on this particular branch of semantics would be any different if there were fewer non-sequitur rejections of Bayesian statistics with that pesky 'subjective' word in it.)
I think you're just using different words to say the same thing that Greene is saying, you in particular use "should" and "morally right" in a nonstandard way - but I don't really care about the particular way you formulate the correct position, just as I wouldn't care if you used the variable "x" where Greene used "y" in an integral.
You do agree that you and Greene are actually saying the same thing, yes?
I don't think we anticipate different experimental results. We do, however, seem to think that people should do different things.
Whose version of "should" are you using in that sentence? If you're using the EY version of "should" then it is not possible for you and Greene to think people should do different things unless you and Greene anticipate different experimental results...
... since the EY version of "should" is (correct me if I am wrong) a long list of specific constraints and valuators that together define one specific utility function, U_human-morality-according-to-EY. You can't disagree with Greene over what the concrete result of maximizing U_human-morality-according-to-EY is unless one of you is factually wrong.
Oh well in that case, we disagree about what reply we would hear if we asked a friendly AI how to talk and think about morality in order to maximize human welfare as construed in most traditional utilitarian senses.
This is phrased as a different observable, but it represents more of a disagreement about impossible possible worlds than possible worlds - we disagree about statements with truth conditions of the type of mathematical truth, i.e. which conclusions are implied by which premises. Though we may also have some degree of empirical disagreement about what sort of talk and thought leads to which personal-hedonic results and which interpersonal-political results.
(It's a good and clever question, though!)
Surely you should both have large error bars around the answer to that question in the form of fairly wide probability distributions over the set of possible answers. If you're both well-calibrated rationalists those distributions should overlap a lot. Perhaps you should go talk to Greene? I vote for a bloggingheads.
Wouldn't that be 'advocate', 'propose' or 'suggest'?
I vote no, it wouldn't be
Asked Greene, he was busy.
Yes, it's possible that Greene is correct about what humanity ought to do at this point, but I think I know a bit more about his arguments than he does about mine...
That is plausible.
I find that quite surprising to hear. Wouldn't disagreements about meaning generally cash out in some sort of difference in experimental results?
On your analysis of should, paperclip maximizers should not maximize paperclips. Do you think this is a more useful characterization of 'should' than one in which we should be moral and rational, etc., and paperclip maximizers should maximize paperclips?
A paperclip maximizer will maximize paperclips. I am unable to distinguish any sense in which this is a good thing. Why should I use the word "should" to describe this, when "will" serves exactly as well?
Please amplify on that. I can sorta guess what you mean, but can't be sure.
We make a distinction between the concepts of what people will do and what they should do. Is there an analogous pair of concepts applicable to paperclip maximizers? Why or why not? If not, what is the difference between people and paperclip maximizers that justifies there being this difference for people but not for paperclip maximizers?
Will paperclip maximizers, when talking about themselves, distinguish between what they will do and what would maximize paperclips? (While wishing they were the better paperclip maximizers they'd like to be.) What they will actually do is distinct from what would maximize paperclips: it's predictable that actual performance is always less than optimal, given that the problem is open-ended enough.
Let there be a mildly insane (after the fashion of a human) paperclipper named Clippy.
Clippy does A. Clippy would do B if it were a sane but bounded rationalist, C if an unbounded rationalist, and D if it had perfect veridical knowledge. That is, D is the actual paperclip-maximizing action, C is the theoretical optimum given all of Clippy's knowledge, and B is as close to C as a bounded rationalist can realistically get.
Is B, C, or D what Clippy Should(Clippy) do? This is a reason to prefer "would-want". Though I suppose a similar question applies to humans. Still, what Clippy should do is give up paperclips and become an FAI. There's no chance of arguing Clippy into that, because Clippy doesn't respond to what we consider a moral argument. So what's the point of talking about what Clippy should do, since Clippy's not going to do it? (Nor is it going to do B, C, or D, just A.)
PS: I'm also happy to talk about what it is rational for Clippy to do, referring to B.
Your usage of 'should' is more of a redefinition than a clarification. B, C and D work as clarifications for the usual sense of the word: "should" has a feel 'meta' enough to transfer over to more kinds of agents.
If you can equally well talk of Should(Clippy) and Should(Humanity), then for the purposes of FAI it's Should that needs to be understood, not one particular sense should=Should(Humanity). If one can't explicitly write out Should(Humanity), one should probably write out Should(-), which is featureless enough for there to be no problem with the load of detailed human values, and in some sense pass Humanity as a parameter to its implementation. Do you see this framing as adequate or do you know of some problem with it?
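This parameterized-Should framing can be sketched as a generic factory: one program that, given a "creature" (its extracted preferences), returns an optimizer specialized to it. All names, utility numbers, and the preference-extraction shortcut below are hypothetical toys, not a claim about how an actual FAI design would work.

```python
# Sketch of Should(-) as a parameterized constructor: a generic optimizer
# specialized by the preference structure passed in. Everything here is a toy.
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class Agent:
    name: str
    utility: Callable[[str], float]  # stands in for the agent's extracted preferences

def make_fai(creature: Agent) -> Callable[[Dict[str, float]], str]:
    """Should(-): returns a policy customized to the creature passed as parameter."""
    def policy(options: Dict[str, float]) -> str:
        # Choose the option the creature's preferences rank highest,
        # weighted by how achievable each option is.
        return max(options, key=lambda o: creature.utility(o) * options[o])
    return policy

humanity = Agent("Humanity", lambda o: {"flourishing": 2.0, "paperclips": 0.0}.get(o, 0.0))
clippy   = Agent("Clippy",   lambda o: {"flourishing": 0.0, "paperclips": 1.0}.get(o, 0.0))

options = {"flourishing": 1.0, "paperclips": 1.0}
print(make_fai(humanity)(options))  # flourishing
print(make_fai(clippy)(options))    # paperclips
```

The sketch only illustrates the framing in question: the optimizer's structure is shared, and Humanity (or Clippy) enters as a parameter. Eliezer's reply below is precisely that this factorization may fail, because the structure needed to learn human "should" may itself be human-specific.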
This is a good framing for explaining the problem - you would not, in fact, try to build the same FAI for Clippies and humans, and then pass it humans as a parameter.
E.g. structural complications of human "should" that only the human FAI would have to be structurally capable of learning. (No, you cannot have complete structural freedom because then you cannot do induction.)
I expect you would build the same FAI for paperclipping (although we don't have any Clippies to pass it as parameter), so I'd appreciate it if you did explain the problem given you believe there is one, since it's a direction that I'm currently working.
Humans are stuff, just like any other feature of the world, that an FAI would optimize, and at the stuff-level it makes no difference that people prefer to be "free to optimize". You are "free to optimize" in a deterministic universe; it's the way this stuff is (being) arranged that makes the difference, and it's the content of human preference that says the arrangement shouldn't have certain features, like undeserved million-dollar bags falling from the sky (where "undeserved" is another function of stuff). An important subtlety of preference is that it makes different features of perhaps mutually exclusive possible scenarios depend on each other. So the fact that one should care about what could be, how it's related to what could be otherwise, and even to how it's chosen what to actually realize, concerns the scope of what preference describes, not a specific instance of preference. That is, in a manner of speaking, it's saying that you need an Int32, not a Bool, to hold this variable, but that Int32 seems big enough.
Furthermore, considering the kind of dependence you described in that post you linked seems fundamental from a certain logical standpoint, for any system (not even "AI"). If you build the ontology for FAI on its epistemology, that is you don't consider it as already knowing anything but only as having its program that could interact with anything, then the possible futures and its own decision-making are already there (and it's all there is, from its point of view). All it can do, on this conceptual level, is to craft proofs (plans, designs of actions) that have the property of having certain internal dependencies in them, with the AI itself being the "current snapshot" of what it's planning. That's enough to handle the "free to optimize" requirement, given the right program.
Hmm, I'm essentially arguing that a universal-enough FAI is "computable": that there is a program that computes an FAI for any given "creature", within a certain class of "creatures". I guess that problem is void as stated: for a small enough class it's in principle solvable, and for a big enough class it'll hit problems, if not conceptual then practical.
So the real question is about the characteristics of such class of systems for which it's easier to build an abstract FAI, that is a tool that takes a specimen of this class as a parameter and becomes a custom-made FAI for that specimen. This class needs to at least include humanity, and given the size of humanity's values, it needs to also include a lot of other stuff, for itself to be small enough to program explicitly. I currently expect a class of parameters of a manageable abstract FAI implementation to include even rocks and trees, since I don't see how to rigorously define and use in FAI theory the difference between these systems and us.
This also takes care of human values/humanity's values divide: these are just different systems to parameterize the FAI with, so there is no need for a theory of "value overlaps" distinct from a theory of "systems values". Another question is that "humanity" will probably be a bit harder to specify as parameter than some specific human or group of people.
Re: I suppose a similar question applies to humans.
Indeed - this objection is the same for any agent, including humans.
It doesn't seem to follow that the "should" term is inappropriate. If this is a reason for objecting to the "should" term, then the same argument concludes that it should not be used in a human context either.
'Will' does not serve exactly as well when considering agents with limited optimisation power (that is, any actual agent). Consider, for example, a Paperclip Maximiser that happens to be less intelligent than I am. I may be able to predict that Clippy will colonize Mars before he invades Earth, while also being quite sure that more paperclips would be formed if Clippy invaded Earth first. In this case I will likely want a word that means "would better serve to maximise the agent's expected utility even if the agent does not end up doing it".
One option is to take 'should' and make it the generic 'should<Agent>'. I'm not saying you should use 'should' (implicitly, 'should<Clippy>') to describe the action that Clippy would take if he had sufficient optimisation power. But I am saying that 'will' does not serve exactly as well.
I use "would-want" to indicate extrapolation. I.e., A wants X but would-want Y. This helps to indicate the implicit sensitivity to the exact extrapolation method, and that A does not actually represent a desire for Y at the current moment, etc. Similarly, A does X but would-do Y, A chooses X but would-choose Y, etc.
"Should" is a standard word for indicating moral obligation - it seems only sensible to use it in the context of other moral systems.
It's a good thing - from their point of view. They probably think that there should be more paperclips. The term "should" makes sense in the context of a set of preferences.
No, it's a paperclip-maximizing thing. From their point of view, and ours. No disagreement. They just care about what's paperclip-maximizing, not what's good.
This is not a real point of disagreement.
IMO, in this context, "good" just means "favoured by this moral system". An action that "should" be performed is just one that would be morally obligatory - according to the specified moral system. Both terms are relative to a set of moral standards.
I was talking as though a paperclip maximiser would have morals that reflected their values. You were apparently assuming the opposite. Which perspective is better would depend on which particular paperclip maximiser was being examined.
Personally, I think there are often good reasons for morals and values being in tune with one another.