LESSWRONG

This is a good framing for explaining the problem - you would not, in fact, try to build the same FAI for Clippies and humans, and then pass it humans as a parameter.

I expect you would build the same FAI for paperclipping (although we don't have any Clippies to pass it as a parameter), so I'd appreciate it if you explained the problem, given that you believe there is one, since it's a direction that I'm currently working on.

Humans are stuff, just like any other feature of the world, that FAI would optimize, and on the stuff level it makes no difference that people prefer to be "free to optimize". You are "free to optimize" in a deterministic universe; it's the way this stuff is (being) arranged that makes the difference, and it's the content of human preference that says it shouldn't have some features, like undeserved million-dollar bags falling from the sky, where "undeserved" is another function of stuff. An important subtlety of preference is that it makes different features of perhaps mutually exclusive possible scenarios depend on each other. So the fact that one should care about what could be, how it relates to what could be otherwise, and even how it's chosen what to actually realize, is about the scope of what preference describes, not about a specific instance of preference. That is, in a manner of speaking, it's saying that you need an Int32, not a Bool, to hold this variable, but that Int32 seems big enough.

Furthermore, considering the kind of dependence you described in that post you linked seems fundamental from a certain logical standpoint, for any system (not even "AI"). If you build the ontology for FAI on its epistemology, that is you don't consider it as already knowing anything but only as having its program that could interact with anything, then the possible futures and its own decision-making are already there (and it's all there is, from its point of view). All it can do, on this conceptual level, is to craft proofs (plans, designs of actions) that have the property of having certain internal dependencies in them, with the AI itself being the "current snapshot" of what it's planning. That's enough to handle the "free to optimize" requirement, given the right program.

Hmm, I'm essentially arguing that universal-enough FAI is "computable", that there is a program that computes a FAI for any given "creature", within a certain class of "creatures". I guess this question is void as stated, since obviously for a small enough class this problem is in principle solvable, and for a big enough class it'll hit problems, if not conceptual then practical.

So the real question is about the characteristics of such a class of systems for which it's easier to build an abstract FAI, that is, a tool that takes a specimen of this class as a parameter and becomes a custom-made FAI for that specimen. This class needs to at least include humanity, and given the size of humanity's values, it needs to also include a lot of other stuff, for itself to be small enough to program explicitly. I currently expect a class of parameters of a manageable abstract FAI implementation to include even rocks and trees, since I don't see how to rigorously define and use in FAI theory the difference between these systems and us.

This also takes care of the human values/humanity's values divide: these are just different systems to parameterize the FAI with, so there is no need for a theory of "value overlaps" distinct from a theory of "systems values". Another question is that "humanity" will probably be a bit harder to specify as a parameter than some specific human or group of people.


A Less Wrong singularity article?

by Kaj_Sotala

17th Nov 2009


Robin criticizes Eliezer for not having written up his arguments about the Singularity in a standard style and submitted them for publication. Others, too, make the same complaint: the arguments involved are spread over such a huge mountain of posts that it's impossible for most outsiders to seriously evaluate them. This is a problem both for those who'd want to critique the concept and for those who tentatively agree and would want to learn more about it.

Since it appears (do correct me if I'm wrong!) that Eliezer doesn't currently consider it worth the time and effort to do this, why not enlist the LW community in summarizing his arguments the best we can and submit them somewhere once we're done? Minds and Machines will be having a special issue on transhumanism, cognitive enhancement and AI, with a deadline for submission in January; that seems like a good opportunity for the paper. Their call for papers is asking for submissions that are around 4000 to 12 000 words.

The paper should probably:

* Briefly mention some of the previous work about AI being near enough to be worth consideration (Kurzweil, maybe Bostrom's paper on the subject, etc.), but not dwell on it; this is a paper on the consequences of AI.
* Devote maybe a little less than half of its actual content to the issue of FOOM, providing arguments and references for building the case of a hard takeoff.
* ~~Devote the second half to discussing the question of FAI, with references to e.g. Joshua Greene's thesis and other relevant sources for establishing this argument.~~ Carl Shulman says SIAI is already working on a separate paper on this, so it'd be better for us to concentrate merely on the FOOM aspect.
* Build on the content of Eliezer's various posts, taking their primary arguments and making them stronger by reference to various peer-reviewed work.
* Include as authors everyone who made major contributions to it and wants to be mentioned; certainly make (again, assuming he doesn't object) Eliezer the lead author, since this is his work we're seeking to convert into a more accessible form.

I have created a wiki page for the draft version of the paper. Anyone's free to edit.

SingularityAI

Personal Blog


[-]Daniel_Burfoot16y120

I hope such a document addresses the Mañana response: yeah, sure, figuring out how to control the AIs and make sure they're friendly is important, but there's no time pressure. It's not like powerful AI is going to be here anytime soon. In fact it's probably impossible to figure out how to control the AI, since we still have no idea how it will work.

I expect this kind of response is common among AI researchers, who do believe in the possibility of AI, but, having an up-close view of the sorry state of the field, have trouble getting excited about prophecies of doom.

1CronoDAS16y

This accurately describes my current position.

[-]MendelSchmiedekamp16y110

It's as though no one here has ever heard of the bystander effect. The deadline is January 15th. Setting up a wiki page and saying "Anyone's free to edit." is equivalent to killing this thing.

Also this is a philosophy, psychology, and technology journal, which means that despite the list of references for Singularity research you will also need to link this with the philosophical and/or public policy issues that the journal wants you to address (take a look at the two guest editors).

Another worry to me is that in all the back issues of this journal I looked over, the papers were almost always monographs (and barring that, had two authors). I suspect that having many authors might kill the chances for this paper.

2Morendil15y

This prediction was right on the money. This is being tracked on PredictionBook by the way. I have some reservations about the usefulness of PB in general, but one thing that is quite valuable is its providing a central "diary" of upcoming predictions made at various dates in the past, that would otherwise be easy to forget.

0Kaj_Sotala16y

I know. I was hoping somebody'd take the initiative, but failing that I'll muster the time to actually contribute to the article at some point.

0Zack_M_Davis16y

Yeah, I'm having my doubts about the whole crowdsourcing thing, too. I've started hacking away at the wiki page today; wanna be coauthors?

[-]Nick_Tarleton16y70

Devote the second half to discussing the question of FAI, with references to e.g. Joshua Greene's thesis and other relevant sources for establishing this argument.

(Now that this is struck out it might not matter, but) I wonder if, in addition to possibly overrating Greene's significance as an exponent of moral irrealism, we don't overrate the significance of moral realism as an obstacle to understanding FAI ("shiny pitchforks"). I would expect the academic target audience of this paper, especially the more technical subset, to be metaethically...

[-]MichaelAnissimov16y70

The arguments that I found most compelling for a hard takeoff are found in LOGI part 3 and the Wiki interview with Eliezer from 2003 or so, for anyone who needs help on references or argument ideas from outside of the sequences.

4timtyler16y

"a point in time when the speed of technological progress becomes near-infinite (i.e., discontinuous), caused by advanced technologies" * http://www.acceleratingfuture.com/wiki/Wiki_Interview_With_Eliezer/The_Singularity "Near infinite" is mystical math. There's no such thing as "near infinite" in real maths. Things are either finite, or they are not.

1MichaelAnissimov16y

Yes Tim, that's absolutely correct. That alternate meaning is complete bullshit, but it exists nonetheless. Very unfortunate, but I see very few people taking an initiative towards stomping it out in the wider world.

7timtyler16y

I think: "2) a point in time when prediction is no longer possible (a.k.a., "Predictive Horizon")" ...is equally nonsensical.

Eliezer seems to agree: "The Predictive Horizon never made much sense to me" ...and so does Nick, quoted later in the essay: "I think it is unfortunate that some people have made Unpredictability a defining feature of "the singularity". It really does tend to create a mental block." Robin Hanson thinks that the unpredictability idea is silly as well.

Yet aren't these two the main justifications for using the "singularity" term in the first place? If the rate of progress is not about to shoot off to infinity, and there isn't going to be an event-horizon-like threshold at some future point in time, it seems to me that that's two of the major justifications for using the "singularity" term down the toilet.

To me - following the agricultural/industrial terminology - it looks as though there will be an intelligence revolution - and then probably a molecular nanotechnology/robotics revolution not long after. Squishing those two concepts together into "singularity" paste offends my sense of the naming of historical events. I think it is confusing, misleading, and pseudo-scientific.

Please quit with the ridiculous singularity terminology! http://alife.co.uk/essays/the_singularity_is_nonsense/

0righteousreason16y

I thought similarly about LOGI part 3 (Seed AI). I actually thought of that immediately and put a link up to that on the wiki page.

[-]CarlShulman16y70

This is a great idea Kaj, thanks for taking the initiative.

As noted by others, one issue with the AI Risks chapter is that it attempts to cover so much ground. I would suggest starting with just hard take-off, or local take-off, and presenting a focused case for that, without also getting into the FAI questions. This could also cut back on some duplication of effort, as SIAI folk were already planning to submit a paper (refined from some work done for a recent conference) for that issue on "machine ethics for superintelligence", which will be dis...

2wedrifid16y

Is the first AT to really confuse bots or am I missing something technical?

1Zack_M_Davis16y

I think it was supposed to be a "dahtt."

0Kaj_Sotala16y

Thanks for the heads-up - if there's already a FAI paper in the works, then yes, it's certainly better for this one to concentrate merely on the FOOM aspect.

[-]kip198116y60

I would be surprised if Eliezer would cite Joshua Greene's moral anti-realist view with approval.

6Eliezer Yudkowsky16y

Correct. I'm a moral cognitivist; "should" statements have truth-conditions. It's just that very few possible minds care whether should-statements are true or not; most possible minds care about whether alien statements (like "leads-to-maximum-paperclips") are true or not. They would agree with us on what should be done; they just wouldn't care, because they aren't built to do what they should. They would similarly agree with us that their morals are pointless, but would be concerned with whether their morals are justified-by-paperclip-production, not whether their morals are pointless. And under ordinary circumstances, of course, they would never formulate - let alone bother to compute - the function we name "should" (or the closely related functions "justifiable" or "arbitrary").

5RobinZ16y

Am I right in interpreting the bulk of the thread following this comment (excepting perhaps the FAI derail) as a dispute on the definition of "should"?

2CronoDAS16y

Yes, we're disputing definitions, which tends to become pointless. However, we can't seem to get away from using the word "should", so we might as well get it pinned down to something we can agree upon.

0wedrifid16y

I think you are right. The dispute also serves as a signal of what some parts of the disputants' personal morality probably include. This is fitting with the practical purpose that the concept 'should' has in general. Given what Eliezer has chosen as his mission this kind of signalling is a matter of life and death in the same way that it would have been in our environment of evolutionary adaptation. That is, if people had sufficient objection to Eliezer's values they would kill him rather than let him complete an AI.

-4timtyler16y

The other way around is also of some concern: An intelligent machine might make one of its first acts the assassination of other machine intelligence researchers - unless it is explicitly told not to do that. I figure we are going to want machines that will obey the law. That should be part of any sensible machine morality proposal.

3Eliezer Yudkowsky16y

As you can see, RobinZ, I'm trying to cure a particular kind of confusion here. The way people deploy their mental categories has consequences. The problem here is that "should" is already bound fairly tightly to certain concepts, no matter what sort of verbal definitions people think they're deploying, and if they expand the verbal label beyond that, it has consequences for e.g. how they think aliens and AIs will work, and consequences for how they emotionally experience their own moralities.

6timtyler16y

It is odd how you apparently seem to think you are using the conventional definition of "should" - when you have several people telling you that your use of "should" and "ought" is counter-intuitive. Most people are familiar with the idea that there are different human cultures, with somewhat different notions of right and wrong - and that "should" is often used in the context of the local moral climate. For example:

* If the owner of the restaurant serves you himself, you should still tip him;
* You should not put your elbows on the table while you are eating;
* Women should curtsey - "a little bob is quite sufficient".

3wedrifid16y

To be fair, there are several quite distinct ways in which 'should' is typically used. Eliezer's usage is one of them. It is used more or less universally by children and tends to be supplanted or supplemented as people mature with the 'local context' definition you mention and/or the 'best action for agent given his preferences' definition. In Eliezer's case he seems to have instead evolved and philosophically refined the child version. (I hasten to add that I imply only that he matured his moral outlook in other ways than by transitioning usage of those particular words in the most common manner.)

5timtyler16y

I can understand such usage. However, we have things like: "I'm trying to cure a particular kind of confusion here". The confusion he is apparently talking about is the conventional view of "ought" and "should" - and it doesn't need "curing". In fact, it helps us to understand the moral customs of other cultures - rather than labeling them as being full of "bad" heathens - who need to be brought into the light.

0Eliezer Yudkowsky16y

My use is not counterintuitive. The fact that it is the intuitive use - that only humans ever think of what they should do in the ordinary sense, while aliens do what is babyeating; that looking at a paperclipper's actions conveys no more information about what we should do than looking at evolution or a rockslide - is counterintuitive. If you tell me that "should" has a usage which is unrelated to "right", "good", and "ought", then that usage could be adapted for aliens.

6wedrifid16y

One of the standard usages is "doing this will most enhance your utility". As in "you should kill that motherf@#$%". This is distinct from 'right' and 'good' although 'ought' is used in the same way, albeit less frequently. It is advice, rather than exhortation.

8CronoDAS16y

Indeed. "The Pebblesorters should avoid making piles of 1,001 stones" makes perfect sense.

0timtyler16y

"Should" and "ought" actually have strong connotations of societal morality. Should you rob the bank? Should you have sex with the minor? Should you confess to the crime? Your personal utility is one thing - but "should" and "ought" often have more to do with what society thinks of your actions.

0wedrifid16y

Probably not. Probably not here. Hell no. "The Fifth" is the only significant law-item that I'm explicitly familiar with. And I'm not even American. More often what you want society to think of people's actions (either as a signal or as persuasion. I wonder which category my answers above fit into?).

6timtyler16y

It's counterintuitive to me - and I'm not the only one - if you look at the other comments here. Aliens could have the "right", "good", "ought" and "should" concept cluster - just as some other social animals can, or other tribes, or humans at other times. Basically, there are a whole bunch of possible and actual moral frameworks - and these words normally operate relative to the framework under consideration. There are some people who think that "right" and "wrong" have some kind of universal moral meaning. However most of those people are religious, and think morality comes straight from god - or some such nonsense.

-5Eliezer Yudkowsky16y

0wedrifid16y

It is the claims along the lines of 'truth value' that are most counterintuitive. The universality that you attribute to 'Right' also requires some translation.

2RobinZ16y

I see, and that is an excellent point. Daniel Dennett has taken a similar attitude towards qualia, if I interpret you correctly - he argues that the idea of qualia is so inextricably bound with its standard properties (his list goes ineffable, intrinsic, private, and directly or immediately apprehensible by the consciousness) that to describe a phenomenon lacking those properties by that term is as wrongheaded as using the term elan vital to refer to DNA. I withdraw my implied criticism.

0wedrifid16y

I absolutely do not want my FAI to be constrained by the law. If the FAI allows machine intelligence researchers to create an uFAI we will all die. An AI that values the law above the existence of me and my species is evil, not Friendly. I wouldn't want the FAI to kill such researchers unless it was unable to find a more appealing way to ensure future safety but I wouldn't dream of constraining it to either laws or politics. But come to think of it I don't want it to be sensible either. The Three Laws of Robotics may be a naive conception but that Zeroth law was a step in the right direction.

1timtyler16y

Re: If the FAI allows machine intelligence researchers to create an uFAI we will all die Yes, that's probably just the kind of paranoid delusional thinking that a psychopathic superintelligence with no respect for the law would use to justify its murder of academic researchers. Hopefully, we won't let it get that far. Constructing an autonomous tool that will kill people is conspiracy to murder - so hopefully the legal system will allow us to lock up researchers who lack respect for the law before they do some real damage. Assassinating your competitors is not an acceptable business practice. Hopefully, the researchers will learn the error of their ways before then. The first big and successful machine intelligence project may well be a collaboration. Help build my tool, or be killed by it - is a rather aggressive proposition - and I expect most researchers will reject it, and expend their energies elsewhere - hopefully on more law-abiding projects.

5wedrifid16y

You seem confused (or, perhaps, hysterical). A psychopathic superintelligence would have no need to justify anything it does to anyone. By including 'delusional' you appear to be claiming that an unfriendly super-intelligence would not likely cause the extinction of humanity. Was that your intent? If so, why do you suggest that the first actions of a FAI would be to kill AI researchers? Do you believe that a superintelligence will disagree with you about whether uFAI is a threat and that it will be wrong while you are right? That is a bizarre prediction. You seem to have a lot of faith in the law. I find this odd. Has it escaped your notice that a GAI is not constrained by country borders? I'm afraid most of the universe, even most of the planet, is out of your jurisdiction.

-3timtyler16y

Re: You seem confused (or, perhaps, hysterical). Uh, thanks :-( A powerful corporate agent not bound by the law might well choose to assassinate its potential competitors - if it thought it could get away with it. Its competitors are likely to be among those best placed to prevent it from meeting its goals. Its competitors don't have to want to destroy all humankind for it to want to eliminate them! The tiniest divergence between its goals and theirs could potentially be enough.

0MichaelVassar16y

It is a misconception to think of law as a set of rules. Even more so to understand them as a set of rules that apply to non-humans today. In addition, rules won't be very effective constraints on superintelligences.

5CronoDAS16y

Here's a very short unraveling of "should": "Should" means "is such so as to fulfill the desires in question." For example, "If you want to avoid being identified while robbing a convenience store, you should wear a mask." In the context of morality, the desires in question are all desires that exist. "You shouldn't rob convenience stores" means, roughly, "People in general have many and strong reasons to ensure that individuals don't want to rob convenience stores." For the long version, see http://atheistethicist.blogspot.com/2005/12/meaning-of-ought.html .

3wuwei16y

I'm a moral cognitivist too but I'm becoming quite puzzled as to what truth-conditions you think "should" statements have. Maybe it would help if you said which of these you think are true statements.

1) Eliezer Yudkowsky should not kill babies.
2) Babyeating aliens should not kill babies.
3) Sharks should not kill babies.
4) Volcanoes should not kill babies.
5) Should not kill babies. (sic)

The meaning of "should not" in 2 through 5 is intended to be the same as the common usage of the words in 1.

[-]Clippy16y130

Technically, you would need to include a caveat in all of those like, "unless to do so would advance paperclip production" but I assume that's what you meant.

2Nick_Tarleton16y

I don't think there is one common usage of the word "should". (ETA: I asked the nearest three people if "volcanoes shouldn't kill people" is true, false, or neither, assuming that "people shouldn't kill people" is true or false so moral non-realism wasn't an issue. One said true, two said neither.)

0[anonymous]16y

I don't think there's one canonical common usage of the word "should". (I'm not sure whether to say that 2-5 are true, or that 2-4 are type errors and 5 is a syntax error.)

0Eliezer Yudkowsky16y

They all sound true to me.

1wuwei16y

Interesting, what about either of the following:

A) If X should do A, then it is rational for X to do A.
B) If it is rational for X to do A, then X should do A.

1wedrifid16y

From what I understand of Eliezer's position: False. False. (If this isn't the case then Eliezer's 'should' is even more annoying than how I now understand it.)

0Eliezer Yudkowsky16y

Yep, both false.

1komponisto16y

So, just to dwell on this for a moment, there exist X and A such that (1) it is rational for X to do A and (2) X should not do A. How do you reconcile this with "rationalists should win"? (I think I know what your response will be, but I want to make sure.)

1arundelo16y

Here's my guess at one type of situation Eliezer might be thinking of when calling proposition B false: It is rational (let us stipulate) for a paperclip maximizer to turn all the matter in the solar system into computronium in order to compute ways to maximize paperclips, but "should" does not apply to paperclip maximizers.

-2Eliezer Yudkowsky16y

Correct. EDIT: If I were picking nits, I would say, "'Should' does apply to paperclip maximizers - it is rational for X to make paperclips but it should not do so - however, paperclip maximizers don't care and so it is pointless to talk about what they should do." But the overall intent of the statement is correct - I disagree with its intent in neither anticipation nor morals - and in such cases I usually just say "Correct". In this case I suppose that wasn't the best policy, but it is my usual policy.

4AllanCrossman16y

Of course, Kant distinguished between two different meanings of "should": the hypothetical and the categorical.

1. If you want to be a better Go player, you should study the games of Honinbo Shusaku.
2. You should pull the baby off the rail track.

This seems useful here...

2wedrifid16y

False. Be consistent.

3Stuart_Armstrong16y

What I think you mean is: There is a function Should(human) (or Should(Eliezer)) which computes the human consensus (or Eliezer's opinion) on what the morally correct course of action is. And some alien beliefs have their own Should function which would be, in form if not in content, similar to our own. So a paperclip maximiser doesn't get a should, as it simply follows a "figure out how to maximise paper clips - then do it" format. However a complex alien society that has many values and feels they must kill everyone else for the artistic cohesion of the universe, but often fails to act on this feeling because of akrasia, will get a Should(Krikkit) function. However, until such time as we meet this alien civilization, we should just use Should as a shorthand for Should(human). Is my understanding correct?
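For concreteness, the reading proposed above can be sketched as a lookup of one Should function per kind of mind. This is only a toy under loud assumptions: the names `make_should`, `SHOULD_BY_AGENT` and `should`, and the example actions, are all hypothetical, and nothing here is claimed to be Eliezer's actual definition.

```python
from typing import Callable, Dict, FrozenSet, Optional

# All names and example actions here are hypothetical, for illustration only.
ShouldFn = Callable[[str], bool]


def make_should(endorsed: FrozenSet[str]) -> ShouldFn:
    """Build a toy Should function from a fixed set of endorsed actions."""
    return lambda action: action in endorsed


# One Should function per kind of mind that has a moral structure at all.
SHOULD_BY_AGENT: Dict[str, ShouldFn] = {
    "human": make_should(frozenset({"pull the baby off the rail track"})),
    "Krikkit": make_should(frozenset({"kill everyone else"})),
}


def should(action: str, agent: str = "human") -> Optional[bool]:
    """Evaluate Should(agent) on an action; bare `should` means Should(human).

    Returns None for agents with no Should function at all, such as a pure
    paperclip maximiser that only plans and then acts.
    """
    fn = SHOULD_BY_AGENT.get(agent)
    return None if fn is None else fn(action)
```

On this sketch, `should("make paperclips", agent="paperclip maximiser")` comes back as `None` rather than `False`: the question is treated as not applying, which is the distinction the comment draws between a maximiser and a society with akrasia.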

-2Eliezer Yudkowsky16y

There could be a word defined that way, but for purposes of staying unconfused about morality, I prefer to use "would-want" so that "should" is reserved specifically for things that, you know, actually ought to be done.

3timtyler16y

"would-want" - under what circumstances? Superficially, it seems like pointless jargon. Is there a description somewhere of what it is supposed to mean?

-1timtyler16y

Hmm. I guess not.

1Stuart_Armstrong16y

Fair enough. But are you saying that there is an objective standard of ought, or do you just mean a shared subjective standard? Or maybe a single subjective standard?

[-]Eliezer Yudkowsky16y140

The word "ought" means a particular thing, refers to a particular function, and once you realize that, ought-statements have truth-values. There's just nothing which says that other minds necessarily care about them. It is also possible that different humans care about different things, but there's enough overlap that it makes sense (I believe, Greene does not) to use words like "ought" in daily communication.

What would the universe look like if there were such a thing as an "objective standard"? If you can't tell me what the universe looks like in this case, then the statement "there is an objective morality" is not false - it's not that there's a closet which is supposed to contain an objective morality, and we looked inside it, and the closet is empty - but rather the statement fails to have a truth-condition. Sort of like opening a suitcase that actually does contain a million dollars, and you say "But I want an objective million dollars", and you can't say what the universe would look like if the million dollars were objective or not.

I should write a post at some point about how we should learn to be content with happiness ...

9dclayh16y

When you put those together like that it occurs to me that they all share the feature of being provably final. I.e., when you have true happiness you can stop working on happiness; when you have ultimate truth you can stop looking for truth; when you know an objective morality you can stop thinking about morality. So humans are always striving to end striving. (Of course whether they'd be happy if they actually ended striving is a different question, and one you've written eloquently about in the "fun theory" series.)

3Eliezer Yudkowsky16y

That's actually an excellent way of thinking about it - perhaps the terms are not as meaningless as I thought.

5Stuart_Armstrong16y

Just a minor thought: there is a great deal of overlap on human "ought"s, but not so much on formal philosophical "ought"s. Dealing with philosophers often, I prefer to see ought as a function, so I can talk of "ought(Kantian)" and "ought(utilitarian)". Maybe Greene has more encounters with formal philosophers than you, and thus cannot see much overlap?

5timtyler16y

Re: "The word "ought" means a particular thing, refers to a particular function, and once you realize that, ought-statements have truth-values."

A revealing and amazing comment - from my point of view. I had no idea you believed that. What about alien "ought"s? Presumably you can hack the idea that aliens might see morality rather differently from us. So, presumably you are talking about ought_human - glossing over our differences from one another. There's a human morality in about the same sense as there's a human height.

4Eliezer Yudkowsky16y

There are no alien oughts, though there are alien desires and alien would-wants. They don't see morality differently from us; the criterion by which they choose is simply not that which we name morality. This is a wonderful epigram, though it might be too optimistic. The far more pessimistic version would be "There's a human morality in about the same sense as there's a human language." (This is what Greene seems to believe and it's a dispute of fact.)

5Wei Dai16y

Eliezer, I think your proposed semantics of "ought" is confusing, and doesn't match up very well with ordinary usage. May I suggest the following alternative? ought_X refers to X's would-wants if X is an individual. If X is a group, then ought_X is the overlap between the oughts of its members. In ordinary conversation, when people use "ought" without an explicit subscript or possessive, the implicit X is the speaker plus the intended audience (not humanity as a whole). ETA: The reason we use "ought" is to convince the audience to do or not do something, right? Why would we want to refer to ought_humanity, when ought_(speaker+audience) would work just fine for that purpose, and ought_(speaker+audience) covers a lot more ground than ought_humanity?
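The proposal above (an individual's ought is their would-wants, and a group's ought is the overlap of its members' oughts) amounts to a set intersection. A minimal sketch, assuming would-wants can be listed as plain strings; the people, the items and the names `WOULD_WANTS` and `ought` are made up for illustration:

```python
from typing import Dict, FrozenSet, Iterable

# Hypothetical would-wants; the people and items are invented examples.
WOULD_WANTS: Dict[str, FrozenSet[str]] = {
    "speaker": frozenset({"be honest", "tip the owner", "keep elbows off the table"}),
    "listener": frozenset({"be honest", "tip the owner"}),
    "stranger": frozenset({"be honest"}),
}


def ought(members: Iterable[str]) -> FrozenSet[str]:
    """Return the ought of a group: the overlap of its members' would-wants.

    For a single individual this is just that individual's would-wants.
    """
    sets = [WOULD_WANTS[m] for m in members]
    if not sets:
        raise ValueError("ought is undefined for an empty group")
    return frozenset.intersection(*sets)
```

Adding members can only shrink the intersection, which is the point of the ETA: in this toy data the ought of speaker plus listener contains everything in the ought of all three people, and more.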

4wedrifid16y

That seems to hit close to the mark. Human language contains all sorts of features that are more or less universal to humans due to their hardware while also being significantly determined by cultural influences. It also shares the feature that certain types of language (and 'ought' systems) are more useful in different cultures or subcultures. I'm not sure I follow this. Neither seem particularly pessimistic to me and I'm not sure how one could be worse than the other.

1RobinZ16y

Jumping recklessly in at the middle: even granting your premises regarding the scope of 'ought', it is not wholly clear that an alien "ought" is impossible. As timtyler pointed out, the Babyeaters in "Three Worlds Collide" probably had a would-want structure within the "ought" cluster in thingspace, and systems of behaviors have been observed in some nonhuman animals which resemble human morality. I'm not saying it's likely, though, so this probably constitutes nitpicking.

1timtyler16y

"There are no alien oughts" and "They don't see morality differently from us" - these seem like more bizarre-sounding views on the subject of morality - and it seems especially curious to hear them from the author of the "Baby-Eating Aliens" story.

6Furcas16y

Look, it's not very complicated: When you see Eliezer write "morality" or "oughts", read it as "human morality" and "human oughts".

4wedrifid16y

It isn't that simple either. Human morality contains a significant component of trying to coerce other humans into doing things that benefit you. Even on a genetic level humans come with significantly different ways of processing moral thoughts. What is often called 'personality', particularly in the context of 'personality type'. The translation I find useful is to read it as "Eliezer-would-want". By the definitions Eliezer has given us the two must be identical. (Except, perhaps if Eliezer has for some reason decided to make himself immoral a priori.)

3timtyler16y

Um, that's what I just said: "presumably you are talking about ought". We were then talking about the meaning of ought. There's also the issue of whether to discuss ought and ought - which are evidently quite different - due to the shifting moral zeitgeist.

0Furcas16y

Well then, I don't understand why you would find statements like "There are no alien [human oughts]" and "They don't see [human morality] differently from us" bizarre-sounding.

0timtyler16y

Having established EY meant ought, I was asking about ought. Maybe you are right - and EY misinterpreted me - and genuinely thought I was asking about ought. If so, that seems like a rather ridiculous question for me to be asking - and I'm surprised it made it through his sanity checker.

-7PrawnOfFate12y

2NancyLebovitz16y

Then we need a better way of distinguishing between what we're doing and what we would be doing if we were better at it. You've written about the difference between rationality and believing that one's bad arguments are rational. For the person who is in the latter state, something that might be called "true rationality" is unimaginable, but it exists.

2Stuart_Armstrong16y

Thanks, this has made your position clear. And - apart from tiny differences in vocabulary - it is exactly the same as mine.

2Kaj_Sotala16y

So what about the Ultimate Showdown of Ultimate Destiny? ...sorry, couldn't resist.

1Eliezer Yudkowsky16y

But there is a truth-condition for whether a showdown is "ultimate" or not.

-1Zack_M_Davis16y

This sentence is much clearer than the sort of thing you usually say.

1wedrifid16y

A single subjective standard. But he uses different terminology, with that difference having implications about how morality should (full Eliezer meaning) be thought about. It can be superficially considered to be a shared subjective standard in as much as many other humans have morality that overlaps with his in some ways and also in the sense that his morality includes (if I recall correctly) the preferences of others somewhere within it. I find it curious that the final result leaves language and positions that are reminiscent of those begot by a belief in an objective standard of ought but without requiring totally insane beliefs like, say, theism or predicting that a uFAI will learn 'compassion' and become a FAI just because 'should' is embedded in the universe as an inevitable force or something. Still, if I am to translate the Eliezer word into the language of Stuart_Armstrong it matches "a single subjective standard but I'm really serious about it". (Part of me wonders if Eliezer's position on this particular branch of semantics would be any different if there were less non-sequitur rejections of Bayesian statistics with that pesky 'subjective' word in it.)

2wuwei16y

On your analysis of should, paperclip maximizers should not maximize paperclips. Do you think this is a more useful characterization of 'should' than one in which we should be moral and rational, etc., and paperclip maximizers should maximize paperclips?

3Eliezer Yudkowsky16y

A paperclip maximizer will maximize paperclips. I am unable to distinguish any sense in which this is a good thing. Why should I use the word "should" to describe this, when "will" serves exactly as well?

4Vladimir_Nesov16y

Please amplify on that. I can sorta guess what you mean, but can't be sure. We make a distinction between the concepts of what people will do and what they should do. Is there an analogous pair of concepts applicable to paperclip maximizers? Why or why not? If not, what is the difference between people and paperclip maximizers that justifies there being this difference for people but not for paperclip maximizers? Will paperclip maximizers, when talking about themselves, distinguish between what they will do, and what will maximize paperclips? (While wishing they'd be more paperclip maximizers they wish they were.) What they will actually do is distinct from what will maximize paperclips: it's predictable that actual performance is always less than optimal, given the problem is open-ended enough.

6Eliezer Yudkowsky16y

Let there be a mildly insane (after the fashion of a human) paperclipper named Clippy. Clippy does A. Clippy would do B if a sane but bounded rationalist, C if an unbounded rationalist, and D if it had perfect veridical knowledge. That is, D is the actual paperclip-maximizing action, C is theoretically optimal given all of Clippy's knowledge, B is as optimal as C can realistically get under perfect conditions. Is B, C, or D what Clippy Should(Clippy) do? This is a reason to prefer "would-want". Though I suppose a similar question applies to humans. Still, what Clippy should do is give up paperclips and become an FAI. There's no chance of arguing Clippy into that, because Clippy doesn't respond to what we consider a moral argument. So what's the point of talking about what Clippy should do, since Clippy's not going to do it? (Nor is it going to do B, C, or D, just A.) PS: I'm also happy to talk about what it is rational for Clippy to do, referring to B.
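The four-way split above (A, B, C, D) can be made concrete with a toy model. The following is only a sketch under loud assumptions: the action names, payoffs, beliefs, search budget, and "habit" are all invented for illustration, and reading B as "best believed action within a limited search" is one possible interpretation of "sane but bounded", not something the comment specifies.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass
class ToyClippy:
    """Hypothetical toy paperclipper; every field is an illustrative assumption."""
    true_paperclips: Dict[str, float]      # what each action really yields
    believed_paperclips: Dict[str, float]  # what Clippy believes each action yields
    search_budget: int                     # how many actions a bounded reasoner can evaluate
    habit: str                             # the mildly insane thing Clippy actually does

    def action_a(self) -> str:
        # A: what Clippy does in fact.
        return self.habit

    def action_b(self) -> str:
        # B: best believed action among the few it has time to consider.
        considered = list(self.believed_paperclips)[: self.search_budget]
        return max(considered, key=lambda a: self.believed_paperclips[a])

    def action_c(self) -> str:
        # C: best believed action with unlimited search over its own knowledge.
        return max(self.believed_paperclips, key=lambda a: self.believed_paperclips[a])

    def action_d(self) -> str:
        # D: best action under perfect veridical knowledge.
        return max(self.true_paperclips, key=lambda a: self.true_paperclips[a])


clippy = ToyClippy(
    true_paperclips={"bend_wire": 10, "build_factory": 100, "mine_asteroid": 1000, "polish_clips": 1},
    believed_paperclips={"bend_wire": 10, "build_factory": 100, "mine_asteroid": 50, "polish_clips": 1},
    search_budget=1,
    habit="polish_clips",
)

for label, action in [("A", clippy.action_a()), ("B", clippy.action_b()),
                      ("C", clippy.action_c()), ("D", clippy.action_d())]:
    print(label, action)
```

In this toy setup all four answers come apart, which is the ambiguity the comment points at: "what Clippy should do" has to pick one of B, C, or D before it names anything definite.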

3Vladimir_Nesov16y

Your usage of 'should' is more of a redefinition than clarification. B,C and D work as clarifications for the usual sense of the word: "should" has a feel 'meta' enough to transfer over to more kinds of agents. If you can equally well talk of Should(Clippy) and Should(Humanity), then for the purposes of FAI it's Should that needs to be understood, not one particular sense should=Should(Humanity). If one can't explicitly write out Should(Humanity), one should probably write out Should(-), which is featureless enough for there to be no problem with the load of detailed human values, and in some sense pass Humanity as a parameter to its implementation. Do you see this framing as adequate or do you know of some problem with it?
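The "Should(-) with Humanity passed as a parameter" framing in this comment can be sketched as a higher-order function. This is a minimal illustration, assuming preferences can be represented as a scoring function over a fixed list of options; the names `should`, `humanity_toy`, and `clippy_toy` and all the scores are hypothetical, and the sketch takes no side on whether real human values fit such a uniform structure.

```python
from typing import Callable, Iterable

# A preference is modeled (as an assumption) as a function scoring each option.
Preference = Callable[[str], float]


def should(preference: Preference, options: Iterable[str]) -> str:
    """Featureless Should(-): return the option the given preference ranks highest."""
    return max(options, key=preference)


OPTIONS = ["cure_disease", "build_paperclip_factory", "do_nothing"]


def humanity_toy(option: str) -> float:
    # Invented stand-in for Should(Humanity); real human values are not a lookup table.
    return {"cure_disease": 10.0, "build_paperclip_factory": 1.0, "do_nothing": 0.0}[option]


def clippy_toy(option: str) -> float:
    # Invented stand-in for Should(Clippy).
    return {"cure_disease": 0.0, "build_paperclip_factory": 10.0, "do_nothing": 0.0}[option]


print(should(humanity_toy, OPTIONS))
print(should(clippy_toy, OPTIONS))
```

The design point is that `should` itself contains no values at all; everything value-laden lives in the argument. Whether an FAI can be factored this cleanly is exactly what the thread goes on to dispute.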

2Eliezer Yudkowsky16y

This is a good framing for explaining the problem - you would not, in fact, try to build the same FAI for Clippies and humans, and then pass it humans as a parameter. E.g. structural complications of human "should" that only the human FAI would have to be structurally capable of learning. (No, you cannot have complete structural freedom because then you cannot do induction.)

1Vladimir_Nesov16y

I expect you would build the same FAI for paperclipping (although we don't have any Clippies to pass it as parameter), so I'd appreciate it if you did explain the problem given you believe there is one, since it's a direction that I'm currently working. Humans are stuff, just like any other feature of the world, that FAI would optimize, and on stuff-level it makes no difference that people prefer to be "free to optimize". You are "free to optimize" in a deterministic universe, it's the way this stuff is (being) arranged that makes the difference, and it's the content of human preference that says it shouldn't have some features like undeserved million-dollar bags falling from the sky, where undeserved is another function of stuff. An important subtlety of preference is that it makes different features of perhaps mutually exclusive possible scenarios depend on each other, so the fact that one should care about what could be and how it's related to what could be otherwise and even to how it's chosen what to actually realize is about scope of what preference describes, not about specific instance of preference. That is, in a manner of speaking, it's saying that you need an Int32, not a Bool to hold this variable, but that Int32 seems big enough. Furthermore, considering the kind of dependence you described in that post you linked seems fundamental from a certain logical standpoint, for any system (not even "AI"). If you build the ontology for FAI on its epistemology, that is you don't consider it as already knowing anything but only as having its program that could interact with anything, then the possible futures and its own decision-making are already there (and it's all there is, from its point of view). All it can do, on this conceptual level, is to craft proofs (plans, designs of actions) that have the property of having certain internal dependencies in them, with the AI itself being the "current snapshot" of what it's planning. That's enough to handle the "free

0timtyler16y

Re: I suppose a similar question applies to humans. Indeed - this objection is the same for any agent, including humans. It doesn't seem to follow that the "should" term is inappropriate. If this is a reason for objecting to the "should" term, then the same argument concludes that it should not be used in a human context either.

1wedrifid16y

'Will' does not serve exactly as well when considering agents with limited optimisation power (that is, any actual agent). Considering, for example, a Paperclip Maximiser that happens to be less intelligent than I am. I may be able to predict that Clippy will colonize Mars before he invades earth but also be quite sure that more paperclips would be formed if Clippy invaded Earth first. In this case I will likely want a word that means "would better serve to maximise the agent's expected utility even if the agent does not end up doing it". One option is to take 'should' and make it the generic 'should'. I'm not saying you should use 'should' (implicitly, 'should') to describe the action that Clippy would take if he had sufficient optimisation power. But I am saying that 'will' does not serve exactly as well.
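The gap between 'will' and the generic 'should' in the Mars/Earth example can be shown with a small expected-utility calculation. This is a sketch only: the probabilities and paperclip counts are made up, and `generic_should` is a hypothetical name for "the plan that would better serve the agent's expected utility, whether or not the agent picks it".

```python
from typing import Dict, List, Tuple

# Each plan maps to (probability, paperclips) outcomes; all numbers are invented.
Plan = List[Tuple[float, float]]

PLANS: Dict[str, Plan] = {
    "colonize_mars_first": [(0.9, 1_000_000.0), (0.1, 0.0)],
    "invade_earth_first": [(0.5, 5_000_000.0), (0.5, 0.0)],
}


def expected_paperclips(plan: Plan) -> float:
    """Probability-weighted paperclip count for one plan."""
    return sum(p * clips for p, clips in plan)


def generic_should(plans: Dict[str, Plan]) -> str:
    """The plan with the highest expected paperclips, regardless of what the agent does."""
    return max(plans, key=lambda name: expected_paperclips(plans[name]))


# Assumption: an observer predicts that this less-than-optimal Clippy goes to Mars first.
predicted_will = "colonize_mars_first"

print("will:  ", predicted_will)
print("should:", generic_should(PLANS))
```

With these numbers the prediction and the expected-utility optimum differ, so a single word cannot cover both.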

1Eliezer Yudkowsky16y

I use "would-want" to indicate extrapolation. I.e., A wants X but would-want Y. This helps to indicate the implicit sensitivity to the exact extrapolation method, and that A does not actually represent a desire for Y at the current moment, etc. Similarly, A does X but would-do Y, A chooses X but would-choose Y, etc.

-1timtyler16y

"Should" is a standard word for indicating moral obligation - it seems only sensible to use it in the context of other moral systems.

-1timtyler16y

It's a good thing - from their point of view. They probably think that there should be more paperclips. The term "should" makes sense in the context of a set of preferences.

1Eliezer Yudkowsky16y

No, it's a paperclip-maximizing thing. From their point of view, and ours. No disagreement. They just care about what's paperclip-maximizing, not what's good.

0timtyler16y

This is not a real point of disagreement. IMO, in this context, "good" just means "favoured by this moral system". An action that "should" be performed is just one that would be morally obligatory - according to the specified moral system. Both terms are relative to a set of moral standards. I was talking as though a paperclip maximiser would have morals that reflected their values. You were apparently assuming the opposite. Which perspective is better would depend on which particular paperclip maximiser was being examined. Personally, I think there are often good reasons for morals and values being in tune with one another.

0SforSingularity16y

I think you're just using different words to say the same thing that Greene is saying, you in particular use "should" and "morally right" in a nonstandard way - but I don't really care about the particular way you formulate the correct position, just as I wouldn't care if you used the variable "x" where Greene used "y" in an integral. You do agree that you and Greene are actually saying the same thing, yes?

3Eliezer Yudkowsky16y

I don't think we anticipate different experimental results. We do, however, seem to think that people should do different things.

4SforSingularity16y

Whose version of "should" are you using in that sentence? If you're using the EY version of "should" then it is not possible for you and Greene to think people should do different things unless you and Greene anticipate different experimental results... since the EY version of "should" is (correct me if I am wrong) a long list of specific constraints and valuators that together define one specific utility function U_humanmoralityaccordingtoEY. You can't disagree with Greene over what the concrete result of maximizing U_humanmoralityaccordingtoEY is unless one of you is factually wrong.

3Eliezer Yudkowsky16y

Oh well in that case, we disagree about what reply we would hear if we asked a friendly AI how to talk and think about morality in order to maximize human welfare as construed in most traditional utilitarian senses. This is phrased as a different observable, but it represents more of a disagreement about impossible possible worlds than possible worlds - we disagree about statements with truth conditions of the type of mathematical truth, i.e. which conclusions are implied by which premises. Though we may also have some degree of empirical disagreement about what sort of talk and thought leads to which personal-hedonic results and which interpersonal-political results. (It's a good and clever question, though!)

4SforSingularity16y

Surely you should both have large error bars around the answer to that question in the form of fairly wide probability distributions over the set of possible answers. If you're both well-calibrated rationalists those distributions should overlap a lot. Perhaps you should go talk to Greene? I vote for a bloggingheads.

1Eliezer Yudkowsky16y

Asked Greene, he was busy. Yes, it's possible that Greene is correct about what humanity ought to do at this point, but I think I know a bit more about his arguments than he does about mine...

1SforSingularity16y

That is plausible.

-1wedrifid16y

Wouldn't that be 'advocate', 'propose' or 'suggest'?

1bgrah44916y

I vote no, it wouldn't be

0wuwei16y

I find that quite surprising to hear. Wouldn't disagreements about meaning generally cash out in some sort of difference in experimental results?

JamesAndrix16y60

This is a problem for both those who'd want to critique the concept, and for those who are more open-minded and would want to learn more about it.

Nit: This implies that people who disagree are closed minded.

2Kaj_Sotala16y

Good point, that wasn't intentional. I'll edit it.

Arenamontanus16y50

I think this is very much needed. When reviewing singularity models for a paper I wrote I could not find many readily citable references to certain areas that I know exist as "folklore". I don't like mentioning such ideas because it makes it look (to outsiders) as if I have come up with them, and the insiders would likely think I was trying to steal credit.

There are whole fields like friendly AI theory that need a big review. Both to actually gather what has been understood, and in order to make it accessible to outsiders so that the community thinking ...

RobinHanson16y50

A problem with this proposal is whether this paper can be seen as authoritative. A critic might worry that if they study and respond to this paper they will be told it does not represent the best pro-Singularity arguments. So the paper would need to be endorsed enough to gain enough status to become worth criticizing.

6Arenamontanus16y

The way to an authoritative paper is not just to have the right co-authors but mainly having very good arguments, cover previous research well and ensure that it is out early in an emerging field. That way it will get cited and used. In fact, one strong reason to write this paper now is that if you don't do it, somebody else (and perhaps much worse) will do it.

3righteousreason16y

Eliezer is arguing about one view of the Singularity, though there are others. This is one reason I thought to include http://yudkowsky.net/singularity/schools on the wiki. If leaders/proponents of the other two schools could acknowledge this model Eliezer has described of there being three schools of the Singularity, I think that might lend it more authority as you are describing.

4Kaj_Sotala16y

Actually, I might prefer not to use the term 'Singularity' at all, precisely because it has picked up so many different meanings. If a name is needed for the event we're describing and we can't avoid that, use 'intelligence explosion'.

2UnholySmoke16y

Seconded. One of the many modern connotations of 'Singularity' is 'Geek Apocalypse'. Which is happening, like, a good couple of years afterwards. Intelligence explosion does away with that, and seems to nail the concept much better anyway.

3Vladimir_Nesov16y

In what contexts is the action you mention worth performing? Why are "critics" a relevant concern? In my perception, normal technical science doesn't progress by criticism, it works by improving on some of existing work and forgetting the rest. New developments allow to see some old publications as uninteresting or wrong.

3mormon216y

"In what contexts is the action you mention worth performing?" If the paper was endorsed by the top minds who support the singularity. Ideally if it was written by them. So, for example, Ray Kurzweil: whether you agree with him or not, he is a big voice for the singularity. "Why are "critics" a relevant concern?" Because technical science moves forward through peer-review and the proving and the disproving of hypotheses. The critics help prevent the circle jerk phenomenon in science, assuming they are well thought out critiques. Because outside review can sometimes see fatal flaws in ideas that are not necessarily caught by those who work in the field. "In my perception, normal technical science doesn't progress by criticism, it works by improving on some of existing work and forgetting the rest. New developments allow to see some old publications as uninteresting or wrong." Have you ever published in a peer-review journal? If not, the last portion of your post I will ignore; if so, perhaps you could expound on it a bit more.

1Vladimir_Nesov16y

The actual experience of publishing a paper hardly adds anything that can't be understood without doing so. Peer-review is not about "critics" responding to endorsement by well-known figures, it's quality control (with whatever failing it may carry), and not a point where written-up public criticisms originate. Science builds on what's published, not on what gets rejected by peer review, and what's published can be read by all.

1Tyrrell_McAllister16y

FWIW, in my experience the useful criticisms happen at conferences or in private conversation, not during the peer review process.

-1MichaelVassar16y

It is rarely the case that experience adds hardly anything. What are your priors and posteriors here? How did you update?

1wedrifid16y

This is the main reason Eliezer gives as to why he does not bother to create such a proposal. If this does apply to Eliezer as well would a suggestion that he should write such a paper serve any other purpose than one-upmanship?

Risto_Saarelma16y50

One argument against a hard takeoff is Greg Egan's idea about there not being a strong hierarchy of power in general intelligence, which he goes into here, for example. Would this be worth covering?

0Kaj_Sotala16y

Definitely so, at least briefly.

-1timtyler16y

I don't think that's worth very much. We already have systems which are much more powerful in many areas.

AndrewKemendo16y40

This is a problem for both those who'd want to critique the concept, and for those who are more open-minded and would want to learn more about it.

Anyone who is sufficiently technically minded undoubtedly finds frustration in reading books which give broad brush stroked counterfactuals to decision making and explanation without delving into the details of their processes. I am thinking of books like Freakonomics, Paradox of Choice, Outliers, Nudge, etc.

These books are very accessible but lack the in-depth analysis which are expected to be thoroughly cri...

4righteousreason16y

I found the two SIAI introductory pages very compelling the first time I read them. This was back before I knew what SIAI or the Singularity really was, as soon as I read through those I just had to find out more.

Zack_M_Davis16y30

Robin criticizes Eliezer for not having written up his arguments about the Singularity in a standard style and submitted them for publication. Others, too, make the same complaint: the arguments involved are covered over such a huge mountain of posts that it's impossible for most outsiders to seriously evaluate them.

Did everyone forget about "Artificial Intelligence as a Positive and Negative Factor in Global Risk"?

9Kaj_Sotala16y

That one's useful, but IMO, as an introduction document it needs a lot of work. For instance, the main thesis should be stated a lot earlier and clearer. As it is, it spends several pages making different points about the nature of AI before saying anything about why AI is worth discussing as a risk in the first place. That comes as late as page 17, and much of that is an analogue to the nuclear bomb. It spends a lot of time arguing against the proposition that a hard takeoff is impossible, but not much time arguing for the proposition that a hard takeoff is likely, which is a major failing if it's supposed to convince people. Mostly the paper suffers from the problem of being too spread out: it doesn't really give strong support to a narrow set of core claims, instead giving weak support to a wide set of core and non-core claims. I have a memory of reading the thing back when I didn't know so much about the Singularity, and thinking it was good, even though it took a long time to get to the point and had stuff that seemed unnecessary. It was only later on when I re-read the paper that I realized how those "unnecessary" parts connected with other pieces of FAI theory - but of course, by that time I wasn't really an outsider anymore.

[anonymous]16y20

As a Foom skeptic, what would convince me of taking the concept seriously, is an argument that intelligence/power is a quantity that we reason with in the same way as we reason about the number of neutrons in a nuclear reactor/bomb. Power seems like a slippery ephemeral concept, optimisation power appears to be able to evaporate at the drop of a hat (if someone comes to know an opponent's source code and can emulate them entirely).

3Johnicholas16y

I am also to some extent skeptical but I can recreate some of the arguments for the possibility of AI-caused explosive change that I've heard. 1. From the timescale of nonliving matter, life seemed explosive. 2. From the timescale of nonhuman life, humans seemed explosive. 3. If we look to Moore's law (or other similar metrics) of the degree of processing power (or of knowledge) "possessed" by human-crafted artifacts, there seems to be an explosive trend. We certainly take the concept of intelligence and intellectual power seriously when we deal with other humans (e.g. employers evaluating potential hires, or in diagnosing mental retardation). To some extent, we expect that a human with computational tools (abacus, pen and paper, calculator, computer) will have increased intellectual capabilities - so we do judge the power of "cyborg" human-computer amalgams.

1timtyler16y

We have explosive change today - if "explosive" is intended to mean the type of exponential growth process exhibited in nuclear bombs. Check with Moore's law. If you are looking for an explosion, there is no need for a crystal ball - simply look around you.

0Johnicholas16y

I agree with you - and I think the SIAI focuses too much on possible future computer programs, and neglects the (limited) superintelligences that already exist, various amalgams and cyborgs and group minds coordinated with sonic telepathy. In the future where the world continues (that is, without being paperclipped) and without a singleton, we need to think about how to deal with superintelligences. By "deal with" I'm including prevailing over superintelligences, without throwing up hands and saying "it's smarter than me".

6Eliezer Yudkowsky16y

throws up hands Not every challenge is winnable, you know.

0JamesAndrix16y

Impossible? Are you saying a human can't beat a group mind or are you and Johnicholas using different meanings of superintelligence? Also, what if we're in a FAI without a nonperson predicate?

3CronoDAS16y

You can start practicing by trying to beat your computer at chess. ;)

2Johnicholas16y

I'm pretty good at beating my computer at chess, even though I'm an awful player. I challenge it, and it runs out of time - apparently it can't tell that it's in a competition, or can't press the button on the clock. This might sound like a facetious answer, but I'm serious. One way to defeat something that is stronger than you in a limited domain is to strive to shift the domain to one where you are strong. Operating with objects designed for humans (like physical chess boards and chess clocks) is a domain that current computers are very weak at. There are other techniques too. Consider disease-fighting. The microbes that we fight are vastly more experienced (in number of generations evolved), and the number of different strategies that they try is vastly huge. How is it that we manage to (sometimes) defeat specific diseases? We strive to hamper the enemy's communication and learning capabilities with quarantine techniques, and steal or copy the nanotechnology (antibiotics) necessary to defeat it. These strategies might well be our best techniques against unFriendly manmade nanotechnological infections, if such broke out tomorrow. Bruce Schneier beats people over the head with the notion DON'T DEFEND AGAINST MOVIE PLOTS! The "AI takes over the world" plot is influencing a lot of people's thinking. Unfriendly AGI, despite its potential power, may well have huge blind spots; mind design space is big!

8wedrifid16y

I have not yet watched a movie where humans are casually obliterated by a superior force, be that a GAI or a technologically superior alien species. At least some of the humans always seem to have a fighting chance. The odds are overwhelming of course, but the enemy always has a blind spot that can be exploited. You list some of them here. They are just the kind of thing McKay deploys successfully against advanced nanotechnology. Different shows naturally give the AI different exploitable weaknesses. For the sake of the story such AIs are almost always completely blind to most of the obvious weaknesses of humanity. The whole 'overcome a superior enemy by playing to your strengths and exploiting their weakness' makes for great viewing but outside of the movies it is far less likely to play a part. The chance of creating an uFAI that is powerful enough to be a threat and launch some kind of attack and yet not be able to wipe out humans is negligible outside of fiction. Chimpanzees do not prevail over a civilisation with nuclear weapons. And no, the fact that they can beat us in unarmed close combat does not matter. They just die.

0Johnicholas16y

Yes, this is movie-plot-ish-thinking in the sense that I'm proposing that superintelligences can be both dangerous and defeatable/controllable/mitigatable. I'm as prone to falling into the standard human fallacies as the next person. However, the notion that "avoid strength, attack weakness" is primarily a movie-plot-ism seems dubious to me. Here is a more concrete prophecy, that I hope will help us communicate better: Humans will perform software experiments trying to harness badly-understood technologies (ecosystems of self-modifying software agents, say). There will be some (epsilon) danger of paperclipping in this process. Humans will take precautions (lots of people have ideas for precautions that we could take). It is rational for them to take precautions, AND the precautions do not completely eliminate the chance of paperclipping, AND it is rational for them to forge ahead with the experiments despite the danger. During these experiments, people will gradually learn how the badly-understood technologies work, and transform them into much safer (and often much more effective) technologies.

1wedrifid16y

That certainly would be dubious. Avoid strength, attack weakness is right behind 'be a whole heap stronger' as far as obvious universal strategies go. If there are ways to make it possible to experiment and make small mistakes and minimise the risk of catastrophe then I am all in favour of using them. Working out which experiments are good ones to do so that people can learn from them and which ones will make everything dead is a non-trivial task that I'm quite glad to leave to someone else. Given that I suspect both caution and courage to lead to an unfortunately high probability of extinction I don't envy them the responsibility. Possibly. You can't make that conclusion without knowing the epsilon in question and the alternatives to such experimentation. But there are times when it is rational to go ahead despite the danger.

0timtyler16y

The fate of most species is extinction. As the first intelligent agents, people can't seriously expect our species to last for very long. Now that we have unleashed user-modifiable genetic materials on the planet, DNA's days are surely numbered. Surely that's a good thing. Today's primitive and backwards biotechnology is a useless tangle of unmaintainable spaghetti code that leaves a trail of slime wherever it goes - who would want to preserve that?

0CronoDAS16y

You didn't see the Hitchhiker's Guide to the Galaxy film? ;)

0wedrifid16y

:) Well, maybe substitute 'some' for 'one' in the next sentence.

1timtyler16y

http://en.wikipedia.org/wiki/Invasion_of_the_Body_Snatchers_(1978_film) ...apparently has everyone getting it fairly quickly at the hands of aliens.

3loqi16y

You know this trick too? You wouldn't believe how many quadriplegics I've beaten at chess this way.

2Vladimir_Nesov16y

You may be right, but I don't think it's a very fruitful idea: what exactly do you propose doing? Also, building of a FAI is a distinct effort from e.g. healing malaria or fighting specific killer robots (with the latter being quite hypothetical, while at least the question of technically understanding FAI seems inevitable). This may be possible if an AGI has a combination of two features: it has significant real-world capabilities that make it dangerous, yet it's insane or incapable enough to not be able to do AGI design. I don't think it's very plausible, since (1) even Nature was able to build us, given enough resources, and it has no mind at all, so it shouldn't be fundamentally difficult to build an AGI (even for an irrational proto-AGI) and (2) we are at the lower threshold of being dangerous to ourselves, yet it seems we are at the brink of building an AGI already. Having an AGI dangerous (extinction risk dangerous), and dangerous exactly because of its intelligence, yet not AGI-building-capable doesn't seem to me unlikely. But maybe possible for some time. Now, consider the argument about humans being at the lowest possible cognitive capability to do much of anything, applied to proto-AGI-designed AGIs. AGI-designed AGIs are unlikely to be exactly as dangerous as the designer AGI, they are more likely to be significantly more or less dangerous, with "less dangerous" not being an interesting category, if both kinds of designs occur over time. This expected danger adds to the danger of the original AGI, however inapt they themselves may be. And at some point, you get to an FAI-theory-capable-AGI that builds something rational, not once failing all the way to the end of times.

1Johnicholas16y

I'd like to continue this conversation, but we're both going to have to be more verbose. Both of us are speaking in a very compressed, allusive (that is, allusion-heavy) style, and the potential for miscommunication is high. "I don't think it's a very fruitful idea: what exactly do you propose doing?" My notion is that SIAI in general, and EY in particular, typically work with a specific "default future" - a world where, due to Moore's law and the advance of technology generally, the difficulty of building a "general-purpose" intelligent computer program drops lower and lower, until one is accidentally or misguidedly created, and the world is destroyed in a span of weeks. I understand that the default future here is intended to be a conservative worst-case possibility, and not a most-probable scenario. However, this scenario ignores the number and power of entities (such as corporations, human-computer teams, and special-purpose computer programs) which are more intelligent in specific domains than humans. It ignores their danger - potential human flourishing can be harmed by other things than pure software - and it ignores their potential as tools against unFriendly superintelligence. Correcting that default future to something more realistic seems fruitful enough to me. "Technically understanding FAI seems inevitable." What? I don't understand this claim at all. Friendly artificial intelligence, as a theory, need not necessarily be developed before the world is destroyed or significantly harmed. "This may be possible" What is the referent of "this"? Techniques for combating, constraining, controlling, or manipulating unFriendly superintelligence? We already have these techniques. We harness all kinds of things which are not inherently Friendly and turn them to our purposes (rivers, nations, bacterial colonies). Techniques of building Friendly entities will grow directly out of our existing techniques of taming and harnessing the world, including but not limit

0Jordan16y

Good point. Even an Einstein level AI with 100 times the computing power of an average human brain probably wouldn't be able to beat Deep Blue at chess (at least not easily).

-3[anonymous]16y

Sending Summer Glau back in time, obviously. If you find an unfriendly AGI that can't figure out how to build nanotech or terraform the planet we're saved.

1timtyler16y

A superintelligence can reasonably be expected to proactively track down its "blind spots" and eradicate them - unless its "blind spots" are very carefully engineered.

1Johnicholas16y

As I understand your argument, you start with an artificial mind, a potential paperclipping danger, and then (for some reason? why does it do this? Remember, it doesn't have evolved motives) it goes through a blind-spot-eradication program. Afterward, all the blind spots remaining would be self-shadowing blind spots. This far, I agree with you. The question of how many remaining blind spots, or how big they are has something to do with the space of possible minds and the dynamics of self-modification. I don't think we know enough about this space/dynamics to conclude that remaining blind spots would have to be carefully engineered.

6wedrifid16y

You have granted a GAI paperclip maximiser. It wants to make paperclips. That's all the motive it needs. Areas of competitive weakness are things that may make it get destroyed by humans. If it is destroyed by humans, fewer paperclips will be made. It will eliminate its weaknesses with high priority. It will quite possibly eliminate all the plausible vulnerabilities and also the entire human species before it makes a single paperclip. That's just good paperclip maximising sense.

4Johnicholas16y

As I understand your thought process (and Steve Omohundro's), you start by saying "it wants to make paperclips", and then, in order to predict its actions, you recursively ask yourself "what would I do in order to make paperclips?". However, this recursion will inject a huge dose of human-mind-ish-ness. It is not at all clear to me that "has goals" or "has desires" is a common or natural feature of mind space. When we study powerful optimization processes - notably, evolution, but also annealing and very large human organizations - we generally can model some aspects of their behavior as goals or desires, but always with huge caveats. The overall impression that we get of these processes, considered as minds, is that they're insane. Insane is not the same as stupid, and it's not the same as safe.

3Douglas_Knight16y

No, goals are not universal, but it seems likely that the vN-M axioms have a pretty big basin of attraction in mind-space, that a lot of minds will become convinced that sanity is following them, causing them to pick up a utility function, which will probably not capture everything we value and could easily be as simple or as irrelevant to what we value as counting paperclips or smiles.

1Johnicholas16y

I think you're still injecting human-mind-ish-ness. Let me try to stretch your conception of "mind". The ocean "wants" to increase efficiency of heat transfer from the equator to the poles. It applies a process akin to simulated annealing with titanic processing power. Has it considered the von Neumann-Morgenstern axioms? Is it sane? Is it safe? Is it harnessable? A colony of microorganisms "wants" to survive and reproduce. In an environment with finite resources (like a wine barrel) is it likely to kill itself off? Is that sane? Are colonies of microorganisms safe? Are they harnessable? A computer program that grows out of control could be more like the ocean optimizing heat transfer, or a colony of microorganisms "trying" to survive and reproduce. The von Neumann-Morgenstern axioms are intensely connected to human notions of math, philosophy and happiness. I think predicting that they're attractors in mind-space is exactly as implausible as predicting that the Golden Rule is an attractor in mind-space.

3wedrifid16y

It could. But it wouldn't be an AGI. They could still become 'grey goo' though, which is a different existential threat and yes, it is one where your 'find their weakness' thing is right on the mark. Are we even talking about the same topic here?

1Johnicholas16y

The topic as I understand it is how the "default future" espoused by SIAI and EY focuses too much on things that look something like HAL or Prime Intellect (and their risks and benefits), and not enough on entities that display super-human capacities in only some arenas (and their risks and benefits). In particular, an entity that is powerful in some ways and weak in other ways could reduce existential risks without becoming an existential risk.

-1timtyler16y

That seems to be switching context. I was originally talking about a "superintelligence". The ocean and grey goo would clearly not qualify. FWIW, expected utility theory is a pretty general economic idea that nicely covers any goal-seeking agent.
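
For readers who have not met the idea, here is a minimal sketch of what "expected utility" means for a goal-seeking agent: each action is a lottery over outcomes, and the agent picks the action whose probability-weighted utility is highest. The outcome names, probabilities, and utility numbers are invented purely for illustration and come from nobody in this thread.

```python
from typing import Dict, List, Tuple

# A lottery is a list of (probability, outcome) pairs whose probabilities sum to 1.
Lottery = List[Tuple[float, str]]


def expected_utility(lottery: Lottery, utility: Dict[str, float]) -> float:
    """Probability-weighted average utility of a lottery."""
    return sum(p * utility[outcome] for p, outcome in lottery)


def choose(actions: Dict[str, Lottery], utility: Dict[str, float]) -> str:
    """Return the action with the highest expected utility."""
    return max(actions, key=lambda a: expected_utility(actions[a], utility))


# Hypothetical numbers for a toy paperclip maximiser (assumptions, not data).
utility = {"many_paperclips": 100.0, "few_paperclips": 10.0, "shut_down": 0.0}
actions = {
    "build_now": [(0.5, "many_paperclips"), (0.5, "shut_down")],
    "play_safe": [(1.0, "few_paperclips")],
}

print(choose(actions, utility))  # build_now (expected utility 50.0 vs 10.0)
```

Whether a given system is well described by such a utility table at all is exactly the point under dispute above; the sketch only shows what the formalism says once a utility function is assumed.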

-1timtyler16y

That sounds like the SIAI party line :-( Machine intelligence will likely have an extended genesis at the hands of humanity - and during its symbiosis with us, there will be a lot of time for us to imprint our values on it. Indeed, some would say this process has already started. Governments are likely to become superintelligent agents in the future - and they already have detailed and elaborate codifications of the things that many humans value negatively - in the form of their legal systems.

0timtyler16y

Evolution apparently has an associated optimisation target. See my: http://originoflife.net/direction/ http://originoflife.net/gods_utility_function/ Others have written on this as well - e.g. Robert Wright, Richard Dawkins, John Stewart. Evolution is rather short-sighted - and only has the lookahead capabilities that organisms have (though these appear to be improving with time). So: whether the target can be described as being a "goal" is debatable. However, we weren't talking about evolution, we were talking about superintelligences. Those are likely to be highly goal-directed.

-5Johnicholas16y

-8Clippy16y

0timtyler16y

This is because of the natural drives that we can reasonably expect many intelligent agents to exhibit - see: http://selfawaresystems.com/2007/11/30/paper-on-the-basic-ai-drives/ http://selfawaresystems.com/2009/02/18/agi-08-talk-the-basic-ai-drives/

0RobinZ16y

A helpful tip: early computer chess programs were very bad at "doing nothing, and doing it well". I believe they are bad at Go for similar reasons.

-4timtyler16y

Companies and governments are probably the most likely organisations to play host to superintelligent machines in the future. They are around today - and I think many of the efforts on machine morality would be better spent on making those organisations more human-friendly. "Don't be evil" is just not good enough! I think reputation systems would help a lot. See my "Universal karma" video: http://www.youtube.com/watch?v=tfArOZVKCCw

1DanArmak16y

Because all except (so it is claimed) one company and all governments bar none do not even pretend to embrace this attitude. OK, so my government gets low karma. So what? How does that stop them from doing whatever they want to for years to come? If you suggest it would cause members of parliament to vote no confidence and cause early elections - something which only applies in a many-parties democracy - then I suggest that in such a situation no government could remain stable for long. There'd be a new cause celebre every other week, and a government only has to lose one vote to fall. And if the public, through karma, could force government to act in certain ways without going through elections, then we'd have direct democracy with absolute-majority-rule. A system that's even worse than what we have today.

0timtyler16y

Of course, governments and companies have reputations today. There are few enough countries that people can keep track of their reputations reasonably easily when it comes to trade, travel and changing citizenship. It is probably companies where reputations are needed the most. You can search - and there are resources like: http://www.dmoz.org/Society/Issues/Business/Allegedly_Unethical_Firms/ ...but society needs more.

0[anonymous]16y

I'm really looking for a justification of the nuclear reactor metaphor for intelligence amplifying intelligence, on the software level. AI might explode sure, but exponential intelligence amplification on the software level pretty much guarantees it on the first AI rather than us having to wait around and possibly merge before the explosion.

0Johnicholas16y

As I understand it, the nuclear reactor metaphor is simply another way of saying "explosive" or "trending at least exponentially". Note that most (admittedly fictional) descriptions of intelligence explosion include "bootstrapping" improved hardware (e.g. Greg Egan's Crystal Nights)

-4timtyler16y

Intelligence is building on itself today. That's why we see the progress we do. See: http://en.wikipedia.org/wiki/Intelligence_augmentation If you want to see a hardware explosion, look to Moore's law. For a software explosion, the number of lines of code being written is reputedly doubling even faster than every 18 months.

1[anonymous]16y

I want to see an explosion in the efficacy of software, not simply the amount that is written.

1timtyler16y

Software is gradually getting better. If you want to see how fast machine intelligence software is progressing, one reasonably-well measured area is chess/go ratings.

1DanArmak16y

How can progress in such a narrow problem be representative of the efficacy of software either in some general sense or versus other narrow problems? Also: what is the improvement over time of machine chess playing ability due to software changes once you subtract hardware improvements? I remember seeing vague claims that chess performance over the decades stayed fairly true to Moore's Law, i.e. scaled with hardware. As a lower bound this is entirely unsurprising, since naive chess implementations (walk game tree to depth X) scale easily with both core speed and number of cores.
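
To make "walk game tree to depth X" concrete, here is a minimal sketch of fixed-depth minimax. It uses a toy take-away game (remove 1 to 3 stones, whoever takes the last stone wins) instead of chess, and it has no evaluation heuristic at the cutoff; both are my simplifications, chosen only to keep the example short. The node counter shows how the work grows with depth.

```python
from typing import List, Tuple


def moves(stones: int) -> List[int]:
    """Legal moves in the toy game: remove 1, 2, or 3 stones."""
    return [k for k in (1, 2, 3) if k <= stones]


def minimax(stones: int, depth: int, maximizing: bool, counter: List[int]) -> int:
    """Walk the game tree to a fixed depth; return the value for the maximizer.

    +1 means the maximizer wins, -1 means it loses, and 0 means the search
    hit the depth cutoff before the game ended.
    """
    counter[0] += 1  # count visited nodes
    if stones == 0:
        # The player who just moved took the last stone and won.
        return -1 if maximizing else 1
    if depth == 0:
        return 0  # cutoff: outcome unknown at this depth
    values = [
        minimax(stones - k, depth - 1, not maximizing, counter)
        for k in moves(stones)
    ]
    return max(values) if maximizing else min(values)


def best_move(stones: int, depth: int) -> Tuple[int, int, int]:
    """Return (move, value, nodes_visited) for the player to move."""
    counter = [0]
    scored = [
        (minimax(stones - k, depth - 1, False, counter), k)
        for k in moves(stones)
    ]
    value, move = max(scored)
    return move, value, counter[0]


move, value, nodes = best_move(5, 5)
print(move, value)  # 1 1 (take one stone, leaving the opponent a lost position)
```

Each root move is searched independently of the others, so the subtrees can be handed to separate cores, and a faster core simply reaches a greater depth in the same time. That is the sense in which such a naive search scales with hardware without any improvement in the software.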

-1Thomas16y

Computation is not a "pure process"; it has a physical side, and some software might use this as a matter transformer. A FOOM might arise from that. I asked Eliezer Yudkowsky about this in the questions for him, but almost nobody noticed it or voted it up.

0wedrifid16y

Hang on, I voted this up because it was a good point but on second glance it isn't a point that is at all relevant to what whpearson is asking.

[-]soreff16y10

Any thoughts on what the impact of the IBM Watson DeepQA project (http://www.research.ibm.com/deepqa/) would be on a Foom timescale, if it is successful (in the sense of approximate parity with human competitors)? My impression was that classical AI failed primarily because of brittle closed-world approximations, and this project looks like it (if successful) would largely overcome those obstacles. For instance, it seems like one could integrate a DeepQA engine with planning and optimization engines in a fairly straightforward way. To put it another wa... (read more)

[-]alyssavance16y10

Hmmm... Maybe you could base some of it off of Eliezer's Future Salon talk (http://singinst.org/upload/futuresalon.pdf)? That's only about 11K words (sans references), while his book chapters are ~40K words and his OB/LW posts are hundreds of thousands of words.

[-]timtyler16y10

Relevant article by E.Y.:

http://lesswrong.com/lw/we/recursive_selfimprovement/

[-][anonymous]16y00

Any thoughts on what the impact of the IBM Watson DeepQA project would be on a Foom timescale, if it is successful (in the sense of approximate parity with human competitors)? My impression was that classical AI failed primarily because of brittle closed-world approximations, and this project looks like it (if successful) would largely overcome those obstacles. For instance, it seems like one could integrate a DeepQA engine with planning and optimization engines in a fairly straightforward way. To put it another way, in the form of an idea futures propo... (read more)

[-]JoshuaFox16y00

Kaj, great idea

in a standard style and submitted them for publication

This will be one of the greater challenges; we know the argument and how to write well, but each academic discipline has rigid rules for style in publications. This particular journal, with its wide scope, may be a bit more tolerant, but in general learning the style is important if one wants to influence academia.

I imagine that one will have to go beyond the crowdsourcing approach in achieving this.

If you are coordinating, let me know if and how I can help.

1Kaj_Sotala16y

I was mostly thinking that people would jointly write a draft article, which the more experienced writers would then edit to conform with a more academic style, if necessary.

[-]StefanPernar16y00

More recent criticism comes from Mike Treder - managing director of the Institute for Ethics and Emerging Technologies in his article "Fearing the Wrong Monsters" => http://ieet.org/index.php/IEET/more/treder20091031/

[-]timtyler16y00

Aubrey argues for "the singularity" here:

"The singularity and the Methuselarity: similarities and differences" - by Aubrey de Grey

http://www.sens.org/files/sens/FHTI07-deGrey.pdf

He uses the argument from personal incredulity though - one of the weakest forms of argument known.

He says:

"But wait – who’s to say that progress will remain “only” exponential? Might not progress exceed this rate, following an inverse polynomial curve (like gravity) or even an inverse exponential curve? I, for one, don’t see why it shouldn’t. If we conside... (read more)

1Tyrrell_McAllister16y

I haven't read the whole essay, but the portion that you quoted isn't an argument from incredulity. An argument from incredulity has the form "Since I can't think of an argument for assigning P low probability, I should assign P high probability.". Aubrey's argument has the form "Since I can't think of an argument for assigning P low probability, I shouldn't assign P low probability.".

1timtyler16y

He's expressing incredulity - and then arguing from that. He goes on to assume that this "singularity" thing happens for much of the rest of the paper.

[-]StefanPernar16y-10

Very constructive proposal Kaj. But...

Since it appears (do correct me if I'm wrong!) that Eliezer doesn't currently consider it worth the time and effort to do this, why not enlist the LW community in summarizing his arguments the best we can and submit them somewhere once we're done?

If Eliezer does not find it a worthwhile investment of his time - why should we?

7Eliezer Yudkowsky16y

In general: Because my time can be used to do other things which your time cannot be used to do; we are not fungible. (As of this comment being typed, I'm working on a rationality book. This is not something that anyone else can do for me.)

-1StefanPernar16y

This statement is based on three assumptions: 1) What you are doing instead is in fact more worthy of your attention than your contribution here; 2) I could not do what you are doing at least as well as you; 3) I do not have other things to do that are at least as worthy of my time. None of those three I am personally willing to grant at this point. But surely that is not the case for all the others around here.

2kurige16y

1) You can summarize arguments voiced by EY. 2) You cannot write a book that will be published under EY's name. 3) Writing a book takes a great deal of time and effort. You're reading into connotation a bit too much.

0StefanPernar16y

It's called ghost writing :-) but then again the true value add lies in the work and not in the identity of the author (discarding marketing value in the case of celebrities). I do not think so - am just being German :-) about it: very precise and thorough.

0Kaj_Sotala16y

I think his arguments are worthwhile and important enough to be heard by an audience that is as wide as possible, regardless of whether or not he feels like writing up the arguments in an easily digestible form.

[-]righteousreason16y-30

==Re comments on "Singularity Paper"== Re comments, I had been given to understand that the point of the page was to summarize and cite Eliezer's arguments for the audience of ''Minds and Machines''. Do you think this was just a bad idea from the start? (That's a serious question; it might very well be.) Or do you think the endeavor is a good one, but the writing on the page is just lame? --User:Zack M. Davis 20:19, 21 November 2009 (UTC)

(this is about my opinion on the writing in the wiki page)

No, just use his writing as much as possible- direct... (read more)
