Eliezer_Yudkowsky comments on Two questions about CEV that worry me - Less Wrong

29 Post author: cousin_it 23 December 2010 03:58PM




Comment author: Eliezer_Yudkowsky 23 December 2010 09:32:17PM -1 points [-]

2) How can anyone sincerely want to build an AI that fulfills anything except their own current, personal volition?

I am honestly not sure what to say to people who ask this question with genuine incredulity, besides (1) "Don't be evil" and (2) "If you think clever arguments exist that would just compel me to be evil, see rule 1."

Comment author: cousin_it 23 December 2010 11:22:30PM *  22 points [-]

I don't understand your answer. Let's try again. If "something like CEV" is what you want to implement, then an AI pointed at your volition will derive and implement CEV, so you don't need to specify it in detail beforehand. If CEV isn't what you want to implement, then why are you implementing it? Assume all your altruistic considerations, etc., are already folded into the definition of "you want" - just like a whole lot of other stuff-to-be-inferred is folded into the definition of CEV.

ETA: your "don't be evil" looks like a confusion of levels to me. If you don't want to be evil, there's already a term for that in your volition - no need to add any extra precautions.

Comment author: wedrifid 24 December 2010 02:47:47AM *  11 points [-]

If CEV isn't what you want to implement, then why are you implementing it?

The sane answer is that it solves a cooperation problem, i.e. people will not kill you for trying it and may instead donate money. As we can see here, this is not the position that Eliezer seems to take. He goes for the 'signal naive morality via incomprehension' approach.

Comment author: XiXiDu 24 December 2010 01:05:10PM 4 points [-]

People will not kill you for trying it and may instead donate money.

I do not think this would work. Take the viewpoint of a government. What does CEV do? It deprives them of some amount of ultimate power. The only chance I see of implementing CEV via an AI going FOOM is to do it either secretly or because nobody takes you seriously enough. Both routes are rather unlikely. Military analysis of LW seems to be happening right now. And if no huge unforeseeable step towards AGI happens, progress will be gradual enough for governments (or other groups), who already investigate LW and the SIAI, to notice and take measures to disable anyone trying to implement CEV.

The problem is that once CEV becomes feasible, governments will regard anyone working on it as attempting a coup. Regardless of the fact that the people involved might not perceive it as politics, working on CEV is indeed a highly political activity. At least this will be the viewpoint of many who do not understand CEV, or who oppose it for different reasons.

Comment author: timtyler 28 December 2010 04:26:36PM *  -2 points [-]

That is more-or-less my own analysis. Notoriously:

Politics is the gentle art of getting votes from the poor and campaign funds from the rich by promising to protect each from the other.

CEV may get some of the votes from the poor - but it offers precious little to the rich. Since those are the folk who are running the whole show, it is hard to see how they will approve it. They won't approve it - there isn't anything in it for them. So, I figure, the plan is probably pretty screwed - the hopeful plan of a bunch of criminal (their machine has no respect for the law!) and terrorist (if they can make it stick!) outlaws - who dream of overthrowing their own government.

Comment author: wedrifid 24 December 2010 01:31:37PM *  0 points [-]

I do not think this would work.

Pardon me. To be more technically precise: "Implementing an AI that extrapolates the volition of something other or broader than yourself may facilitate cooperation. It would reduce the chance that people will kill you for the attempt and increase the chance of receiving support."

Comment author: XiXiDu 24 December 2010 02:55:55PM *  5 points [-]

Aha, I see. My mistake, ignoring the larger context.

Seen this? Anyway, I feel that it is really hard to tackle this topic because of its vagueness. As multifoliaterose implied here, the task of recognizing humans as distinguished beings already seems to me too broad a problem to tackle directly at the moment. Talking about implementing CEV indirectly, by derivation from Yudkowsky's mind, versus specifying the details beforehand, seems to be fun to argue about but ultimately ineffective at this point. In other words, an organisation that claims to solve some meta-problem by means of CEV is only slightly different from one proclaiming to make use of magic. I'd be much more comfortable donating to a decision theory workshop, for example.

I digress, but I thought I should clarify some of my intentions for always getting into discussions involving the SIAI. It is highly interesting, sociologically I suppose. On the one hand people take this topic very seriously, the most important topic indeed, yet they seem to be very relaxed about the only organisation involved in shaping the universe. There is simply no talk about more transparency to prove the effectiveness of the SIAI and its objectives. Further, without transparency you simply cannot conclude that, because someone writes a lot of ethically correct articles and papers, that output is reflective of their true goals. Also, people don't seem to be very worried about all the vagueness involved here, as this post proves once again. Where is the progress that would justify further donations? As I said, I digress. Excuse me, but this topic is the most fascinating issue for me on LW.

Back to your comment; it makes sense. Surely if you tell people that you will also take care of what they want, they'll be less opposed than if you told them that you'll just do what you want because you want to make them happy. Yet there will be those who don't want you to do it, regardless of your wanting to make them happy. There will be those who only want you to implement their personal volition. So once CEV is taken seriously, it will become really hard to implement, because people will get mad about it, really mad. People already oppose small-impact policies just because it's the other party that is trying to implement them. What will they do if one person or organisation tries to implement a policy for the whole universe and the rest of infinity?

Comment author: paulfchristiano 28 December 2010 11:30:55PM 4 points [-]

There is simply no demand for more transparency to prove the effectiveness of the SIAI and its objectives.

Are you sure? I imagine there are many people interested in evaluating the effectiveness of the SIAI. At least I am, and from the small number of real discussions I have had about the SIAI's project, I extrapolate that uncertainty is the main inhibitor of enthusiasm (although of course, if the uncertainty were removed, this might create more fundamental problems).

Comment author: TheOtherDave 28 December 2010 11:48:31PM 4 points [-]

The counterargument I've read in earlier ("unreal") discussions on the subject is, roughly, that people who claim their support for SIAI is contingent on additional facts, analyses, or whatever are simply wrong... that whatever additional data is provided along those lines won't actually convince them; it will merely cause them to ask for different data.

Comment author: Vaniver 29 December 2010 09:51:52AM 2 points [-]

This strikes me as a difficult thing to know, and the motives that lead to assuming it are not particularly pleasant.

Comment author: TheOtherDave 29 December 2010 01:09:18PM 0 points [-]

While the unpleasant readings are certainly readily available, more neutral readings are available as well.

By way of analogy: it's a common relationship trope that suitors who insist on proof of my love and fidelity won't be satisfied with any proofs I can provide. OTOH, it's also a common trope that suitors who insist that I should trust in their love and fidelity without evidence don't have them to offer in the first place.

If people who ask me a certain type of question aren't satisfied with the answer I have, I can either look for different answers or for different people; which strategy I pick depends on the specifics of the situation. If I want to infer something about someone else based on their choice of strategy I similarly have to look into the specifics of the situation. IME there is no royal road to the right answer here.

Comment author: wedrifid 29 December 2010 11:33:01AM 0 points [-]

This strikes me as a difficult thing to know,

It strikes me as a tendency that can either be observed as a trend or noted to be absent.

and the motives that lead to assuming it are not particularly pleasant.

This strikes me as a difficult thing to know. And distastefully ironic.

Comment author: Nick_Tarleton 29 December 2010 05:40:13PM 1 point [-]

I assume you're referring to Is That Your True Rejection?.

Comment author: TheOtherDave 29 December 2010 08:16:53PM 0 points [-]

(nods) I think so, yes.

Comment author: XiXiDu 29 December 2010 09:22:58AM *  2 points [-]

Please read this comment. It further explains why I actually believe that transparency is important to prove the effectiveness of the SIAI. I also edited my comment above. I seem to have messed up on correcting some grammatical mistakes. It originally said, there is simply no talk about more transparency....

Comment author: timtyler 28 December 2010 10:36:43PM 0 points [-]

On the one hand people take this topic very seriously, the most important topic indeed, yet seem to be very relaxed about the only organisation involved in shaping the universe.

"The only organisation involved in shaping the universe"?!? WTF? These folks have precious little in terms of resources. They apparently haven't even started coding yet. You yourself assign them a minuscule chance of succeeding at their project. How could they possibly be "the only organisation involved in shaping the universe"?!?

Comment author: paulfchristiano 29 December 2010 12:49:25AM 4 points [-]

They apparently haven't even started coding yet.

Really? Even if they were working on a merely difficult problem, you would expect coding to be the very last step of the project. People don't solve hard algorithmic problems by writing some code and seeing what happens. I wouldn't expect an organization working optimally on AGI to write any code until after making some remarkable progress on the problem.

How could they possibly be "the only organisation involved in shaping the universe"?!?

There could easily be no organization at all trying to deliberately control the long-term future of the human race; we'd just get whatever we happened to stumble into. You are certainly correct that there are many, many organizations which are involved in shaping our future; they just rarely think about the really long-term effects (I think this is what XiXiDu meant).

Comment author: timtyler 29 December 2010 01:05:42AM *  0 points [-]

Really? Even if they were working on a merely difficult problem, you would expect coding to be the very last step of the project. People don't solve hard algorithmic problems by writing some code and seeing what happens. I wouldn't expect an organization working optimally on AGI to write any code until after making some remarkable progress on the problem.

IMO, there's a pretty good chance of an existing organisation being involved with getting there first. The main problem with not having any working products is that it is challenging to accumulate resources - which are needed to hire the researchers and programmers you need to fuel your self-improvement cycle.

Google, hedge funds, and security agencies have their self-improvement cycle already rolling - they are evidently getting better and better as time passes. That results in accumulated resources, which can be used to drive further development.

If you are a search company aiming directly at a human-level search agent, you are now up against a gorilla with an android army who already has most of the pieces of the puzzle. Waiting until you have done all the relevant R&D is just not how software development works. You get up and running as fast as you can - or else someone else does it first and eats your lunch.

Comment author: timtyler 29 December 2010 10:41:33AM *  0 points [-]

So once CEV is taken seriously, it will become really hard to implement, because people will get mad about it, really mad. People already oppose small-impact policies just because it's the other party that is trying to implement them. What will they do if one person or organisation tries to implement a policy for the whole universe and the rest of infinity?

Right - but this seems as though it isn't how things are likely to go down. CEV is a pie-in-the-sky wishlist - not an engineering proposal. Those attempting to directly implement things like it seem practically guaranteed to get to the plate last. For example, Ben's related proposal involved "non-invasive" scanning of the human brain. That just isn't technology we will get before we have sophisticated machine intelligence, I figure. So: either the proposals will be adjusted en route so they are more practical - or else the proponents will just fail.

Most likely there will be an extended stage where people tell the machines what to do - much as Asimov suggested. The machines will "extrapolate" in much the same way that Google Instant "extrapolates" - and the human wishes will "cohere" - to the extent that large-scale measures in society encourage cooperation.

Comment author: timtyler 28 December 2010 10:45:23PM *  0 points [-]

There is simply no demand for more transparency to prove the effectiveness of the SIAI and its objectives.

FWIW, I mostly gave up on them a while back. As a spectator, I mostly look on, grimacing, while wondering whether there are any salvage opportunities.

Comment author: XiXiDu 29 December 2010 09:19:22AM *  0 points [-]

Here is the original comment. It wasn't my intention to say that; it originally said there is simply no talk about more transparency.... I must have messed up while correcting some mistakes.

Comment author: timtyler 29 December 2010 10:10:08AM 0 points [-]

I just copied and pasted verbatim. However, the current edit does seem to make more sense.

Comment author: cousin_it 24 December 2010 09:13:13AM *  2 points [-]

Awesome comment, thanks. I'm going to think wishfully and take that as SIAI's answer.

Comment author: timtyler 28 December 2010 11:38:00PM *  -1 points [-]

The sane answer is that it solves a cooperation problem.

Reciprocal altruism sometimes sends a relatively weak signal - it says that you will cooperate so long as the "shadow of the future" is not too ominous.

Invoking "good" and "evil" signals more that you believe in moral absolutes: the forces of good and evil.

On the one hand, that is a stronger signalling technique - it attempts to signal that you won't defect - no matter what!

On the other hand, it makes you look a bit as though you are crazy, don't understand rationality or game theory - and this can make your behaviour harder to model.

As with most signalling, it should be costly to be credible. Alas, practically anyone can rattle on about good and evil. I am not convinced it is very effective overall.

Comment author: Tyrrell_McAllister 24 December 2010 04:53:17AM *  -1 points [-]

ETA: your "don't be evil" looks like a confusion of levels to me. If you don't want to be evil, there's already a term for that in your volition - no need to add any extra precautions.

Eliezer didn't realize that you meant his own personal CEV, rather than his current incoherent, unextrapolated volition.

Comment author: TheOtherDave 23 December 2010 10:32:59PM *  6 points [-]

One thing that might help is if you were clearer about when you consider it evil to ignore the volition of an intelligence, since it's clear from your writing that sometimes you don't.

For example, "don't be evil" clearly isn't enough of an argument to convince you to build an AI that fulfills Babykiller or Pebblesorter or SHFP volition, for example, should we encounter any... although at least some of those would indisputably be intelligences.

Given that, it might reassure people to explicitly clarify why "don't be evil" is enough of an argument to convince you to build an AI that fulfills the volition of all humans, rather than (let's say) the most easily-jointly-satisfied 98% of humanity, or some other threshold for inclusion.

If this has already been explained somewhere, a pointer would be handy. I have not read the whole site, but thus far everything I've seen to this effect seems to boil down to assuming that there exists a single volition V such that each individual human would prefer V upon reflection to every other possible option, or at least a volition that approximates that state well enough that we can ignore the dis-satisfied minority.

If that assumption is true, the answer to the question you quote is "Because they'd prefer the results of doing so," and evil doesn't enter into it.

If that assumption is false, I'm not sure how "don't be evil" helps.

Comment author: DanArmak 23 December 2010 09:47:50PM 13 points [-]

You have a personal definition for evil, like everyone else. Many people have definitions of good that include things you see as evil; some of your goals are in conflict. Taking that into account, how can you precommit to implementing the CEV of the whole of humanity when you don't even know for sure what that CEV will evaluate to?

To put this another way: why not extrapolate from you, and maybe from a small group of diverse individuals whom you trust, to get the group's CEV? Why take the CEV of all humanity? Inasmuch as these two CEVs differ, why would you not prefer your own CEV, since it more closely reflects your personal definitions of good and evil?

I don't see how this can be consistent unless you start out with "implementing humanity's CEV" as a toplevel goal, and any divergence from that is slightly evil.