Followup to: Anthropomorphic Optimism, The Hidden Complexity of Wishes

Yesterday, I reprised in further detail The Tragedy of Group Selectionism, in which early biologists believed that predators would voluntarily restrain their breeding to avoid exhausting the prey population; the given excuse was "group selection".  Not only does it turn out to be nearly impossible for group selection to overcome a countervailing individual advantage; but when these nigh-impossible conditions were created in the laboratory - group selection for low-population groups - the actual result was not restraint in breeding, but, of course, cannibalism, especially of immature females.

I've made even sillier mistakes, by the way - though about AI, not evolutionary biology.  And the thing that strikes me, looking over these cases of anthropomorphism, is the extent to which you are screwed as soon as you let anthropomorphism suggest ideas to examine.

In large hypothesis spaces, the vast majority of the cognitive labor goes into noticing the true hypothesis.  By the time you have enough evidence to consider the correct theory as one of just a few plausible alternatives - to represent the correct theory in your mind - you're practically done.  Of this I have spoken several times before.

And by the same token, my experience suggests that as soon as you let anthropomorphism promote a hypothesis to your attention, so that you start wondering if that particular hypothesis might be true, you've already committed most of the mistake.

The group selectionists did not deliberately extend credit to the belief that evolution would do the aesthetic thing, the nice thing.  The group selectionists were doomed when they let their aesthetic sense make a suggestion - when they let it promote a hypothesis to the level of deliberate consideration.

It's not like I knew the original group selectionists.  But I made analogous mistakes as a teenager, and have since watched others make the mistake many times over.  So I do have some experience whereof I speak, when I speak of instant doom.

Unfortunately, the prophylactic against this mistake is not a recognized technique of Traditional Rationality.

In Traditional Rationality, you can get your ideas from anywhere.  Then you weigh up the evidence for and against them, searching for arguments on both sides.  If the question hasn't been definitely settled by experiment, you should try to do an experiment to test your opinion, and dutifully accept the result.

"Sorry, you're not allowed to suggest ideas using that method" is not something you hear, under Traditional Rationality.

But it is a fact of life, an experimental result of cognitive psychology, that when people have an idea from any source, they tend to search for support rather than contradiction - even in the absence of emotional commitment (see link).

It is a fact of life that priming and contamination occur: just being briefly exposed to completely uninformative, known false, or totally irrelevant "information" can exert significant influence on subjects' estimates and decisions.  This happens on a level below deliberate awareness, and that's going to be pretty hard to beat on problems where anthropomorphism is bound to rush in and make suggestions - but at least you can avoid deliberately making it worse.

It is a fact of life that we change our minds less often than we think.  Once an idea gets into our heads, it is harder to get it out than we think.  Only an extremely restrictive chain of reasoning, that definitely prohibited most possibilities from consideration, would be sufficient to undo this damage - to root an idea out of your head once it lodges.  The less you know for sure, the easier it is to become contaminated - weak domain knowledge increases contamination effects.

It is a fact of life that we are far more likely to stop searching for further alternatives at a point when we have a conclusion we like, than when we have a conclusion we dislike.

It is a fact of life that we hold ideas we would like to believe, to a lower standard of proof than ideas we would like to disbelieve.  In the former case we ask "Am I allowed to believe it?" and in the latter case ask "Am I forced to believe it?"  If your domain knowledge is weak, you will not know enough for your own knowledge to grab you by the throat and tell you "You're wrong!  That can't possibly be true!"  You will find that you are allowed to believe it.  You will search for plausible-sounding scenarios where your belief is true.  If the search space of possibilities is large, you will almost certainly find some "winners" - your domain knowledge being too weak to definitely prohibit those scenarios.

It is a fact of history that the group selectionists failed to relinquish their folly.  They found what they thought was a perfectly plausible way that evolution (evolution!) could end up producing foxes who voluntarily avoided reproductive opportunities(!).  And the group selectionists did in fact cling to that hypothesis.  That's what happens in real life!  Be warned!

To beat anthropomorphism you have to be scared of letting anthropomorphism make suggestions.  You have to try to avoid being contaminated by anthropomorphism (to the best extent you can).

As soon as you let anthropomorphism generate the idea and ask, "Could it be true?" then your brain has already swapped out of forward-extrapolation mode and into backward-rationalization mode.  Traditional Rationality contains inadequate warnings against this, IMO.  See in particular the post where I argue against the Traditional interpretation of Devil's Advocacy.

Yes, there are occasions when you want to perform abductive inference, such as when you have evidence that something is true and you are asking how it could be true.  We call that "Bayesian updating", in fact.  An occasion where you don't have any evidence but your brain has made a cute little anthropomorphic suggestion, is not a time to start wondering how it could be true.  Especially if the search space of possibilities is large, and your domain knowledge is too weak to prohibit plausible-sounding scenarios.  Then your prediction ends up being determined by anthropomorphism.  If the real process is not controlled by a brain similar to yours, this is not a good thing for your predictive accuracy.

This is a war I wage primarily on the battleground of Unfriendly AI, but it seems to me that many of the conclusions apply to optimism in general.

How did the idea first come to you, that the subprime meltdown wouldn't decrease the value of your investment in Danish deuterium derivatives?  Were you just thinking neutrally about the course of financial events, trying to extrapolate some of the many different ways that one financial billiard ball could ricochet off another?  Even this method tends to be subject to optimism; if we know which way we want each step to go, we tend to visualize it going that way.  But better that, than starting with a pure hope - an outcome generated because it ranked high in your preference ordering - and then permitting your mind to invent plausible-sounding reasons it might happen.  This is just rushing to failure.

And to spell out the application to Unfriendly AI:  You've got various people insisting that an arbitrary mind, including an expected paperclip maximizer, would do various nice things or obey various comforting conditions:  "Keep humans around, because diversity is important to creativity, and the humans will provide a different point of view."  Now you might want to seriously ask if, even granting that premise, you'd be kept in a nice house with air conditioning; or kept in a tiny cell with life support tubes and regular electric shocks if you didn't generate enough interesting ideas that day (and of course you wouldn't be allowed to die); or uploaded to a very small computer somewhere, and restarted every couple of years.  No, let me guess, you'll be more productive if you're happy.  So it's clear why you want that to be the argument; but unlike you, the paperclip maximizer is not frantically searching for a reason not to torture you.

Sorry, the whole scenario is still around as unlikely as your carefully picking up ants on the sidewalk, rather than stepping on them, and keeping them in a happy ant colony for the sole express purpose of suggesting blog comments.  There are reasons in my goal system to keep sentient beings alive, even if they aren't "useful" at the moment.  But from the perspective of a Bayesian superintelligence whose only terminal value is paperclips, it is not an optimal use of matter and energy toward the instrumental value of producing diverse and creative ideas for making paperclips, to keep around six billion highly similar human brains.  Unlike you, the paperclip maximizer doesn't start out knowing it wants that to be the conclusion.

Your brain starts out knowing that it wants humanity to live, and so it starts trying to come up with arguments for why that is a perfectly reasonable thing for a paperclip maximizer to do.  But the paperclip maximizer itself would not start from the conclusion that it wanted humanity to live, and reason backward.  It would just try to make paperclips.  It wouldn't stop, the way your own mind tends to stop, if it did find one argument for keeping humans alive; instead it would go on searching for an even superior alternative, some way to use the same resources to greater effect.  Maybe you just want to keep 20 humans and randomly perturb their brain states a lot.

If you can't blind your eyes to human goals and just think about the paperclips, you can't understand what the goal of making paperclips implies.  It's like expecting kind and merciful results from natural selection, which lets old elephants starve to death when they run out of teeth.

If you want a nice result that takes 10 bits to specify, then a priori you should expect a 1/1024 probability of finding that some unrelated process generates that nice result.  And a genuinely nice outcome in a large outcome space takes a lot more information than the English word "nice", because what we consider a good outcome has many components of value.  It's extremely suspicious if you start out with a nice result in mind, search for a plausible reason that a not-inherently-nice process would generate it, and, by golly, find an amazing clever argument.
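The arithmetic behind the 1/1024 figure can be sketched in a few lines. This is only an illustration, under the simplifying assumption that the unrelated process is equally likely to land on any of the 2^n outcomes that n bits can distinguish; the function name `prior_of_hitting` is made up for this example.

```python
from fractions import Fraction


def prior_of_hitting(bits: int) -> Fraction:
    """Chance that a process indifferent to the target lands on the one
    outcome singled out by `bits` bits of specification.

    Assumption (for illustration only): all 2**bits outcomes are equally
    likely under the unrelated process.
    """
    if bits < 0:
        raise ValueError("bits must be non-negative")
    return Fraction(1, 2 ** bits)


if __name__ == "__main__":
    # Each extra bit of required specification halves the prior.
    for bits in (10, 20, 40):
        print(bits, prior_of_hitting(bits))
```

Ten bits gives 1/1024; every additional requirement on the outcome adds bits, and each added bit halves the prior again.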

And the more complexity you add to your requirements - humans not only have to survive, but have to survive under what we would consider good living conditions, etc. - the less you should expect, a priori, a non-nice process to generate it.  The less you should expect to, amazingly, find a genuine valid reason why the non-nice process happens to do what you want.  And the more suspicious you should be, if you find a clever-sounding argument why this should be the case.  To expect this to happen with non-trivial probability is pulling information from nowhere; a blind arrow is hitting the center of a small target.  Are you sure it's wise to even search for such possibilities?  Your chance of deceiving yourself is far greater than the a priori chance of a good outcome, especially if your domain knowledge is too weak to definitely rule out possibilities.

No more than you can guess a lottery ticket, should you expect a process not shaped by human niceness, to produce nice results in a large outcome space.  You may not know the domain very well, but you can understand that, a priori, "nice" results require specific complexity to happen for no reason, and complex specific miracles are rare.

I wish I could tell people:  "Stop!  Stop right there!  You defeated yourself the moment you knew what you wanted!  You need to throw away your thoughts and start over with a neutral forward extrapolation, not seeking any particular outcome."  But the inferential distance is too great; and then begins the slog of, "I don't see why that couldn't happen" and "I don't think you've proven my idea is wrong."

It's Unfriendly superintelligence that tends to worry me most, of course.  But I do think the point generalizes to quite a lot of optimism.  You may know what you want, but Nature doesn't care.

In my darker moments, I think that every human political tendency is just an instance of this very problem. Terry Pratchett (our most underrated explicator and critic of both Traditional Rationality and Bourgeois Morality) described it most pithily as "Wouldn't It Be Nice, If Everyone Was Nice". It's most obvious on the left, but can be seen on the right and libertarian tendencies as well.

Caledonian,

There was selection at the level of the group, according to the standard definitions. If you're going to make confrontational non-substantive comments on every post, please at least try to know what you're talking about.

"Sorry, you're not allowed to suggest ideas using that method" is not something you hear, under Traditional Rationality.

But it is a fact of life, ....

It is a fact of life that ....

I disagree. You list a whole collection of mistakes people make after they have a bad hypothesis that they're attached to. I say, the mistake should not be to use your prior experience when you come up with hypotheses. The mistakes are first to get too attached to one hypothesis, followed by the list of "facts of life" mistakes you then described.

People will ... (read more)

To get this whole line of reasoning off the ground, you need a decent way to rank phenomena in terms of how similar they are to us. Given this ranking, the warning is to beware of treating low ranked items like high ranked items. On AI, you need to give an argument why AI is a low ranked item, i.e., why AI is especially unlike us.

Now that you mention it, I'm actually not 100% sure that a paperclip maximizer wouldn't give humans some fraction of computing resources as some sort of very cheap game theory move.

Oh, and my ants say they're offended.

I guess "game theory move" doesn't make much sense; it should have read "given the possibility that it's being simulated".

Hardly any phenomena are like us, though. You can't hold a conversation with gentrification, or teach nitrogen narcosis to play piano.

It strikes me that if you want to rank phenomena as to how like us they are, you have a bunch of humans with gigantic numbers, and then chimps and chatbots rolling around at about .1, and then a bunch of numbers small enough you want scientific notation to express them.

@steven:

Do you devote a significant amount of your time and resources to making paperclips, given the possibility that you're being simulated? If not, why would a paperclip-maximizer devote time and resources to human life?

steven: your "not 100% sure" is a perfect example of the problem eliezer is trying to explain. "not 100% sure that X is false" is not a valid excuse to waste thought on X if the prior improbability of X is as incredibly tiny as it is for thoughts like "paperclip maximizers will find their own paperclip-related reasons not to murder everyone".

This is something that's bothered me a lot about the free market. Many people, often including myself, believe that a bunch of companies which are profit-maximizers (plus some simple laws against use of force) will cause "nice" results. These people believe the effect is so strong that no possible policy directly aimed at niceness will succeed as well as the profit-maximization strategy does. There seems to be a lot of evidence for this. But it also seems too easy, as if you could take ten paper-clip maximizers competing to convert things into di... (read more)

buybuydandavis (12y, karma 2):
First, policies don't aim, actors with intent do. A journalistic peeve of mine. Newspaper writers generally spend the first 10 paragraphs of a story about legislation psychoanalyzing the intent of pieces of paper, and rarely will tell you what the pieces of paper actually say. Second, I don't consider this a serious pro free market position. It's not that no "possible" government enforced policy would do better, it's that the political process is generally unlikely to yield a better policy.
Manfred (12y, karma 0):
Unfortunately, many people who hold this position don't know that it's not serious.
buybuydandavis (12y, karma 0):
I don't think that's an accurate characterization of Austrian economists.
Manfred (12y, karma 0):
Well, I think it's a quite accurate depiction of anyone who uses phrases like "a priori science." (That is, to the extent that Austrian economics is based on a priori reasoning, various claims about types of government intervention really are claims that no possible such government intervention could ever be good for people)
buybuydandavis (12y, karma 0):
A priori claims can be probabilistic claims.
Manfred (12y, karma 0):
Are you aware of a broad tradition of such probabilities that I'm completely unaware of?
[anonymous] (12y, karma 0):
It's really not at all mysterious if you understand the math. Much like how evolution can miraculously create complex life by maximizing "fitness" (i.e. offspring). Also, when you study the math, you will see the many assumptions that make the result go through. Much like evolution, it doesn't always turn out. Markets are stupid. I just googled to find a decent example of the math and this (pdf) is what I came up with. Looks pretty good, but there are many versions of this material available online.
[anonymous] (12y, karma -2):
I just realized I responded to a very old comment, which I was led to by its being the parent of a comment made today. Sigh. Well, hopefully someone finds the link above useful.

This posting, like many of Eliezer's, offers good advice for someone who is advancing the frontiers of human knowledge and breaking new ground. But most of us aren't in that situation. Do these considerations offer useful insights for the average person living his life? Or are they just abstract philosophy without practical import for most people?

It seems to me like the simplest way to solve friendliness is: "Ok AI, I'm friendly so do what I tell you to do and confirm with me before taking any action." It is much simpler to program a goal system that responds to direct commands than to somehow try to infuse 'friendliness' into the AI. Granted, marketing wise a 'friendliness' infused AI sounds better because it makes those who seek to build such AI seem altruistic. Anyone saying they intend to implement the former seems selfish and power hungry.

Lapsed_Lurker (12y, karma 1):
Wasn't that trick tried with Windows Vista, and people were so annoyed by continually being asked trivial "can I do this?" questions that they turned off the security?

"Do these considerations offer useful insights for the average person living [ver] life?"

I would say yes, the overcoming bias project is useful for laypeople--it's changed my life, at least. I don't read these posts as being about becoming a great scientist. I realize the irony of quoting "TMoLFAQ" in the comments to this post, but "[t]here's no such thing as science": rationality is about forming beliefs that are actually true, and plans that might actually work. Advancing the frontiers is a special case.

But it also seems too easy, as if you could take ten paper-clip maximizers competing to convert things into differently colored paperclips, and ended out with utopia.
That's essentially what happened with bacteria - they really don't have goals beyond converting the universe into bacteria. But through an endless process of exhausting resources, competition, and cooperation, they worked out different ways of dealing with each other, until they eventually became multicellular - and so the process began again. We've reached the point where superorganisms c... (read more)

No, let me guess, you'll be more productive if you're happy
Deja vu.

Caledonian, HA has been discussing other superorganisms as possibly being conscious which Eliezer says evolution does not apply to (both could be right). What's your opinion on such matters?

A paperclip maximizer might keep humans around for a while (because, as of right now, we're the only beings around that make paperclips) but yeah, if it had enough power (magic nanotechnology, etc.), we'd most likely be gone.

It seems to me like the simplest way to solve friendliness is: "Ok AI, I'm friendly so do what I tell you to do and confirm with me before taking any action." It is much simpler to program a goal system that responds to direct commands than to somehow try to infuse 'friendliness' into the AI.

This has been addressed before. Basically, you'll get what you asked for, but it probably won't be what you really want.

Do you devote a significant amount of your time and resources to making paperclips, given the possibility that you're being simulated?

Keeping everyone alive would not take a significant amount of a paperclip maximizer's time and resources. (Though for utilitarians this probably means it doesn't count.) But the key difference is this: human-like goal systems seem like they will gain access to lots more simulation resources than paperclip maximizers or some other specific human-indifferent goal system (the set of all human-indifferent goal systems together is a different matter, but they're not a coherent bloc).

Caledonian, Those conditions weren't created in the laboratory, because the individual strategy dominated over the group; ergo, the conditions necessary for that to happen were not met.

I am not sure how to interpret this. When we remove whole groups from the gene pool because of some group characteristic (i.e. averaged over the population of that group), it sounds natural to me to call that group selection. Do you have some different meaningful definition of group selection? What does it mean in general when you say that the individual strategy dominates?

It is much simpler to program a goal system that responds to direct commands than to somehow try to infuse 'friendliness' into the AI.

If the AI receives commands frequently, the AI would be weak - and probably not very competitive. It would be like a child running to its mummy all the time. To make decisions fast, that sort of thing is not on the cards.

If the AI receives commands infrequently, that's more-or-less what is under discussion.

However, AIs can be expected to naturally defend their goals. It may be best not to provide a convenient interface fo... (read more)

I have a question for you Eliezer. When you were figuring out how powerful AIs made from silicon were likely to be, did you have a goal that you wanted? Do you want AI to be powerful so it can stop death?

Do these considerations offer useful insights for the average person living his life? Or are they just abstract philosophy without practical import for most people?

Good comment. I would really like to hear an answer to this.

David Althaus (13y, karma -1):
To me Eliezer's writings were extremely helpful.

What's your opinion on such matters?
There's no logical necessity making HA's superorganisms unconscious in a human sense - and in some rudimentary sense they are conscious of their environment - but I don't happen to think that human social organizations have the right kind or right amount of complexity to be 'conscious'. They're more like slime mold fruitings or kelp at this point.

[misrepresentation deleted] Evolution will continue. But as the substrate of human-memetic organisms is so flexible, this actually limits their ability to pass on data in a... (read more)

Eliezer: You've got various people insisting that an arbitrary mind, including an expected paperclip maximizer, would do various nice things or obey various comforting conditions: "Keep humans around, because diversity is important to creativity, and the humans will provide a different point of view." Now you might want to seriously ask if, even granting that premise, you'd be kept in a nice house with air conditioning; or kept in a tiny cell with life support tubes and regular electric shocks if you didn't generate enough interesting ideas th... (read more)

Saying that grey goo will spread, and then never change or create new forms, is as mistaken as saying that single-celled organisms should never have given rise to multi-cellular organisms because competition between individuals is so stringent.

One of the things that Eliezer doesn't grasp is that optimization is not something evolution has generally produced because optimization is often maladaptive. What would appear to be an ideal strategy in the short term fails in the long term because overall environmental conditions have a tendency to change. Biolog... (read more)

Yvain: It must have something to do with capitalism including a term for the human utility function in the form of demand

Why, yes, I do think that has something to do with why the market builds houses with air conditioning instead of tiny little cells.

Hal: Do these considerations offer useful insights for the average person living his life? Or are they just abstract philosophy without practical import for most people?

Well, this particular abstract philosophy could end up having a pretty large practical import for all people, if they end up reprocess... (read more)

[Deleted. Caledonian, whenever you say something poorly reasoned or that misrepresents others' arguments, I am going to treat it as deliberate trolling and excise it. I've seen you do better, just keep it consistent.]

I said: "The reason I got interested in UIVs to start with is that I didn't have a good way to decide what counted as a good outcome."

So I have realized that perhaps the best prophylactic against anthropic optimism bias is, in fact, to be genuinely unsure of what outcome you think is best. If you don't have a predetermined idea of what to argue in favor of, then you don't have a preferred outcome to argue in favor of. Admittedly this is not something that one can always do, but in the specific case of trying to work out what kind of goals to prog... (read more)

TGGP, please see ni.codem.us for a brief response to your question that Eliezer will not permit.

To return to the implications of things Eliezer has said recently, please consider this

"Stop! Stop right there! You defeated yourself the moment you knew what you wanted! You need to throw away your thoughts and start over with a neutral forward extrapolation, not seeking any particular outcome."
in the context of Eliezer's search for 'Friendly AI'.

Eliezer, sharing causal parentage with us sounds like a plausible heuristic for ranking things in terms of similarity to us, but in many important senses an AI could share a great deal of causal parentage with us. So you still need a more detailed argument to rank AI low.

this particular abstract philosophy could end up having a pretty large practical import for all people

Eliezer:

Personally, I am not disputing the importance of friendliness. My question is, what do you think I should do about it?

If I were an AI expert, I would not be reading this blog since there is clearly very little technical content here.

My time would be simply too valuable to waste reading or writing popular futurism.

I certainly wouldn't post everyday, just to recapitulate the same material with minor variations (basically just killing time).

Of cour... (read more)

I second spindizzy, yet hope that something major is happening.

Eliezer, sharing causal parentage with us sounds like a plausible heuristic for ranking things in terms of similarity to us, but in many important senses an AI could share a great deal of causal parentage with us. So you still need a more detailed argument to rank AI low.

Which AI? A Friendly AI shares goal-systemic shape with us due to a direct causal link: humans successfully shaping the FAI. The law against deciding what you want, applies when you don't have control over the outcome - for intermediate cases, where you have partial control, you have t... (read more)

Perhaps we can use this defense of theory instinct as a simplified map of what we want the AI to do.

So create a paperclip maximizer that is (handwave) somehow restricted from doing anything that its creator would try to convince people it would never do.

This is assuming that how we steer ourselves away from horrid ideas is simpler than how we decide we like something.

@Eliezer Yudkowsky said: Spindizzy and sophiesdad, I've spent quite a while ramming headlong into the problem of preventing the end of the world. Doing things the obvious way has a great deal to be said for it; but it's been slow going, and some help would be nice. Research help, in particular, seems to me to probably require someone to read all this stuff at the age of 15 and then study on their own for 7 years after that, so I figured I'd better get started on the writing now.

I have posted this before without answer, but I'll try again. You are working a... (read more)

James: See The Hidden Complexity of Wishes. You can't think of everything you'd need to ban.

sophiesdad: I don't see convincing anyone who matters in the government of the potential of AGI, let alone the need for Friendliness, as very likely. Tim Kyger (DOD employee) says "I don't know a soul in DoD or any of the services off the top of my head that has any inkling of the very existence of trans-H or of the various technical/scientific lanes of approach that are leading to a trans/post-human future of some sort. Zip. Zero. Nada." If you could get ... (read more)

Eliezer,

You should either: a) ban Caledonian; b) let him write whatever he wants.

Censoring his posts is kind of nasty, because it looks like he can only express opinions you think worth posting. Personally, I think you should choose (a), because his comments are boring, disruptive and useless, but if you don't wanna do it, then go for (b).

And as for this: "Research help, in particular, seems to me to probably require someone to read all this stuff at the age of 15 and then study on their own for 7 years after that, so I figured I'd better get started ... (read more)

It seems to me like the simplest way to solve friendliness is: "Ok AI, I'm friendly so do what I tell you to do and confirm with me before taking any action." It is much simpler to program a goal system that responds to direct commands than to somehow try to infuse 'friendliness' into the AI.

As was pointed out, this might not have the consequences one wants. However, even if that wasn't true, I'd still be leery of this option - this'd effectively be giving one human unlimited power. History has shown that people who are given unlimited power (or something close to it) tend to easily misuse it, even if they started out with good intentions.

I wanted to ban him, but other commenters requested that he be allowed to stay. So I haven't banned him, but I'm not going to let his trollings take over the comment threads either. Caledonian can write passable comments when he puts his mind to it; and if that's all that's allowed through, he has no motive to write anything else.

Whoever is censoring Caledonian: can it be done without adding the content-free nastiness (such as "bizarre objection", "illogic", and "gibberish")?

Pyramid Head:

There're plenty of smart guys out there, and if they have access to the proper literature, I'm sure you can find worthy contributors, instead of waiting 7 more years.

If you know of a good way to find such people beyond what SIAI is already doing, do go ahead and tell us.

This idea may be contaminated by optimism, but to avoid the risk of destroying humanity with AI, would it not be sufficient to make the AI more or less impotent? If it were essentially a brain in a jar type of thing that showcased everything humanity could create in terms of intelligence without the disastrous options of writing its own code or having access to a factory for creating death-bots? I suppose this is also anthropomorphizing the AI because if it were really that super-intelligent it could come up with a way to do its optimization beyond the constraints we think we are imposing. Surely building a toothless though possibly "un-Friendly" AI is a more attainable goal than building an unrestricted Friendly AI?

Constant: Done.

Boris: See That Alien Message.

As for the Manhattan Project, who do you think they're going to pick to lead it? Some young unknown with a mad brilliant idea? Or, say, Roger Schank? (I've got nothing against him personally, but he's pretty old-style.) Japan tried something like this with their Fifth Generation project. Didn't help them any.

When the basic theory is done, then I'll know if implementation requires a Manhattan Project or not. I don't think it will. AI done right is not about brute force.

Eliezer, I agree SF fiction writers find it far easier to just write AIs, and also true aliens, as just odd humans, as it is far more work to write a plausible intelligent non-human. I don't mean to be saying much more than the obvious point that an AI that was a "mind child" of human civilization would in that obvious sense share causal parentage with us. A whole brain emulation would have started out as a particular human brain, a hard coded AI would have started out as the sort of code a human would write, and an AI that evolved under pressu... (read more)

'It seems to me like the simplest way to solve friendliness is: "Ok AI, I'm friendly so do what I tell you to do and confirm with me before taking any action." It is much simpler to program a goal system that responds to direct commands than to somehow try to infuse 'friendliness' into the AI.'

As was pointed out, this might not have the consequences one wants. However, even if that wasn't true, I'd still be leery of this option - this'd effectively be giving one human unlimited power.

Would you expect all the AIs to work together under one person'... (read more)

Will Pearson: When you were figuring out how powerful AIs made from silicon were likely to be, did you have a goal that you wanted? Do you want AI to be powerful so it can stop death?

Eliezer: ..."Yes" on both counts....

I think you sidestepped the point as it related to your post. Are you rationally taking into account the biasing effect your heartfelt hopes exert on the set of hypotheses raised to your conscious attention as you conspire to save the world?

Are you rationally taking into account the biasing effect your heartfelt hopes exert on the set of hypotheses raised to your conscious attention as you conspire to save the world?

Surely you're not giving Eliezer enough credit, RI - he's put a lot of thought into this subject.

There's really no way to eliminate one's own biases without recourse to objective and impersonal tests, of course. You can't identify your own mistakes with armchair theorizing. But you have to give him credit for effort.

Are you rationally taking into account the biasing effect your heartfelt hopes exert on the set of hypotheses raised to your conscious attention as you conspire to save the world?

Recovering, in instances like these, reversed stupidity is not intelligence; you cannot say, "I wish fast takeoff to be possible, therefore it is not".

Indeed. But you can, for example, say "I wish fast takeoff to be possible, so I should be less impressed, all else equal, by the number of hypotheses I can think of that happen to support it".

Do you wish fast ... (read more)

RI, the point is not in consciously reevaluating the hypotheses to oppose the bias (=reverse stupidity), but in avoiding the wrong conclusion reaching the level of awareness, in repairing the outlook before it misguides intuition. Intuition is relatively blind, but it is the engine of human intelligence. Feed it with misinterpreted information, and it will turn out gibberish. You can't convert gibberish into sensible output, and you can't make the engine produce correct output by hacking it with a hammer. You need to feed it quality fuel.

@Kaj Sotala: I can't - I'm not smart enough :)

But seriously, do you really think that we ought to wait a decade before a brilliant researcher shows up? And it seems all the more suspicious because this brilliant researcher has to read Eliezer's material at a tender age, or else he won't be good enough.

Now don't get me wrong, I love Eliezer's posts here, and I've learned A LOT of stuff. And I also happen to think that he's onto something when he talks about Friendly AI (and AI in general). But I don't see how he can hope to save the world by writing blog posts...

@Pyramid Head: I don't see how he can hope to save the world by writing blog posts...

Ditto. Autodidactism may be a superior approach for the education of certain individuals, but it allows the individual to avoid one element crucial to production: discipline. Mr. Yudkowsky's approach, and his resistance to working with others, along with his views that it is his job to save the world, and no one else can do it, suggest an element of savantism. Hardly a quality one would want in a superhuman intelligence.

I, too, enjoy his writing, but the fact that he discove... (read more)

I disagree with the last 2 comments.

Eliezer's priority has gradually shifted over the last 5 years or so from increasing his own knowledge to transmitting what he knows to others, which is exactly the behavior I would expect from someone with his stated goals who knows what he is doing.

Yes, he has suggested or implied many times that he expects to implement the intelligence explosion more or less by himself (and I do not like that) but ever since the Summer of AI his actions (particularly all the effort he has put into blogging and his references to 15-to-... (read more)

Suppose we break the problem down into multiple parts.

1. Understand how the problem works, what currently happens.
2. Find a better way that things can work, that would not generate the same problems.
3. Find a way to get from here to there.
4. Do it.

Then part 1 might easily be aided by a guy on a blog. Maybe part 2. Possibly part 3.

A blog is better than a newsgroup because the threads don't scroll off, they're all sitting on the site's computer if anybody cares. Also, as old posts are replaced by new posts people stop responding to old posts. So there isn... (read more)

It would seem that the big development in our lifetimes has been the advent of the digital computer, the Turing Machine. Assuming all humans come into the world with no basic knowledge other than hard-wired reflexes, we must all gain our knowledge from those who have preceded us, along with our own reflections about that knowledge and reflections on our environmental observations. The entire Library of Congress is available digitally. Using the concepts of trend analysis and Bayesian Probabilities (and others I don't know about), couldn't a properly ... (read more)

Well, is there really no one else in the world right now to work on this problem along with Eliezer (who, in my opinion, doesn't lack discipline)? I can't help but think that it's rather arrogant...

Well, that's one of the reasons I'm not a SIAI donor, though. Can't donate money to someone who writes blogs instead of researching Friendly AI theory. And I'm not nearly smart enough to make any progress on my own, or even help someone else. So I guess mankind is screwed :)

Retired urologist: "properly programed"

This is the hard part.

@sophiesdad: Autodidactism may be a superior approach for the education of certain individuals, but it allows the individual to avoid one element crucial to production: discipline.

@Pyramid Head: Eliezer (who, in my opinion, doesn't lack discipline)

My comment about discipline was not meant to be inflammatory, nor even especially critical. Rather, it was meant to be descriptive of one aspect of autodidactism. In comparison, suppose that Mr. Yudkowsky was working toward his PhD at (say) University of Great Computer Scientists. His chosen topic for his dissertat... (read more)

Sophiesdad, you should be aware that I'm not likely to take your advice, or even take it seriously. You may as well stop wasting the effort.

@Eliezer: Sophiesdad, you should be aware that I'm not likely to take your advice, or even take it seriously. You may as well stop wasting the effort.

Noted. No more posts from me.

An unusually moderate and temperate exchange.

Eliezer- Have you written anything fictional or otherwise about how you envision an ideal post-fAI or post-singularity world? Care to share?

Oh... I should have read these comments to the end, somehow missed what you said to sophiesdad.

Eliezer... I am very disappointed. This is quite sad.

Well, heck. At least he's being honest. Maybe a little blunt, but definitely honest.

Ok- Eliezer- you are just a human and therefore prone to anger and reaction to said anger, but you, in particular, have a professional responsibility not to come across as excluding people who disagree with you from the discussion and presenting yourself as the final destination of the proverbial buck. We are all in this together. I have only met you in person once, have only had a handful of conversations about you with people who actually know you, and have only been reading this blog for a few months, and yet I get a distinct impression that you have ... (read more)

IMO, the last two posts, and especially this one, are some of the best on Less Wrong. It's a pity this post isn't included in any of the sequences.