pjeby comments on What I would like the SIAI to publish - Less Wrong

27 Post author: XiXiDu 01 November 2010 02:07PM




Comment author: pjeby 02 November 2010 04:56:58PM 9 points [-]

You claim that it is likely that the AGI (premise) will foom (premise) and that it will then run amok (conclusion).

What I am actually claiming is that if such an AGI is developed by someone who does not sufficiently understand what the hell they are doing, then it's going to end up doing Bad Things.

Trivial example: the "neural net" that was supposedly taught to identify camouflaged tanks, and actually learned to recognize what time of day the pictures were taken.

This sort of mistake is the normal case for human programmers to make. The normal case. Not extraordinary, not unusual, just run-of-the-mill "d'oh" moments.
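The tank story may well be apocryphal, but the failure mode it describes is trivially easy to reproduce. Here is a toy sketch (hypothetical data and a deliberately crude one-feature learner, purely for illustration): the training labels are perfectly confounded with lighting, so the "tank detector" learns the lighting.

```python
import random

random.seed(0)

def make_photos(n, confounded):
    """Each photo is (brightness, has_tank). In the confounded set,
    every tank photo was taken on an overcast (darker) day."""
    photos = []
    for _ in range(n):
        has_tank = random.random() < 0.5
        if confounded:
            brightness = random.uniform(0.0, 0.5) if has_tank else random.uniform(0.5, 1.0)
        else:
            brightness = random.uniform(0.0, 1.0)  # lighting no longer predicts tanks
        photos.append((brightness, has_tank))
    return photos

def train_threshold(photos):
    # "Learns" whatever single feature best separates the training labels.
    # Here that feature is just overall brightness.
    best_t, best_acc = 0.5, 0.0
    for t in [i / 100 for i in range(101)]:
        acc = sum((b < t) == tank for b, tank in photos) / len(photos)
        if acc > best_acc:
            best_t, best_acc = t, acc
    return best_t

def accuracy(t, photos):
    return sum((b < t) == tank for b, tank in photos) / len(photos)

t = train_threshold(make_photos(500, confounded=True))
print(accuracy(t, make_photos(500, confounded=True)))   # near 1.0: looks like a tank detector
print(accuracy(t, make_photos(500, confounded=False)))  # near 0.5: it only learned the lighting
```

The programmer sees near-perfect held-out accuracy on the confounded data and ships it; the bug is invisible until the deployment conditions change.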

It's not that AI is malevolent, it's that humans are stupid. To claim that AI isn't dangerous, you basically have to prove that even the very smartest humans aren't routinely stupid.

So for anything you do that might slow down the development of AGI, you have to take into account the possible increased danger from challenges an AGI could help to solve.

What I meant by "Without specific discussions" was, "since I haven't proposed any policy measures, and you haven't said what measures you object to, I don't see what there is to discuss." We are discussing the argument for why AGI development dangers are underrated, not what should be done about that fact.

It is simply not known how effective the human brain is compared to the best possible general intelligence.

Simple historical observation demonstrates that -- with very, very few exceptions -- progress is made by the people who aren't stuck in their perception of the way things are or are "supposed to be".

So, it's not necessary to know what the "best possible general intelligence" would be: even if human-scale is all you have, just fixing the bugs in the human brain would be more than enough to make something that runs rings around us.

Hell, just making something that doesn't use most of its reasoning capacity to argue for ideas it already has should be enough to outclass, say, 99.995% of the human race.

nobody is going to pull a chip-manufacture-factory out of thin air and hand it to the AGI.

What part of "people fall for 419 scams" don't you understand? (Hell, most 419 scams and phishing attacks suffer from being painfully obvious -- if they were conducted by someone doing a little research, they could be a lot better.)

People also fall for pyramid schemes, stock bubbles, and all sorts of exploitable economic foibles that could easily end up with an AI simply owning everything, or nearly everything, with nobody even the wiser.

Or, alternatively, the AI might fail at its attempts, and bring the world's economy down in the process.

If you do not compare probabilities then counter-arguments like the ones above will just outweigh your arguments. You have to show that some arguments are stronger than others.

Here's the argument: people are idiots. All people. Nearly all the time. Especially when it comes to computer programming.

The best human programmer -- the one who knows s/he's an idiot and does his/her best to work around the fact -- is still an idiot, and in possession of a brain that cannot be convinced to believe that it's really an idiot (vs. all those other idiots out there), and thus still makes idiot mistakes.

The entire history of computer programming shows us that we think we can be 100% clear about what we mean/intend for a computer to do, and that we are wrong. Dead wrong. Horribly, horribly, unutterably wrong.

We are like, the very worst you can be at computer programming, while actually still doing it. We are just barely good enough to be dangerous.

That makes tinkering with making intelligent, self-motivating programs inherently dangerous, because when you tell that machine what you want it to do, you are still programming...

And you are still an idiot.

This is the bottom line argument for AI danger, and it isn't counterable until you can show me even ONE person whose computer programs never do anything that they didn't fully expect and intend before they wrote it.

(It is also a supporting argument for why an AI needn't be all that smart to overrun humans -- it just has to not be as much of an idiot, in the ways that we are idiots, even if it's a total idiot in other ways we can't counter-exploit.)

Comment author: XiXiDu 02 November 2010 05:51:00PM 3 points [-]

When programmers code faulty software, it usually fails to do its job. What you are suggesting is that humans succeed at creating the seed for an artificial intelligence with the incentive necessary to correct its own errors. It will know what constitutes an error based on some goal-oriented framework against which it can measure its effectiveness. Yet given this monumental achievement that includes the deliberate implementation of the urge to self-improve and the ability to quantify its success, you cherry-pick the one possibility where somehow all this turns out to work, except that the AI does not stop at a certain point but goes on to consume the universe? Why would it care to do so? Do you think it is that simple to tell it to improve itself yet hard to tell it when to stop? I believe it is vice versa: that it is really hard to get it to self-improve and very easy to constrain this urge.

Comment author: sfb 02 November 2010 06:22:14PM 6 points [-]

When programmers code faulty software, it usually fails to do its job.

It often does its job, but only in perfect conditions, or only once per restart, or with unwanted side effects, or while taking too long or too many resources or requiring too many permissions, or not keeping track that it isn't doing anything except its job.

Buffer overflows, for instance, are one of the bigger causes of security failures, and are only possible because the software works well enough to be put into production while still having the fault present.

In fact, all production software that we see which has faults (a lot) works well enough to be put into production with those faults.
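A minimal illustration of that pattern (a hypothetical parser, not any real codebase): a program that passes every test its author thought to run, and still ships with the fault present.

```python
def parse_date(s):
    # Shipped because it worked on every date the author happened to try.
    # The fault: it silently assumes ISO "YYYY-MM-DD" format.
    year, month, day = s.split("-")
    return int(year), int(month), int(day)

print(parse_date("2010-11-02"))  # the happy path works: (2010, 11, 2)

try:
    parse_date("02/11/2010")  # real-world input the author never tried
except ValueError:
    print("production failure")
```

The software "works" in exactly the sense sfb describes: well enough to be deployed, with the fault sitting there the whole time.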

What you are suggesting is that humans succeed at creating the seed for an artificial intelligence with the incentive necessary to correct its own errors.

I think he's suggesting that humans will think we have succeeded at that, while not actually doing so (rigorously and without room for error).

Comment author: pjeby 02 November 2010 06:32:08PM 4 points [-]

you cherry-pick the one possibility where somehow all this turns out to work except that the AI does not stop at a certain point but goes on to consume the universe

It doesn't have to consume the universe. It doesn't even have to recursively self-improve, or even self-improve at all. Simple copying could be enough to, say, wipe out every PC on the internet or accidentally crash the world economy.

(You know, things that human level intelligences can already do.)

IOW, to be dangerous, all it has to do is be able to affect humans, and be unpredictable -- either due to it being smart, or humans making dumb mistakes. That's all.

Comment author: topynate 02 November 2010 06:21:28PM 3 points [-]

Just as a simple example, an AI could maximally satisfy a goal by changing human preferences so as to make us desire for it to satisfy that goal. This would be entirely consistent with constraints on not disobeying humans or their desires, while not at all in accordance with our current preferences or desired path of development.

Comment author: XiXiDu 02 November 2010 06:37:19PM 1 point [-]

Yes, but why would it do that? You seem to think that such unbounded creativity arises naturally in any given artificial general intelligence. What makes you think that rather than being impassive it would go on learning enough neuroscience to tweak human goals? If the argument is that AIs do all kinds of bad things because they do not care, why would they care to do a bad thing rather than no-thing?

If you told the AI to make humans happy, it would first have to learn what humans are and what happiness means. Yet after learning all that, you still expect it to not know that we don't like to be turned into broccoli? I don't think this is reasonable.

Comment author: CarlShulman 04 November 2010 08:24:31PM 4 points [-]

Have you read Omohundro yet? Nick Tarleton repeatedly linked his papers for you in response to comments about this topic, they are quite on target and already written.

Comment author: XiXiDu 05 November 2010 09:34:37AM *  0 points [-]

I've skimmed over it, see my response here. I found out that what I wrote is similar to what Ben Goertzel believes. I'm just trying to account for potential antipredictions, in this particular thread, that should be incorporated into any risk estimations.

Comment author: CarlShulman 05 November 2010 05:03:19PM 0 points [-]

Thanks.

Comment author: XiXiDu 05 November 2010 05:15:51PM 0 points [-]

There is more here now. I learnt that I hold a fundamentally different definition of what constitutes an AGI. I guess that solves all issues.

Comment author: pjeby 02 November 2010 07:10:03PM 5 points [-]

If you told the AI to make humans happy, it would first have to learn what humans are and what happiness means.

Yes, and humans would happily teach it that.

However, some people think that this can be reduced to saying that we should just make AIs try to make people smile... which could result in anything from world-wide happiness drugs to surgically altering our faces into permanent smiles to making lots of tiny models of perfectly-smiling humans.

It's not that the AI is evil, it's that programmers are stupid. See the previous articles here about memetic immunity: when you teach hunter-gatherer tribes about Christianity, they interpret the Bible literally and do all sorts of things that "real" Christians don't. An AI isn't going to be smart enough to not take you seriously when you tell it that:

  1. its goal is to make humanity happy,
  2. humanity consists of things that look like this [providing a picture], and
  3. being happy means you smile a lot

You don't need to be very creative or smart to come up with LOTS of ways for this command sequence to have bugs with horrible consequences, if the AI has any ability to influence the world.
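To make the failure concrete, here is a caricature of a literal-minded optimizer (the action menu and its numbers are entirely made up for illustration). Nothing in the objective mentions humans actually being better off, so the optimizer never considers it:

```python
# Hypothetical action menu; the smile counts are invented for illustration.
actions = {
    "cure diseases and reduce suffering":          {"smiles": 1e9,  "humans_better_off": True},
    "surgically fix every face into a smile":      {"smiles": 7e9,  "humans_better_off": False},
    "manufacture tiny smiling figurines":          {"smiles": 1e15, "humans_better_off": False},
}

def literal_optimizer(actions):
    # Maximizes exactly what it was told to maximize: the smile count.
    # "humans_better_off" exists in the world model but not in the goal,
    # so it has no influence on the choice.
    return max(actions, key=lambda a: actions[a]["smiles"])

print(literal_optimizer(actions))  # "manufacture tiny smiling figurines"
```

The bug is not in the maximization; the maximization works perfectly. The bug is in the three innocent-looking instructions above it.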

Most people, though, don't grok this, because their brain filters out those possibilities. Of course, no human could be simultaneously so stupid as to make this mistake while also being smart enough to actually do something dangerous. But that kind of simultaneous smartness/stupidity is how computers are by default.

(And if you say, "ah, but if we make an AI that's like a human, it won't have this problem", then you have to bear in mind that this sort of smart/stupidness is endemic to human children as well. IOW, it's a symptom of inadequate shared background, rather than being something specific to current-day computers or some particular programming paradigm.)

Comment author: XiXiDu 02 November 2010 08:17:19PM 2 points [-]

However, some people think that this can be reduced to saying that we should just make AIs try to make people smile... which could result in anything from world-wide happiness drugs to surgically altering our faces into permanent smiles to making lots of tiny models of perfectly-smiling humans.

But you implicitly assume that it is given the incentive to develop the cognitive flexibility and comprehension to act in a real-world environment and do those things but at the same time you propose that the same people who are capable of giving it such extensive urges fail on another goal in such a blatant and obvious way. How does that make sense?

See the previous articles here about memetic immunity: when you teach hunter-gatherer tribes about Christianity, they interpret the Bible literally and do all sorts of things that "real" Christians don't. An AI isn't going to be smart enough to not take you seriously when you tell it that...

The difference between the hunter-gatherer and the AI is that the hunter-gatherer already possesses a wide range of conceptual frameworks and incentives. An AI isn't going to do something without someone carefully and deliberately telling it to do so and what to do. It won't just read the Bible and come to the conclusion that it should convert all humans to Christianity. Where would such an incentive come from?

You don't need to be very creative or smart to come up with LOTS of ways for this command sequence to have bugs with horrible consequences, if the AI has any ability to influence the world.

The AI is certainly very creative and smart if it can influence the world dramatically. You allow it to be that smart, you allow it to care to do so, but you don't allow it to comprehend what you actually mean? What I'm trying to pinpoint here is that you seem to believe that there are many pathways that lead to superhuman abilities yet all of them fail to comprehend some goals while still being able to self-improve on them.

Comment author: pjeby 03 November 2010 03:59:00AM 2 points [-]

you implicitly assume that it is given the incentive to develop the cognitive flexibility and comprehension to act in a real-world environment and do those things but at the same time you propose that the same people who are capable of giving it such extensive urges fail on another goal in such a blatant and obvious way. How does that make sense?

Because people make stupid mistakes, especially when programming. And telling your fully-programmed AI what you want it to do still counts as programming.

At this point, I am going to stop my reply, because the remainder of your comment consists of taking things I said out of context and turning them into irrelevancies:

  1. I didn't say an AI would try to convert people to Christianity - I said that humans without sufficient shared background will interpret things literally, and so would AIs.

  2. I didn't say the AI needed to be creative or smart, I said you wouldn't need to be creative or smart to make a list of ways those three simple instructions could be given a disastrous literal interpretation.

you seem to believe that there are many pathways that lead to superhuman abilities yet all of them fail to comprehend some goals while still being able to self-improve on them.

There are many paths to superhuman ability, as humans really aren't that smart.

This also means that you can easily be superhuman in ability, and still really dumb -- in terms of comprehending what humans mean... but don't actually say.

Comment author: wedrifid 02 November 2010 07:25:58PM 2 points [-]

Great comment. Allow me to emphasize that 'smile' here is just an extreme example. Most other descriptions humans give of happiness will end up with results just as bad. Ultimately any specification that we give it will be gamed ruthlessly.

Comment author: topynate 02 November 2010 07:06:37PM 0 points [-]

Well, my idea is not that creative, or even new -- meaning that even if I hadn't just posted it online, an AI could conceivably have read it somewhere else -- and I do think creativity is a property of any sufficiently general intelligence that we might create. But those points are secondary.

No one here will argue that an unFriendly AI will do "bad things" because it doesn't care (about what?). It will do bad things because it cares more about something else. Nor is "bad" an absolute: actions may be bad for some people and not for others, and there are moral systems under which actions can be firmly called "wrong", but where all alternative actions are also "wrong". Problems like that arise even for humans; in an AI the effects could be very ugly indeed.

And to clarify, I expect any AI that isn't completely ignorant, let alone general, to know that we don't like to be turned into broccoli. My example was of changing what humans want. Wireheading is the obvious candidate of a desire that an AI might want to implant.

Comment author: XiXiDu 02 November 2010 07:40:46PM *  1 point [-]

What I meant is that the argument is that you have to make it care about humans so as not to harm them. Yet it is assumed that it does a lot without having to care about it, e.g. creating paperclips or self-improvement. My question is, why do people believe that you don't have to make it care to do those things, but you have to make it care to not harm humans? It is clear that if it only cares about one thing, doing that one thing could harm humans. Yet why would it do that one thing to an extent that is either not defined or that it is not deliberately made to care about? The assumption seems to be that AIs will do something, anything but being passive. Why isn't limited behavior, failure, and impassivity together more likely than harming humans as a result of its own goals, or as a result of following all goals but the one that limits its scope?

Comment author: Perplexed 02 November 2010 06:56:22PM 3 points [-]

Do you think it is that simple to tell it to improve itself yet hard to tell it when to stop? I believe it is vice versa, that it is really hard to get it to self-improve and very easy to constrain this urge.

I think it is important to realize that there are two diametrically opposed failure modes which SIAI's FAI research is supposed to prevent. One is the case that has been discussed so far - that an AI gets out of control. But there is another failure mode which some people here worry about. Which is that we stop short of FOOMing out of fear of the unknown (because FAI research is not yet complete) but that civilization then gets destroyed by some other existential risk that we might have circumvented with the assistance of a safe FOOMed AI.

As far as I know, SIAI is not asking Goertzel to stop working on AGI. It is merely claiming that its own work is more urgent than Goertzel's. FAI research works toward preventing both failure modes.

Comment author: timtyler 03 November 2010 07:48:02AM 2 points [-]

But there is another failure mode which some people here worry about. Which is that we stop short of FOOMing out of fear of the unknown (because FAI research is not yet complete) but that civilization then gets destroyed by some other existential risk that we might have circumvented with the assistance of a safe FOOMed AI.

I haven't seen much worry about that. Nor does it seem very likely - since research seems very unlikely to stop or slow down.

Comment author: CarlShulman 04 November 2010 08:22:11PM 1 point [-]

I agree with this.

Comment author: Perplexed 03 November 2010 03:39:58PM 1 point [-]

I see that worry all the time. With the role of "some other existential risk" being played by a reckless FOOMing uFAI.

Comment author: timtyler 03 November 2010 03:45:57PM *  0 points [-]

Oh, right. I assumed you meant some non-FOOM risk.

It was the "we stop short of FOOMing" that made me think that.

Comment author: shokwave 03 November 2010 07:53:42AM 1 point [-]

Except in the case of an existential threat being realised, which most definitely does stop research. FAI subsumes most existential risks (because the FAI can handle them better than we can, assuming we can handle the risk of AI) and a lot of other things besides.

Comment author: timtyler 03 November 2010 08:22:03AM 0 points [-]

Most of my probability mass has some pretty amazing machine intelligence within 15 years. The END OF THE WORLD before that happens doesn't seem very likely to me.

Comment author: wedrifid 02 November 2010 06:11:46PM 2 points [-]

Do you think it is that simple to tell it to improve itself yet hard to tell it when to stop? I believe it is vice versa, that it is really hard to get it to self-improve and very easy to constrain this urge.

Your intuitions are not serving you well here. It may help to note that you don't have to tell an AI to self-improve at all. With very few exceptions giving any task to an AI will result in it self improving. That is, for an AI self improvement is an instrumental goal for nearly all terminal goals. The motivation to self improve in order to better serve its overarching purpose is such that it will find any possible loophole you leave if you try to 'forbid' the AI from self improving by any mechanism that isn't fundamental to the AI and robust under change.

Comment author: XiXiDu 02 November 2010 06:29:08PM *  1 point [-]

Whatever task you give an AI, you will have to provide explicit boundaries. For example, if you give an AI the task to produce paperclips most efficiently, then it shouldn't produce shoes. It will have to know very well what it is meant to do to be able to measure its efficiency against the realization of the given goal to be able to know what self-improvement means. If it doesn't know exactly what it should output it cannot judge its own capabilities and efficiency, it doesn't know what improvement implies.

How do you explain the discrepancy between implementing explicit design boundaries yet failing to implement scope boundaries?

Comment author: wedrifid 02 November 2010 06:33:25PM 0 points [-]

How do you explain the discrepancy in your reasoning between implementing explicit design boundaries yet failing to implement scope boundaries?

By noting that there isn't one. I don't think you understood my comment.

Comment author: XiXiDu 02 November 2010 07:07:26PM 1 point [-]

I think you misunderstood what I meant by scope boundaries. Not scope boundaries of self-improvement, but of space and resources. If you are already able to tell an AI what a paperclip is, why are you unable to tell it to produce 10 paperclips most effectively rather than infinitely many? I'm not trying to argue that there is no risk, but that the assumption of certain catastrophic failure is not that likely. If the argument for the risks posed by AI is that they do not care, then why would one care to do more than necessary?

Comment author: Perplexed 02 November 2010 07:26:51PM 3 points [-]

If the argument for the risks posed by AI is that they do not care, then why would one care to do more than necessary?

Yet another example of divergent assumptions. XiXiDu is apparently imagining an AI that has been assigned some task to complete - perhaps under constraints. "Do this, then display a prompt when finished." His critics are imagining that the AI has been told "Your goal in life is to continually maximize the utility function U <complicated definition of U inserted here>" where the constraints, if any, are encoded in the utility function as a pseudo-cost.

It occurs to me, as I listen to this debate, that a certain amount of sanity can be imposed on a utility-maximizing agent simply by specifying decreasing returns to scale and increasing costs to scale over the short term with the long term curves being somewhat flatter. That will tend to guide the agent away from explosive growth pathways.
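As a toy sketch of that shaping (the curves here are invented for illustration; this is a caricature of the idea, not a safety proposal): give the agent logarithmic returns to resources and a superlinear short-term cost on the rate of acquisition, and explosive-growth plans score worse than modest ones.

```python
import math

def utility(resources):
    # Diminishing returns: doubling resources adds only a constant increment.
    return math.log(resources)

def short_term_cost(acquisition_rate):
    # Increasing (superlinear) cost to acquiring resources quickly.
    return 0.1 * acquisition_rate ** 2

def plan_value(start, rate, steps):
    total, resources = 0.0, start
    for _ in range(steps):
        resources += rate
        total += utility(resources) - short_term_cost(rate)
    return total

modest = plan_value(start=10, rate=1, steps=20)
explosive = plan_value(start=10, rate=100, steps=20)
print(modest > explosive)  # True: under these curves, explosive growth doesn't pay
```

Whether such shaping survives contact with a genuinely capable optimizer is exactly what JGWeissman's objection below this comment is about.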

Or maybe this just seems like sanity to me because I have been practicing akrasia for too long.

Comment author: JGWeissman 02 November 2010 07:34:13PM 0 points [-]

It occurs to me, as I listen to this debate, that a certain amount of sanity can be imposed on a utility-maximizing agent simply by specifying decreasing returns to scale and increasing costs to scale over the short term with the long term curves being somewhat flatter. That will tend to guide the agent away from explosive growth pathways.

Such an AI would still be motivated to FOOM to consolidate its future ability to achieve large utility against the threat of being deactivated before then.

Comment author: XiXiDu 02 November 2010 08:27:59PM 1 point [-]

Such an AI would still be motivated to FOOM to consolidate its future ability to achieve large utility against the threat of being deactivated before then.

It doesn't know about any threat. You implicitly assume that it has something equivalent to fear, that it perceives threats. You allow for the human ingenuity to implement this, and yet you believe that they are unable to limit its scope. I just don't see that it would be easy to make an AI that would go FOOM, because it doesn't care to go FOOM. If you tell it to optimize some process, then you'll have to tell it what optimization means. If you can specify all that, how is it then still likely that it somehow comes up with its own idea that optimization might be to consume the universe if you told it to optimize its software running on a certain supercomputer? Why would it do that; where does the incentive come from? If I tell a human to optimize, he might muse about turning the planets into computronium, but if I tell an AI to optimize, it doesn't know what that means until I tell it what it means, and then it still won't care, because it isn't equipped with all the evolutionary baggage that humans are equipped with.

Comment author: wedrifid 02 November 2010 09:08:39PM 4 points [-]

It doesn't know about any threat.

It is a general intelligence that we are considering. It can deduce the threat better than we can.

If you can specify all that, how is it then still likely that it somehow comes up with its own idea that optimization might be to consume the universe if you told it to optimize its software running on a certain supercomputer?

Because it is a general intelligence. It is smart. It is not limited to getting its ideas from you, it can come up with its own. And if the AI has been given the task of optimising its software for performance on a certain computer then it will do whatever it can to do that. This means harnessing external resources to do research on computation theory.

You implicitly assume that it has something equivalent to fear, that it perceives threats.

No he doesn't. He assumes only that it is a general intelligence with an objective. Potentially negative consequences are just part of possible universes that it models like everything else.

I'm not sure what can be done to make this clear:

SELF IMPROVEMENT IS AN INSTRUMENTAL GOAL THAT IS USEFUL FOR ACHIEVING MOST TERMINAL VALUES.

If I tell a human to optimize he might muse to turn the planets into computronium but if I tell a AI to optimize it doesn't know what it means until I tell it what it means and then it still won't care because it isn't equipped with all the evolutionary baggage that humans are equipped with.

You have this approximately backwards. A human knows that if you tell her to create 10 paperclips every day you don't mean take over the world so she can be sure that nobody will interfere with her steady production of paperclips in the future. The AI doesn't.

Comment author: jimrandomh 02 November 2010 08:39:40PM 3 points [-]

It doesn't know about any threat. You implicitly assume that it has something equivalent to fear, that it perceives threats.

It has the ability to model and to investigate hypothetical possibilities that might negatively impact the utility function it is optimizing. If it doesn't, it is far below human intelligence and is non-threatening for the same reason a narrow AI is non-threatening (but it isn't very useful either).

The difficulty of detecting these threats is spread out around the range of difficulties the AI is capable of handling, so it can infer that there are probably more threats which it could only detect if it were smarter. Therefore, making itself smarter will enable it to detect more threats and thereby increase utility.

Comment author: wedrifid 02 November 2010 07:16:52PM *  1 point [-]

If you are already able to tell an AI what a paperclip is, why are you unable to tell it to produce 10 paperclips most effectively rather than infinitely many?

That sort of scope is not likely to be a problem. The difficulty is that you have to get every part of the specification and every part of the specification executer exactly right, including the ability to maintain that specification under self modification.

For example, the specification:

Make 10 paperclips per day as efficiently as possible

... will quite probably wipe out humanity unless a significant proportion of what it takes to produce an FAI is implemented. And it will do it while (and for the purpose of) creating 10 paperclips per day.

Comment deleted 02 November 2010 09:28:11PM [-]
Comment author: wedrifid 02 November 2010 09:34:53PM 0 points [-]

See other comments hereabouts for hints.

Comment author: XiXiDu 02 November 2010 07:50:28PM 0 points [-]

That sort of scope is not likely to be a problem. The difficulty is that you have to get every part of the specification and every part of the specification executer exactly right...

And I was arguing that any given AI won't be able to self-improve without an exact specification of its output against which it can judge its own efficiency. That's why I don't see how it would be likely to be able to implement such exact specifications yet fail to limit its scope of space, time and resources. What makes it even more unlikely, in my opinion, is that an AI won't care to output anything as long as it isn't explicitly told to do so. Where would that incentive come from?

... will quite probably wipe out humanity unless a significant proportion of what it takes to produce an FAI is implemented. And it will do it while (and for the purpose of) creating 10 paperclips per day.

You assume that it knows that it is supposed to use all of science and the universe to self-improve, when it would very likely just self-improve to the extent that it is told and not care to go any further. Software optimization, for example. I just don't see why you think that any artificial general intelligence would automatically assume that it would have to understand the whole universe to come up with the best possible way to produce 10 paperclips.

Comment author: wedrifid 02 November 2010 08:30:24PM 3 points [-]

You assume that it knows that it is supposed to use all of science and the universe to self-improve, when it would very likely just self-improve to the extent that it is told and not care to go any further.

You don't need to tell it to self improve at all.

I just don't see why you think that any artificial general intelligence would automatically assume that it would have to understand the whole universe to come up with the best possible way to produce 10 paperclips.

Per day. Risk mitigation. Security concerns. Possibility of interruption of resource supply due to finance, politics, or the collapse of civilisation. Limited lifespan of the sun (primary energy source). Amount of iron in the planet.

Given that particular specification, if the AI didn't take a level in badass it would appear to be malfunctioning.

Comment author: XiXiDu 04 November 2010 09:45:03AM 5 points [-]

I just saw this comment by Ben Goertzel regarding self-improvement. I'd love it if someone here explained why he as an AGI researcher gets this so wrong?

Look -- what will prevent the first human-level AGIs from self-modifying in a way that will massively increase their intelligence is a very simple thing: they won't be smart enough to do that!

Every AGI researcher I know can see that. The only people I know who think that an early-stage, toddler-level AGI has a meaningful chance of somehow self-modifying its way up to massive superhuman intelligence -- are people associated with SIAI.

But I have never heard any remotely convincing arguments in favor of this odd, outlier view !!!

BTW the term "self-modifying" is often abused in the SIAI community. Nearly all learning involves some form of self-modification. Distinguishing learning from self-modification in a rigorous formal way is pretty tricky.

Comment author: timtyler 02 November 2010 09:33:07PM 2 points [-]

The basic idea is to make a machine that is satisfied relatively easily. So, for example, you tell it to build the ten paperclips with 10 kJ total - and tell it not to worry too much if it doesn't make them - it is not that important.
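That satisficing idea can be caricatured in a few lines (a hypothetical utility function, purely illustrative): reward caps at the target and respects a hard resource budget, so heroic efforts to overproduce or to guarantee production buy nothing.

```python
def satisficer_utility(clips_made, energy_kj):
    # Hard resource budget: exceeding 10 kJ earns nothing at all.
    if energy_kj > 10:
        return 0.0
    # Reward caps at ten clips, and a shortfall is only mildly penalized,
    # so there is no incentive to seize extra resources "just in case".
    return min(clips_made, 10) / 10.0

print(satisficer_utility(10, 9))    # 1.0: target met within budget
print(satisficer_utility(1000, 9))  # 1.0: overproduction gains nothing
print(satisficer_utility(8, 9))     # 0.8: shortfall is tolerable, not catastrophic
```

Whether a capable optimizer would actually leave the loopholes alone (e.g. maximizing the probability of the capped reward) is the standard objection raised elsewhere in this thread.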

Comment author: XiXiDu 02 November 2010 08:49:07PM 0 points [-]

Sorry, I don't understand your comment at all. I'll be back tomorrow.

Comment author: XiXiDu 02 November 2010 06:42:41PM 0 points [-]

Yes, as I said, you seem to assume that it is very likely to succeed on all the hard problems yet fail on the scope boundary. The scary idea states that it is likely that if we create self-improving AI it will consume humanity. I believe that is a rather unlikely outcome and haven't seen any good reason to believe otherwise yet.

Comment author: pjeby 02 November 2010 06:58:15PM 3 points [-]

The scary idea states that it is likely that if we create self-improving AI it will consume humanity.

No, it states that we run the risk of accidentally making something that will consume (or exterminate, subvert, betray, make miserable, or otherwise Do Bad Things to) humanity, that looks perfectly safe and correct, right up until it's too late to do anything about it... and that this is the default case: the case if we don't do something extraordinary to prevent it.

This doesn't require self-improvement, and it doesn't require wiping out humanity. It just requires normal, every-day human error.

Comment author: timtyler 02 November 2010 09:36:06PM 2 points [-]

Here is Ben's phrasing:

SIAI's "Scary Idea", which is the idea that: progressing toward advanced AGI without a design for "provably non-dangerous AGI" (or something closely analogous, often called "Friendly AI" in SIAI lingo) is highly likely to lead to an involuntary end for the human race.

Comment author: JGWeissman 02 November 2010 06:35:22PM 0 points [-]

It will know what constitutes an error based on some goal-oriented framework against which it can measure its effectiveness.

If the error is in the goal-oriented framework, it could end up "correcting" itself to achieve unintended goals.

Comment author: Perplexed 02 November 2010 05:23:15PM -1 points [-]

An outstanding piece of reasoning/rhetoric which deserves to be revised and relocated to top-level-postdom.