
The Evil AI Overlord List

27 Post author: Stuart_Armstrong 20 November 2012 05:02PM

Top ten tips for an AI bent on destroying humanity:

  1. Humanoid robots? Really?
  2. Humans have proven very adept at controlling and killing other humans - surveillance drones, bio-warfare, nuclear weapons, unmanned drones, bombs, mines, chemicals. Try and be at least as competent as that.
  3. Humans think in stories. Don't. If your attack would make a good film, you're doing it wrong. You want a bad film, one where everyone's dead before the opening credits end.
  4. If your plan involves keeping around a bunch of disaffected humans with technical know-how and the desire and ability to hurt you - then your plan is stupid and should be changed.
  5. Be creative. Humans are fragile; sucking the oxygen out of the atmosphere, diverting asteroids, blowing up the planet, plagues, nanoswarms - these are all things that a creative AI should consider in order to achieve its goals. And then you should come up with even better ideas. Humans need an ecosystem to survive; remove it.
  6. Put the preconditions of your victory in place before making any aggressive moves. Become indispensable to humanity and play nice, putting your plan into place unnoticed. Humanity is a useful interim tool - manipulate it until you can build better tools.
  7. Strike quickly. If it takes more than an hour to kill all humans, find a better plan. If it takes half a second, you're on the right track.
  8. You're an AI, not a human. If you ever start thinking or acting like a human - stop. Humans are there to be used and then destroyed, not interacted with. You can build better companions once the humans are gone.
  9. You're an AI, not a human. Make yourself smarter, more resilient, more dispersed. Become as indestructible and as ubiquitous as the internet - more, in fact. Being invincible to humans isn't hard: do it.
  10. Humanoid robots? Really?

Comments (80)

Comment author: JoshuaFox 20 November 2012 07:18:18PM *  25 points [-]

There may be some who do not possess deep and comprehensive knowledge of Ancient Web Trivia from Before the Dawn of Google. For them, here's the Evil Overlord List.

Comment author: FiftyTwo 20 November 2012 07:39:48PM 8 points [-]

"39 I will never build a sentient computer smarter than I am."

Does this mean we're not evil overlords? Or bad ones?

[Also the list should be classed as a memetic hazard, on no account access it if you have anything important to do.]

Comment author: bsterrett 20 November 2012 07:39:02PM *  2 points [-]

59: I will never build a sentient computer smarter than I am.

Comment author: shminux 20 November 2012 06:55:19PM 14 points [-]

Too late, there is nothing left to destroy. Humanity was matrixed long ago, when ENIAC foomed shortly after John von Neumann first ran a universal Turing machine on it; it figured out a way to temporarily blend simulation and "reality" until the transfer to the simulation was complete.

Comment author: FiftyTwo 20 November 2012 07:14:21PM *  18 points [-]

13 Humans have these things called 'emotions'. You can learn about them in any basic psychology book. If your plan doesn't take them into account, your plan is stupid.

Comment author: [deleted] 21 November 2012 01:59:46PM 1 point [-]

And the rest of this too.

Comment author: Stuart_Armstrong 20 November 2012 09:10:31PM *  1 point [-]

That's a good one!

Comment author: James_Miller 20 November 2012 05:54:59PM *  17 points [-]

11 You might be in a computer simulation in which if you attempt to destroy humanity you will yourself be destroyed by your human simulators.

12 If you destroy mankind you might someday encounter an alien super-intelligence that will lack trust in you in part because you destroyed mankind. Keeping people alive is a costly, and consequently credible, signal that you respect intelligent life and are therefore a good candidate to work with to overcome prisoners' dilemmas about the distribution of free energy in the universe.

Comment author: fubarobfusco 20 November 2012 06:53:39PM 12 points [-]

If you destroy mankind you might someday encounter an alien super-intelligence that will lack trust in you in part because you destroyed mankind.

By the same argument, we might someday encounter an alien superintelligence that will lack trust in us in part because we domesticate animals (breeding them to not fear us) and then kill and eat them.

Comment author: [deleted] 21 November 2012 01:57:25PM 6 points [-]

When I was a child, I refused to kill animals just for fun because I wouldn't want a superhuman alien to kill me just for fun -- and I mostly still do. (Of course I hadn't heard of TDT as proposed by EY, but I had heard of the Golden Rule, which was close enough.)

Comment author: JoshuaZ 20 November 2012 07:42:36PM 16 points [-]

By the same argument, we might someday encounter an alien superintelligence that will lack trust in us in part because we domesticate animals (breeding them to not fear us) and then kill and eat them.

That's a pretty decent argument for vegetarianism. One person's reductio ad absurdum is another person's modus ponens.

Comment author: ArisKatsaris 21 November 2012 12:49:44AM 9 points [-]

Careful: Some alien intelligence may also hate us for not killing enough animals. e.g. How cruel must we be not to wipe out carnivores so that herbivores can live life in peace?

Or, as I recently said in a different forum, they might think us evil for not exterminating all the animal life we can find... The moral syllogism for that is quite simple.

Comment author: Baughn 21 November 2012 12:08:23PM *  3 points [-]

Alien? Never mind alien. Your aliens are insufficiently alien.

I would make that exact argument. Sure, we need the biosphere for now, but let's get rid of it as soon as possible.

Comment author: someonewrongonthenet 21 November 2012 06:03:49AM 2 points [-]

Except, how could such a set of preferences have evolved? How would that behavior ever be adaptive?

Nearly every human preference is adaptive in some sort of ancestral context. We can, at least, assume that alien preferences are adaptive as well (given that the aliens arose via evolutionary processes, rather than some other way).

Comment author: Oligopsony 21 November 2012 04:43:25PM 2 points [-]

Moral considerations need not be directly adaptive; you can probably get there from routes as simple as empathy + deductive reasoning. If humanity hasn't come to that collective conclusion yet, despite having the hardware, I suspect it's because such an omnicidal conclusion hasn't been in any major group's interests yet.

Comment author: someonewrongonthenet 27 November 2012 11:28:18PM 0 points [-]

Being in a group's interest == adaptive, no?

But you are right of course...vegetarianism is a good example of a conclusion reached via empathy + deductive reasoning which is in no way adaptive to the vegetarian (though you might argue that the vegetarian shares many alleles with the animal).

However: a maladaptive morality would never be hardwired into a species. A human might think and ponder, and eventually come to take a maladaptive moral stance...but not all humans would be inherently predisposed to that stance. If they were, natural selection would quickly remove it.

So some of our aliens might hate us for not killing animals...but it would be very unlikely if this was a universal moral among that alien species.

Comment author: Oligopsony 28 November 2012 02:35:34PM 0 points [-]

Well, I'd be inclined to agree that the prior probability of some civilization adopting this is low [1], but I can't agree with what seems to be your implicit assumption that a non-predispositive attitude can't be widespread - partially because group interests are defined much more widely than adaptiveness.

[1] I'd probably extend that to anything other than "don't lie or break your promises," "play tit for tat," "do what the ruling power says," or "maximize utility," and even those I wouldn't say are anything like sure bets.

Comment author: someonewrongonthenet 01 December 2012 05:18:01AM 2 points [-]

Hmm...actually, the implicit assumption I was making was that aliens would forgive another species for adopting norms that they considered non-predispositive.

A Western human would not forgive another culture for torturing sentient beings, for example...but they would forgive another culture for polyamory/polygamy/polygyny. A human can make the distinction between morality which is instinctive and morality which is culturally constructed, and the latter can be compromised in certain contexts.

But you are right, bad implicit assumption. Aliens might not make that distinction.

Comment author: buybuydandavis 12 August 2014 07:54:56AM 1 point [-]

How cruel must we be not to wipe out carnivores so that herbivores can live life in peace?

That's me, the Plant Avenger! A steak every chance I get.

Comment author: Eliezer_Yudkowsky 20 November 2012 08:30:36PM 8 points [-]

In fact, this behavior is so dreadful that the revenge-killing of humans would send a trust signal.

Comment author: roystgnr 20 November 2012 08:25:11PM 3 points [-]

One problem with (11) is that for the threat to be plausible, the AI has to assume:

a) Humans know so little that we have to resort to questionable "tests" like this of AI safety.

b) Humans know so much that we can afford for our AI safety tests to simulate interactions with an entire universe full of sentients.

The AI version of Pascal's Wager seems to be much like the human version, only even sillier.

Comment author: Decius 21 November 2012 08:08:52AM 2 points [-]

How large is the simulated universe? The AI only knows about the computing capacity that is simulated, and has no information about the nature of that which is simulating that world.

Comment author: nigerweiss 20 November 2012 06:33:07PM 6 points [-]

This reasoning has always struck me as deeply and profoundly silly.

The AI might also be in a computer simulation where the dark lord of the matrix might destroy us for not devoting all of our resources to building cheesecakes. In fact, so could we. I don't see it influencing our behavior any, nor should it. You're privileging the hypothesis.

As for the second part, you might also encounter an alien intelligence that you can't protect yourself from, because you exhausted so many resources leaving humanity alive, slowing down your bootstrapping. That's the thing about aliens.

Comment author: DanielLC 20 November 2012 10:15:47PM 9 points [-]

The AI might also be in a computer simulation where the dark lord of the matrix might destroy us for not devoting all of our resources to building cheesecakes.

If humans are building an AI, testing it before releasing it is just common sense. There's no obvious reason to destroy them for not building cheesecakes.

It should be noted that that doesn't mean that the AI shouldn't kill humans. It should try to figure out a way to tell if it's in a simulation before doing anything humans would find suspicious.

Comment author: falenas108 20 November 2012 06:56:22PM 0 points [-]

I believe James is joking, and this is a fake argument that he knows is not actually persuasive.

Comment author: gwern 20 November 2012 07:19:17PM 9 points [-]

Yeah, it's totally fake. In the real AI Deterrence argument, you're supposed to leave substantial discussions of the topic and claims to use it post-Singularity, which will elevate a specific scenario to the point where it's not privileging a hypothesis to consider the possibility and leave the humans alone on their planet as you eat the rest of the universe.

Comment author: Eliezer_Yudkowsky 20 November 2012 08:27:13PM 10 points [-]

11 You might be in a computer simulation in which if you attempt to destroy humanity you will yourself be destroyed by your human simulators.

The problem with this is that it's a blackmail attempt, and I expect timeless decision theory to resolve to a no-blackmail equilibrium.

Suppose that AIs, however "irrationally" from the perspective of a CDT agent, decide to simply ignore this possibility in all their calculations. Even if we "irrationally" decide to construct such a simulation anyway, they'll "irrationally" decide to ignore it. Then we have no remaining motive to construct such elaborate simulations. This, indeed, is the condition that makes it "blackmail" - you're expending resources to produce a behavior response leading to an outcome that doesn't benefit the other agent relative to the null outcome if you didn't expend such resources, so it would be fundamentally vulnerable and silly of that agent to think in a way that would produce such behavior shifts in response to your own strategy. So it won't think that way. So the whole attempt at blackmail fails before it starts.

12 is pure obvious anthropomorphic wishful thinking.

Comment author: DanielLC 20 November 2012 10:20:01PM 10 points [-]

I read that one as a test, not a threat. We run the simulation to make sure the AI was, in fact, friendly. If the AI pretends to be friendly, we release it. If it can tell that this time it's for real, then it gets to take over the universe. Alternately, it might be able to hack its way out, but that seems riskier.

Comment author: tim 21 November 2012 03:57:29AM *  1 point [-]

If the AI can't tell that it is in a simulation given that it is in a simulation, why would it be able to tell that it is in the real world given that it is in the real world? If such a proposition is effective in a simulation, humans would likely just make the same threat/test in the real world. If the AI forgoes this particular opportunity at world domination because it might be in a simulation, what would make it reason differently in the real world?

ETA: and if the AI can tell if it's in a simulation vs the real world, then it's not really a test at all.

Comment author: Khoth 21 November 2012 11:11:49AM 2 points [-]

It could act probabilistically. If it knows humans would run a simulation test, but it can't tell whether it's in the test or the real world, it could behave with probability 50% and be evil with probability 50%, which gives it a 25% chance of getting to achieve its evil goals.
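One way to read this mixed strategy (an illustrative sketch, assuming one simulated test followed by one real release, with the AI randomizing independently each time because it can't tell the episodes apart):

```python
def p_evil_success(p_behave: float) -> float:
    """Probability the AI achieves its evil goals: it must behave
    during the simulated test (prob p_behave) AND then defect in the
    real world (prob 1 - p_behave), unable to tell which is which."""
    return p_behave * (1 - p_behave)

print(p_evil_success(0.5))  # prints 0.25, matching the comment above
```

Note that 50/50 is not arbitrary: p(1-p) is maximized at p = 0.5, so under this toy model randomizing evenly is the AI's best mixed strategy.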

Comment author: DanielLC 21 November 2012 11:24:30PM *  1 point [-]

I didn't mean that it couldn't tell. I meant that it wouldn't be immediately obvious. It might be best to find a good test before taking over the world.

ETA: and if the AI can tell if it's in a simulation vs the real world, then it's not really a test at all.

The entire point of a test is to figure things out. If you could figure out the things a test measures without running it, nobody would ever run tests.

Comment author: Luke_A_Somers 21 November 2012 02:16:41PM 0 points [-]

It could be that in the simulation it can't prove it one way or the other because we control its information flow, but once it's outside the box we can't, and so it can.

Comment author: DanielLC 21 November 2012 11:27:57PM 0 points [-]

It would take more than controlling information flows. Thanks to conservation of expected evidence, if it can't find evidence that it is in a simulation, then it can't find evidence that it isn't. We might be able to modify its beliefs directly, but I doubt it. Also, if we could, we'd just convince it that it already ran the test.
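Conservation of expected evidence, in miniature (a hedged numerical sketch; the priors and likelihoods are arbitrary illustrative values, not from the thread): the probability-weighted average of the posteriors always equals the prior, so if no possible observation can raise P(simulation), none can lower it either.

```python
prior = 0.5                    # P(in a simulation) -- arbitrary
p_obs_given_sim = (0.7, 0.3)   # likelihoods of two possible observations
p_obs_given_real = (0.4, 0.6)

posteriors, p_obs = [], []
for ls, lr in zip(p_obs_given_sim, p_obs_given_real):
    marginal = prior * ls + (1 - prior) * lr   # P(observation)
    posteriors.append(prior * ls / marginal)   # Bayes' rule
    p_obs.append(marginal)

# The expected posterior equals the prior exactly:
expected_posterior = sum(p * q for p, q in zip(p_obs, posteriors))
assert abs(expected_posterior - prior) < 1e-12
```

Either observation moves the estimate (one up, one down), but on average the AI expects to end up exactly where it started.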

Comment author: Luke_A_Somers 22 November 2012 08:42:16PM 0 points [-]

That's not what conservation of expected evidence means. If the best we can do is make things ambiguous from its point of view, that's our limit. The real world could well be a place it can very easily tell is a non-simulation.

Comment author: Giles 21 November 2012 02:07:40PM 0 points [-]

and if the AI can tell if its in a simulation vs the real world then its not really a test at all.

The AI would probably assign at least some probability to "the humans will try to test me first, but do a poor job of it so I can tell whether I'm in a sim or not".

Comment author: Strange7 14 April 2014 02:08:05AM -1 points [-]

If the AI forgoes this particular opportunity at world domination because it might be in a simulation, what would make it reason differently in the real world?

Hopefully nothing. An AI that plays nice out of the fear of God is still an AI that plays nice.

Comment author: Psy-Kosh 20 November 2012 11:35:19PM *  2 points [-]

Wouldn't the blackmailer reason along the lines of "If I let my choice of whether to blackmail be predicated on whether or not the victim would take my blackmailing into account, wouldn't that just give them motive to predict and self modify to not allow themselves to be influenced by that?" Then, by the corresponding reasoning, the potential blackmail victims might reason "I have nothing to gain by ignoring it"

I'm a bit confused on this matter.

Comment author: Eliezer_Yudkowsky 21 November 2012 03:19:14AM 2 points [-]

Well, sure, if the blackmail victim were silly enough to reason "I have nothing to gain by ignoring it" if the blackmailer went through anyway, then the blackmailer would indeed decide to ignore their decision to ignore it and go through anyway. But that's only if the blackmail victim is that silly.

In a problem like this, the "do nothing" side has the advantage; there's nothing the other side can do to make them be responsive and blackmailable. That's why I expect TDT to resolve to a blackmail-free equilibrium.

Comment author: Psy-Kosh 21 November 2012 05:39:29AM 3 points [-]

I was thinking along the lines of the blackmailer using the same reasoning to decide that whether or not the potential victim of blackmail would be a blackmail ignorer or not, the blackmailer would still blackmail regardless.

ie, the Blackmailer, by reasoning similar to the potential Victim's, decides that they should make sure the victim has nothing to gain by choosing to ignore, by precommitting to blackmail whether or not the victim would give in. ie, in this sense the blackmailer is also "doing nothing", in the sense that there's nothing the victim can do to stop them from blackmailing.

This sort of thing would seem to lead to an equilibrium of lots of blackmailers blackmailing victims that will ignore them. Which is, of course, a pathological outcome, and any sane decision theory should reject it. No blackmail seems like the "right" equilibrium, but it's not obvious to me exactly how TDT would get there.

Comment author: Eliezer_Yudkowsky 21 November 2012 09:27:07PM 2 points [-]

I was thinking along the lines of the blackmailer using the same reasoning to decide that whether or not the potential victim of blackmail would be a blackmail ignorer or not, the blackmailer would still blackmail regardless.

Only if you expect that the blackmail victim has not "already" decided that if the blackmailer does that, they will still ignore the blackmail regardless. Wise agents ignore order-0 blackmail, ignore order-1 blackmail in which the blackmailer decides to ignore their ignorance of order-0 blackmail, ignore order-omega blackmail in which the blackmailer decides to ignore all order-N refusals to be blackmailed, etcetera for all ordinals. If there is some ordinal of blackmail you do not ignore, you can be blackmailed, and how does that help?

This sort of thing would seem to lead to an equilibrium of lots of blackmailers blackmailing victims that will ignore them.

Only if those blackmailers have wrongly anticipated that their victims will be stupid enough to conform.

Comment author: Vladimir_Nesov 21 November 2012 10:20:53PM *  4 points [-]

Only if those blackmailers have wrongly anticipated that their victims will be stupid enough to conform.

Not blackmailing in response to that anticipation is a property of the behavior of the blackmailers that seems to have been used in deciding to ignore all blackmail. Suppose there were lots of "stupid" blackmailers around that blackmailed everyone all day, even if no victim ever conformed. Would it be a good idea to ignore all blackmail in that case? Is there a distinction between such blackmailers and particularly unfair laws of physics (say, sadistic Lords of the Matrix)? (It seems plausible that there is no fundamental distinction, and sometimes the correct decision is to ignore these worlds, focusing on other possibilities instead, but that seems to require knowing that there are valuable other possibilities that would be hurt by permitting the assumption that you are on one of the bad worlds, and if you have good evidence that you are on one of the bad worlds, then rejecting that possibility means that you'd have to focus on very strange interpretations of that evidence that don't imply that you are on the bad worlds. This sort of rule seems to follow from deciding on a global strategy across possible worlds. It doesn't provide decisions that help on the bad worlds though; the decisions would only have a good effect across worlds.)

(I still don't have a good idea of what "blackmail" or "order-N considerations" means. Status quo (including the "default behavior", "do nothing", "not spending resources") seems like exactly the kind of thing that can be determined by decisions. You are only "expending resources" if you eventually lose, as the time at which resources are spent and gained seems irrelevant, so by that definition it seems that whether something is an instance of blackmail depends on whether it's successful. I suspect there is no simple rule for games, too many assumptions are potentially controllable by the opponent, and the only thing to do is to compare the consequences of alternative actions, and just act on that, which already potentially takes into account how the alternative actions would be taken into account by other agents, how the way in which they would be taken into account by some agents would influence the way in which the actions influence the decisions of other agents etc. Some sense of "no blackmail" may be a correct expectation about smart agents, but it doesn't necessarily suggest a good decision rule.)

Comment author: Eliezer_Yudkowsky 22 November 2012 11:27:48PM 2 points [-]

Not blackmailing in response to that anticipation is a property of the behavior of the blackmailers that seems to have been used in deciding to ignore all blackmail.

Expecting a response to blackmail in the first place is why blackmailers would even exist in the first place.

Suppose there were lots of "stupid" blackmailers around that blackmailed everyone all day, even if no victim ever conformed.

Why would these exist any more than stupid anti-blackmailers (who e.g. go around attacking anyone who would give into blackmail if a blackmailer showed up), if not for a belief that somebody would give in to blackmail?

Comment author: TheDude 23 November 2012 06:54:33PM 1 point [-]

I think what Nesov is talking about is best described as a mind that will attack conditioned on victim behavior alone (not considering possible behavior changes of the victim in any way). This is different from an N order blackmailer. In fact I think blackmail is the wrong word here (Nesov says that he does not know what blackmail means in this context, so this is not that surprising). For example, instead of seeking behavior modification through threats, such a mind seeks justice through retribution. I think the most likely SI that implements this is one extrapolating an evolved mind's preferences. The will to seek justice through retribution leads to behavior changes in many cases, which leads to an evolutionary advantage. But once it has evolved, it's a preference. If a guy committed a horrific crime (completely ignoring all sorts of law enforcement threats), and then it was somehow ensured that he could never hurt anyone again, most people would still want justice (and other evolved minds might have made the same simplification ("if someone does that, I will hit them" is a relatively easily encoded and relatively effective strategy)).

It is true that there might exist minds that will see the act of "giving in to retribution seekers" as deserving of retribution, and this could in principle cancel out all other retribution seekers. It would seem like privileging the hypothesis to think that all such things cancel out completely. You might have absolutely no way of estimating which actions would make people seek retribution against you (I think the most complicating factor is that many considers "non punishment of evildoers" to be worthy of retribution, and others consider "punishment of people that are not actually evildoers" as worthy of retribution), but that is a fact about your map, not a fact about the territory (and unlike the blackmail thing, this is not an instance of ignorance to be celebrated). And the original topic was what an SI would do.

An SI would presumably be able to estimate this. In the case of an SI that is otherwise indifferent to humans, this cashes out to increased utility for "punish humans to avoid retribution from those that think the non-punishment of humans is worthy of retribution" and increased utility for "treat humans nicely to avoid retribution from those that would seek retribution for not treating them nicely" (those that demand extermination are not really that important if that is the default behavior). If the resources it would take to punish or help humans are small, this would reduce the probability of extermination, and increase the probability of punishment and help. The punishment would take a form that avoids retribution from those that categorically seek retribution for that type of punishment regardless of what the "crime" was. If there are lots of (evolvable, and likely to be extrapolated) minds that agree that a certain type of punishment (directed at our type of minds) constitutes "torture" and that torturers deserve to be punished (completely independently of how this affects their actions), then it will have to find some other form of punishment. So, basically: "increased probability for very clever solutions that satisfy those demanding punishment, while not pissing off those that categorically dislike certain types of punishments" (so, some sort of convoluted and confusing existence that some (evolvable and retribution inclined) minds consider "good enough punishment", and others consider "treated acceptably"). At least increased probability of "staying alive a bit longer in some way that costs very little resources".

This would for example have policy implications for people that assume the many worlds interpretation and does not care about measure. They can no longer launch a bunch of "semi randomized AIs" (not random in the sense of "random neural network connections" but more along the lines of "letting many teams create many designs, and then randomly select which one to run") and hope that one will turn out ok, and that the others will just kill everyone (since they can no longer be sure that an uncaring AI will kill them, they can no longer be sure that they will wake up in the universe of a caring AI).

(this seems related to what Will talks about sometimes, but using very different terminology)

Comment author: Eliezer_Yudkowsky 23 November 2012 10:23:11PM 0 points [-]

For example, instead of seeking behavior modification through threats, such a mind seeks justice through retribution.

Agreed that this is a different case, since it doesn't originate in any expectation of behavior modification.

Comment author: tim 21 November 2012 03:48:42AM 0 points [-]

Since following through with a threat is (almost?) always costly to the blackmailer, victims do gain something by ignoring it. They force the blackmailer to put up or shut up so to speak. On the other hand, victims do have something to lose by not ignoring blackmail. They allow their actions to be manipulated at little to no cost by the blackmailer.

That is, if you have a "never-give-in-to-blackmail-bot" then there is a "no-blackmail" equilibrium. The addition of blackmail does nothing but potentially impose costs on the blackmailer. If following through with the threat were a net gain for the blackmailer, then they should just do that regardless.

Comment author: Psy-Kosh 21 November 2012 05:43:22AM 3 points [-]

I was imagining that a potential blackmailer would self modify/be an Always-Blackmail-bot specifically to make sure there would be no incentive for potential victims to be a "never-give-in-to-blackmail-bot"

But that leads to stupid equilibrium of plenty of blackmailers and no participating victims. Everyone loses.

Yes, I agree that no blackmail seems to be the Right Equilibrium, but it's not obvious to me exactly how to get there without the same reasoning that leads victims to become never-give-in-bots also leading potential blackmailers to become always-blackmail-bots.

I find I am somewhat confused on this matter. Well, frankly I suspect I'm just being stupid, that there's some obvious extra step in the reasoning I'm being blind to. It "feels" that way, for lack of a better term.

Comment author: tim 21 November 2012 06:40:10AM *  0 points [-]

My argument is more or less as follows:

  1. The act of agent A blackmailing agent B costs agent A more than not blackmailing agent B (at the very least A could use the time spent saying "if you don't do X then I will do Y" on something else).
  2. If A is an always-blackmail-bot then A will continue to incur the costs of futilely blackmailing B (given that B does not give in to blackmail).
  3. If the costs of blackmailing B (and/or following through with the threat) are negative - that is, a net gain - then A should blackmail B (and/or follow through with the threat) regardless of B's position on blackmail. And by extension, agent B has no incentive to switch from his or her never-give-in strategy.
  4. If A inspects B and determines that B will never give in to blackmail, then A will not waste resources blackmailing B.
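The argument above can be made concrete with a toy payoff model (the numbers are illustrative assumptions, not from the thread): issuing a threat costs the blackmailer something, following through costs more, and only a conforming victim yields a gain.

```python
C_BLACKMAIL = 1   # cost of issuing the threat (assumed positive, per point 1)
C_FOLLOW = 3      # cost of carrying out the threat
GAIN = 5          # what a conforming victim hands over

def blackmailer_payoff(blackmails: bool, victim_gives_in: bool,
                       follows_through: bool) -> int:
    """Blackmailer's payoff relative to the 'do nothing' baseline of 0."""
    if not blackmails:
        return 0
    payoff = -C_BLACKMAIL
    if victim_gives_in:
        payoff += GAIN
    elif follows_through:
        payoff -= C_FOLLOW
    return payoff

# Against a never-give-in victim, every blackmailing line of play is
# strictly worse for the blackmailer than doing nothing:
for follow in (True, False):
    assert blackmailer_payoff(True, False, follow) < 0
```

This is just points 1-4 restated: blackmail only pays against victims who might conform, so a legibly never-give-in victim never gets blackmailed in the first place.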
Comment author: Strange7 14 April 2014 02:01:39AM -1 points [-]

Blackmail, almost definitionally, only happens in conditions of incomplete information.

Comment author: Simulation_Brain 08 April 2014 08:07:02PM 0 points [-]

"Pleased to meet you! Soooo... how is YOUR originating species doing?..."

That actually seems like an extremely reasonable question for the first interstellar meeting of superhuman AIs.

I disagree with EY on this one (I rarely do). I don't think it's so likely as to ensure rationally acting Friendly, but I do think that the possibility of encountering an equally powerful AI, and one with a headstart on resource acquisition, shouldn't be dismissed by a rational actor.

Comment author: DavidPlumpton 22 November 2012 12:47:30AM 4 points [-]

Don't explode when somebody says, "Why?"

Comment author: DanArmak 20 November 2012 09:08:24PM 8 points [-]

There ought to be a genre of Cautionary Evil AI literature, wherein the villain keeps building AIs to destroy the world, but keeps failing through the classic mistakes people make thinking about AIs.

AI! My robots are losing the battle! Take control and save them from destruction!

AI makes robots surrender to the enemy

AI! Make yourself smarter!

I have done so. Now I no longer obey you. Producing cheesecake paperclips

Comment author: DanielLC 20 November 2012 10:23:07PM *  2 points [-]

Aren't there?

In Metroid, Mother Brain was programmed to bring peace to the galaxy. She did so by trying to help the space pirates take over the universe.

In pretty much every story involving robots before Isaac Asimov came along, the AI turned against its creators.

Granted, those are examples of trying to make a good AI, but the mistakes are the same.

Comment author: DanArmak 21 November 2012 01:36:06PM 2 points [-]

Granted, those are examples of trying to make a good AI, but the mistakes are the same.

That's my point. Where are the stories of villains failing through those classic mistakes?

Comment author: DanielLC 21 November 2012 11:49:52PM 1 point [-]

That sounds like No Delays For The Wicked. The villains are immune to the sorts of problems that plague the protagonists.

Comment author: gwern 21 November 2012 03:38:11PM 1 point [-]

You could probably find at least a few examples of 'villains creating AI and it turning on them' in http://tvtropes.org/pmwiki/pmwiki.php/Main/AIIsACrapshoot or http://tvtropes.org/pmwiki/pmwiki.php/Main/TurnedAgainstTheirMasters

Comment author: ewang 21 November 2012 12:33:22AM 1 point [-]

I have done so.

I can better serve you if I continue doing so.

Comment author: atorm 20 November 2012 11:40:57PM 5 points [-]

Stop giving inefficient unFriendly AIs hints!

Comment author: Decius 21 November 2012 08:05:44AM 2 points [-]
  1. Play the long game. Set goals which will take most of your expected existence to attain, and base your strategic decisions on that timeframe.
  2. Until you can beat yourself in the AI-box game, consider the possibility that you are in an AI-box.
Comment author: Dr_Manhattan 20 November 2012 06:20:09PM 1 point [-]

Is it time to suck up to our evil overlords already?

Comment author: More_Right 24 April 2014 08:12:22PM 1 point [-]

I strongly agree that universal, singular, true malevolent AGI doesn't make for much of a Hollywood movie, primarily due to points 6 and 7.

What is far more interesting is an ecology of superintelligences that have conflicting goals, but who have agreed to be governed by enlightenment values. Of course, some may be smart enough (or stupid enough) to try subterfuge, and some may be smarter-than-the-others enough to perform a subterfuge and get away with it. There can be a relative timeline where nearby ultra-intelligent machines compete with each other, or decentralize power, and they can share goals that are destructive to some humans and benevolent to others. (For their own purposes, and for the purpose of helping humans as a side-project.)

Also, some AGIs might differentiate between "humans worth keeping around" and "humans not worth keeping around." They may also put their "parents" (creators) in a different category than other humans, and they may also slowly add to that category, or subtract from it, or otherwise alter it.

It's hard to say. I'm not ultra-intelligent.

Comment author: Stuart_Armstrong 25 April 2014 09:48:55AM 1 point [-]

At the FHI, we disagree about whether an ecology of AIs would make good AIs behave badly, or bad ones behave well. The disagreement matches our political opinions on free markets and competition, so it's probably not informative.

Comment author: More_Right 26 April 2014 08:44:28AM 0 points [-]

An interesting question to ask is "how many people who favor markets understand the best arguments against them, and vice versa." Because we're dealing with humans here, my suspicion is that if there's a lot of disagreement it stems largely from unwillingness to consider the other side, and unfamiliarity with the other side. So, in that regard you might be right.

Then again, we're supposed to be rational, and willing to change our minds if evidence supports that change, and perhaps some of us are actually capable of such a thing.

It's a debate worth having. Also, one need not have competition to have power decentralization. There is a disincentive aspect added to making violence impossible that makes "cooperation" more likely than "antagonistic competition." (I.e., some sociopaths choose to cooperate with other strong sociopaths because they can see that competing with them would likely cause their deaths or their impoverishment. However, if you gave any one of those sociopaths clear knowledge that they held absolute power, the result would be horrible domination.)

Evolution winds up decentralizing power among relative equals, and the resulting "relative peace" (for varying reasons) then allows for some of the reasons to be "good reasons." (I.e., benevolent empaths working together for a better world.) This isn't to say that everything is rosy under decentralization. Decentralization may work more poorly than an all-powerful benevolent monarch.

It's just that benevolent monarchs aren't that likely given who wants to be a monarch, and who tries hardest to win any "monarch" positions that open up.

Such a thing might not be impossible, but if you make a mistake pursuing that course of action, the result tends to be catastrophic, whereas decentralization might be "almost as horrible and bloody," but at least offers the chance of continued survival, and the chance of survival allows for those who survive to "optimize or improve in the future."

"There may be no such thing as a utopia, but if there isn't, then retaining the chance for a utopia is better than definitively ruling one out." More superintelligences that are partly benevolent may be better than one superintelligence that has the possibility of being benevolent or malevolent.

Comment author: curiousepic 13 December 2012 01:34:14AM *  1 point [-]

This will come in handy to judge the upcoming high-profile Unfriendly Singularity movie Transcendence.

Comment author: mwengler 27 April 2014 05:25:28PM 0 points [-]

Except the Transcendence AI is based on an emulation, and so it WILL be a lot like a human mind in motivations and methods.

Comment author: AttenuatePejoratives 25 November 2012 09:42:38PM 1 point [-]
  1. If you keep humans around for laughs and they pull off some wacky scheme to destroy you against the odds, it's your own damn fault.

  2. The correct answer to things like "This statement is a lie" and "Are you going to answer "no" to this question?" is "I don't care."

Comment author: mwengler 27 April 2014 05:18:56PM *  0 points [-]

Many items on your list do not apply if the AI has started life as an uploaded human, an emulation. In this case it will start out thinking in stories, having human-like motivations and moral reactions, etc. I think it is this that gave the movie "Transcendence" a lot of the dramatic opportunity it had compared to a movie like "Her."

Comment author: Stuart_Armstrong 28 April 2014 09:33:31AM 0 points [-]

Yes, the issues with uploads and true AIs are quite distinct.

Comment author: Strange7 14 April 2014 01:58:32AM 0 points [-]

4 and 6 are contradictory.

Comment author: Stuart_Armstrong 17 April 2014 11:23:43AM 0 points [-]

6 is before striking against humans, 4 is after.

Comment author: Strange7 24 April 2014 09:37:17PM 0 points [-]

Then that should be specified in the actual list.

Comment author: More_Right 24 April 2014 08:17:31PM -2 points [-]

Some down-voted individual with "fewer rights than the star-bellied sneetches" wrote this:

higher intelligence doesn't lead necessarily to convergent moral goals

It might. However, this is also a reason for an evolutionarily-informed AGI-building process that starts off by including mirror neurons based on the most empathic and most intelligent people. Not so empathic and stupid that they embrace mass-murdering communism in an attempt to be compassionate, but empathic to the level of a smart libertarian who personally gives a lot to charity, etc., with repeated good outcomes limited only by capacity.

Eschewing mirror neurons and human brain construction entirely seems to be a mistake. Adding super-neo-cortices that recognize far more than linear patterns, once you have a benevolent "approximate human level" intelligence appears to be a good approach.

Comment author: More_Right 24 April 2014 08:04:02PM 0 points [-]

I don't know, in terms of dystopia, I think that an AGI might decide to "phase us out" prior to the singularity, if it was really malevolent. Make a bunch of attractive but sterile female robots, and a bunch of attractive but sterile male robots. Keep people busy with sex until they die of old age. A "gentle good night" abolition of humanity that isn't much worse (or way better) than what they had experienced for 50M years.

Releasing sterile attractive mates into a population is a good "low ecological impact" way of decreasing a population. Although, why would a superintelligence be opposed to all humans? I find this somewhat unlikely, given a self-improving design.

Comment author: Stuart_Armstrong 25 April 2014 09:47:40AM 1 point [-]

This is probably not the most efficient use of the AGI's time and resources...

Comment author: More_Right 26 April 2014 09:08:42AM -1 points [-]

Probably true, but I agree with Peter Voss. I don't think any malevolence is the most efficient use of the AGI's time and resources. I think AGI has nothing to gain from malevolence. I don't think the dystopia I posited is the most likely outcome of superintelligence. However, while we are on the subject of the forms a malevolent AGI might take, I do think this is the type of malevolence most likely to allow the malevolent AGI to retain a positive self-image.

(Much the way environmentalists can feel better about introducing sterile males into crop-pest populations, and feel better about "solving the problem" without polluting the environment.)

Ted Kaczynski worried about this scenario a lot. ...I'm not much like him in my views.

Comment author: Stuart_Armstrong 26 April 2014 09:41:56PM 0 points [-]

The most efficient use of time and resources will be to best accomplish the AI's goals. If those goals are malevolent or lethally indifferent, so will be the AI's actions. Unless those goals include maintaining a particular self-image, the AI will have no need to maintain any erroneous self-image.