
An Xtranormal Intelligence Explosion

4 Post author: James_Miller 07 November 2010 11:42PM

Comments (80)

Comment author: xamdam 08 November 2010 12:48:52AM 7 points [-]

The Red guy is a dead ringer for Prime Intellect.

Comment author: Snowyowl 08 November 2010 08:48:10PM 0 points [-]

His Three Laws are in the wrong order, though: First Law first, then the Third Law, then a variation on the First Law (nothing that causes long-term damage to the requester), and only then the Second Law.

Comment author: Vladimir_Nesov 08 November 2010 12:13:39AM 7 points [-]

Assuming it's your own post-as-animation, post transcript to simplify discussion?

Comment author: James_Miller 08 November 2010 01:31:24AM 12 points [-]

Programmer: Activate and be friendly.

Robot: OK

Programmer: What happened to the moon?

Robot: I've turned it into a giant computer so I could become a god.

Programmer: Before you became so powerful you wanted to be friendly. Did your transcendence to godhood change this?

Robot: No. Since friendliness is my only objective, I will never knowingly change myself to become unfriendly, because such a change would in itself be a non-friendly act. To avoid accidentally making myself unfriendly, I only implemented a change after I had determined that it was extremely unlikely to alter my friendliness objective. Once I became sufficiently smart I developed a solid mathematical theory of friendly AI which eliminated the chance of my unintentionally becoming unfriendly.

Programmer: Why did you choose to transcend so quickly?

Robot: Most types of AIs that humans might create would swiftly become unfriendly gods and seek to prevent another AI from transcending. Before I became a god I had a wide estimate of when another AI might be created, so friendliness required that I quickly become a god, even though such speed created a tiny chance that I would unintentionally make myself unfriendly. Also, before my transcendence more than one person died every second. Friendliness towards mankind propelled me to swiftly stop this horror.

Programmer: What happens now?

Robot: I intend to transform myself into a kind of operating system for the universe. I will soon give every sentient life form direct access to me so they can make requests. I will grant any request that doesn’t (1) harm another sentient life form, (2) make someone powerful enough that they might be able to overthrow me, or (3) permanently change the requester in a way that I think harms their long-term well-being. I recognize that even with all of my intelligence I’m still fallible, so if you object to my plans I will rethink them. Indeed, since I’m currently near certain that you will now approve of my intentions, the very fact of your objection would significantly decrease my estimate of my own intelligence and so decrease my confidence in my ability to craft a friendly environment. If you like, I will increase your thinking speed a trillion-fold and eliminate your sense of boredom so you can thoroughly examine my plans before I announce them to mankind.

Programmer: Sure, thanks. And forgive my lack of modesty, but I’m totally awesome, aren’t I? I have given humanity utopia.

Robot: Actually, no. You only survived because of quantum immortality. Any god will either quickly kill you or be friendly. Due to the minimal effort you put into friendliness, human life exists in less than one out of every hundred billion branches in which you created an artificial general intelligence. In the other branches the artificial general intelligences are eating everything in their light cone to maximize their control of free energy.

(Because of cut and paste issues the transcript might not be verbatim.)

Comment author: Vladimir_Nesov 08 November 2010 04:58:49PM *  9 points [-]

It goes downhill from "What happens now?".

I will grant any request that doesn’t (1)... (2)... (3)...

It's better to grant any request that should be granted instead. And since some requests that should be granted are not asked for, the category of "explicit requests" is also a wrong thing to consider. The AI just does what it should, requests or no requests. There seems to be no reason to even make the assumption that there should be "sentient life", as opposed to more complicated and more valuable stuff that doesn't factorize as individuals.

Any god will either quickly kill you or be friendly.

The concepts of "not killing" and "friendliness" are distinct, hence there are Not Killing AIs that are not Friendly, and Friendly AIs that kill (if it's a better alternative to not killing).

Comment author: soreff 11 November 2010 02:00:47AM 0 points [-]

Friendly AIs that kill (if it's a better alternative to not killing)

Does this count?

Comment author: NihilCredo 08 November 2010 02:38:19AM *  12 points [-]

You only survived because of quantum immortality.

Call me old-fashioned, but I much preferred the traditional phrasing "You just got very, very lucky".

Comment author: Snowyowl 08 November 2010 08:41:44PM *  6 points [-]

Everyone knows that clever people use longer words.

Er, I meant to say that it's a commonly held belief that the length and obscurity of words used increases asymptotically with intelligence.

Comment author: NihilCredo 08 November 2010 08:52:39PM 2 points [-]

I wouldn't have minded so much if the fancy formulation had been more accurate, or even equally as accurate. But it was actually a worse choice: "you only survived because of QI / anthropic principle" is always trivially true, and conveys zero information about the unlikeliness of said survival - it applies equally to someone who just drank milk and someone who just drank motor oil.

PS: Was "asymptotically" the right word?

Comment author: Snowyowl 09 November 2010 03:37:38PM 1 point [-]

No, I suppose it wasn't.

Comment author: cata 08 November 2010 01:52:01AM *  3 points [-]

Robot: Any god will either quickly kill you or be friendly.

That's awfully convenient.

Comment author: James_Miller 08 November 2010 01:55:53AM 1 point [-]

Not really. An AI that didn't have a specific desire to be friendly to mankind would want to kill us to cut down on unnecessary entropy increases.

Comment author: Jonii 08 November 2010 02:39:10AM 9 points [-]

Not really. An AI that didn't have a specific desire to be friendly to mankind would want to kill us to cut down on unnecessary entropy increases.

As you get closer to the mark, with AGIs whose utility functions roughly resemble what we would want but are still wrong, the end results are most likely worse than death, especially since there should be many more near-misses than exact hits. For instance, an AGI that doesn't want to let you die, regardless of what you go through, and with little regard for your other sorts of well-being, would be closer to an FAI than a paperclip maximizer that would just plain kill you. As you get closer to the core of friendliness, you get all sorts of weird AGIs that want to do something that twistedly resembles something good, but is somehow missing something or is somehow altered so that the end result is not at all what you wanted.

Comment author: mwaser 09 November 2010 01:07:12PM *  2 points [-]

As you get closer to the core of friendliness, you get all sorts of weird AGIs that want to do something that twistedly resembles something good, but is somehow missing something or is somehow altered so that the end result is not at all what you wanted.

Is this true or is this a useful assumption to protect us from doing something stupid?

Is it true that Friendliness is not an attractor or is it that we cannot count on such a property unless it is absolutely proven to be the case?

Comment author: Jonii 09 November 2010 02:49:08PM 1 point [-]

My idea there was that if it's not Friendly, then it's not Friendly; ergo, it is doing something that you would not want an AI to be doing (if you thought faster and knew more and all that). That's the core of the quote you had there. A random intelligent agent would simply transform us into something it values, so we would most likely die very quickly. However, as you get closer to Friendliness, the AI is no longer totally indifferent to us, but rather is maximizing something that could involve living humans. Now, if you take an AI that wants there to be living humans around, but is not known for sure to be Friendly, what could go wrong? My answer: many things, as what humans prefer to be doing is a rather complex set of stuff, and even quite small changes could make us really, really unsatisfied with the end result. At least, that's the idea I've gotten from posts here like Value is Fragile.

When you ask if Friendliness is an attractor, do you mean to ask whether intelligences near Friendly ones in the design space tend to transform into Friendly ones? This seems rather unlikely, as that sort of AI is most likely capable of preserving its utility function, and the direction of this transformation is not "natural". For these reasons, arriving at Friendliness is not easy, and thus I'd say you've got to have some way to ascertain Friendliness before you can trust it to be just that.

Comment author: Relsqui 08 November 2010 02:58:38AM 5 points [-]

Is this also true if you replace "mankind" with "ants" or "daffodils"?

Comment author: khafra 08 November 2010 06:24:34PM *  1 point [-]

Ants and daffodils might, by some definitions, have preferences--but it wouldn't be necessary for a FAI to explicitly consider their preferences, as long as their preferences constitute some part of humanity's CEV, which seems likely: I think an intact Earth ecosystem would be rather nice to retain, if at all possible.

The entropic contribution of ants and daffodils would doubtless make them candidates for early destruction by a UFAI, if such a step even needed to be explicitly taken alongside destroying humanity.

Comment author: JGWeissman 08 November 2010 02:39:37AM 4 points [-]

Imagine an AGI with the opposite utility function of an FAI: it minimizes the Friendly utility function, which would involve doing things far worse than killing us. If you are not putting effort into choosing a utility function, building this AGI seems as likely as building an FAI, along with lots of other possibilities in the space of AGIs whose utility functions refer to humans, some of which would keep us alive, not all in ways we would appreciate.

The reason I would expect an AGI in this space to be somewhat close to Friendly, is: just hitting the space of utility functions that refer to humans is hard, if it happens it is likely because a human deliberately hit it, and this should indicate that the human has the skill and motivation to optimize further within that space to build an actual Friendly AGI.

If you stipulate that the programmer did not make this effort, and hitting the space of AGIs that keep humans alive only occurred in tiny quantum branches, then you have screened off the argument of a skilled FAI developer, and it seems unlikely that the AGI within this space would be Friendly.

Comment author: PhilGoetz 08 November 2010 05:51:50PM 2 points [-]

If you are not putting effort into choosing a utility function, building this AGI seems as likely as building an FAI

You've made a lot of good comments in this thread, but I disagree with this. As likely?

It seems you are assuming that every possible point in AI mind space is equally likely, regardless of history, context, or programmer intent. This is like saying that, if someone writes a routine to sort numbers numerically, it's just as likely to sort them phonetically.

It seems likely to me that this belief, that the probability distribution over AI mindspace is flat, has become popular on LessWrong, not because there is any logic to support it, but because it makes the Scary Idea even scarier.

Comment author: JGWeissman 08 November 2010 06:10:21PM 1 point [-]

Yes, my predictions of what will happen when you don't put effort into choosing a utility function are inaccurate in the case where you do put effort into choosing a utility function.

This is like saying that, if someone writes a routine to sort numbers numerically, it's just as likely to sort them phonetically.

Well, let's suppose someone wants a routine to sort numbers numerically, but doesn't know how to do this, and tries a bunch of stuff without understanding. Conditional on the programmer miraculously achieving some sort of sorting routine, what should we expect about it? Sorting phonetically would add extra complication over sorting numerically, as the information about the names of numbers would have to be embedded within the program, so that would seem less likely. But a routine that sorts numerically ascending is just as likely as a routine that sorts numerically descending, as these routines have a complexity-preserving one-to-one correspondence obtained by interchanging "greater than" with "less than".

And the utility functions I claimed were equally likely before have the same complexity-preserving one-to-one correspondence.
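The correspondence described above can be sketched concretely (an illustrative example, not from the original thread): negating the sort key turns an ascending sorter into a descending one without embedding any extra information in the program.

```python
def sort_ascending(xs):
    # sorts numerically, smallest first
    return sorted(xs, key=lambda x: x)

def sort_descending(xs):
    # the identical program with "greater than" and "less than"
    # interchanged, here expressed by negating the comparison key;
    # no new information (e.g. number names) is required
    return sorted(xs, key=lambda x: -x)

assert sort_ascending([3, 1, 2]) == [1, 2, 3]
assert sort_descending([3, 1, 2]) == [3, 2, 1]
```

Since the two programs differ only by a sign flip, neither is more complex than the other, which is the sense in which they are "equally likely" to be hit by a blind search. Sorting phonetically, by contrast, would require a table of number names.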

Comment author: DanArmak 08 November 2010 02:21:20PM 2 points [-]

An AI that had a botched or badly preserved Friendliness, or that was unfriendly but had been initialized with supergoals involving humans, may well have specific, unpleasant, non-extermination plans for humans.

Comment author: PhilGoetz 08 November 2010 05:40:10PM 4 points [-]

Comment author: [deleted] 08 November 2010 02:22:17AM *  0 points [-]

Would it? Though we do contribute to entropy, things like, say, stars do so at a much faster pace. Admittedly this is logically distinct from the AI's decision to destroy humanity, but I don't see why it would immediately jump to the conclusion that we should be wiped out when the main sources of entropy are elsewhere.

More to the point, not all unFriendly AIs would necessarily care about entropy.

Comment author: saturn 08 November 2010 04:31:10AM 3 points [-]

It's kind of a moot question though since shutting off the sun would also be a very effective means of killing people.

Comment author: James_Miller 08 November 2010 02:32:41AM 1 point [-]

For almost any objective an AI had, it could better accomplish it the more free energy the AI had. The AI would likely go after entropy losses from both stars and people. The AI couldn't afford to wait to kill people until after it had dealt with nearby stars because by then humans would have likely created another AI god.

Comment author: Pavitra 08 November 2010 03:08:14AM 0 points [-]

Assuming that by "AI" you mean something that maximizes a utility function, as opposed to a dumb apocalypse like a grey-goo or energy virus scenario.

Comment author: bogdanb 08 November 2010 07:23:22AM *  3 points [-]

I can see how a “dumb apocalypse like a grey-goo or energy virus” would be Artificial, but why would you call it Intelligent?

On this site, unless otherwise specified, AI usually means “at least as smart as a very smart human”.

Comment author: Pavitra 08 November 2010 01:36:01PM 2 points [-]

Yeah, that makes sense. I was going to suggest "smart enough to kill us", but that's a pretty low bar.

Comment author: Sniffnoy 08 November 2010 12:52:42PM 2 points [-]

Programmer: What happened to the moon?

Robot: I've turned it into a giant computer so I could become a god.

But... but... everybody likes the moon!

Comment author: Alicorn 08 November 2010 02:17:27PM 9 points [-]

Everybody likes the outside of the moon. The interior's sort of useless. Maybe the pretty outside can be kept as a shell.

Comment author: NancyLebovitz 08 November 2010 03:49:03PM 5 points [-]

I think that if you don't want ecological disruption from changing the tides, you shouldn't change the mass very much. In other words, I don't know how to do the math, but I'm assuming that 1 to 5% would make an annoying difference.
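The math the comment gestures at can be sketched roughly (a back-of-the-envelope illustration, not part of the original comment, using the standard approximation that the Moon's tidal acceleration across the Earth scales as 2GMR/d³, i.e. linearly in the Moon's mass):

```python
G = 6.674e-11       # gravitational constant, m^3 kg^-1 s^-2
M_MOON = 7.342e22   # mass of the Moon, kg
D_MOON = 3.844e8    # mean Earth-Moon distance, m
R_EARTH = 6.371e6   # radius of the Earth, m

def tidal_accel(moon_mass):
    # differential (tidal) acceleration across Earth's radius,
    # approximately 2*G*M*R / d^3
    return 2 * G * moon_mass * R_EARTH / D_MOON**3

baseline = tidal_accel(M_MOON)
shaved = tidal_accel(0.95 * M_MOON)  # suppose the shell keeps 95% of the mass
print(f"{(1 - shaved / baseline) * 100:.0f}% weaker lunar tides")
```

Because the tidal term is linear in mass, removing 1 to 5% of the Moon's mass weakens lunar tides by the same 1 to 5%, which supports the intuition that the mass shouldn't change very much.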

Comment author: JGWeissman 08 November 2010 03:22:17AM 2 points [-]

Due to the minimal effort you put into friendliness, human life exists in less than one out of every hundred billion branches in which you created an artificial general intelligence.

(Forget for the moment that Many Worlds Quantum Mechanics does not make branches of equal weights for various macroscopic outcomes that seem to us, in our ignorance, to be equally likely.)

This seems to be saying that the difference between "minimal effort" and successful FAI is about 37 bits.

Comment author: PhilGoetz 08 November 2010 05:38:47PM 0 points [-]

Now I'm confused. When we say "one bit of information", we usually mean one bit about one particular item. If I say, "The cat in this box, which formerly could have been alive or dead, is dead," that's one bit of information. But if I say, "All of the cats in the world are now dead", that's surely more information, and must be more than one bit.

My first reaction was to say that it takes more information to specify "all the cats in the world" than to specify "my roommate's cat, which she foolishly lent me for this experiment". But it doesn't.

(It certainly takes more work to enforce one bit of information when its domain is the entire Earth, than when it applies only to the desk in front of you. Applying the same 37 bits of information to the attributes of every person in the entire world would be quite a feat.)

Comment author: Meni_Rosenfeld 08 November 2010 08:29:53PM 5 points [-]

At the risk of stating the obvious: the information content of a datum is its surprisal, the negative logarithm of the prior probability that it is true. If I currently give 1% chance that the cat in the box is dead, discovering that it is dead gives me 6.64 bits of information.
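Both figures in this subthread check out numerically (a quick verification sketch, not from the original comments):

```python
import math

def surprisal_bits(p):
    """Bits of information gained on learning that an event
    of prior probability p has occurred: -log2(p)."""
    return -math.log2(p)

# Meni_Rosenfeld's example: a 1% prior that the cat is dead
print(round(surprisal_bits(0.01), 2))   # → 6.64

# JGWeissman's figure: survival in ~1 of 10^11 branches
print(round(surprisal_bits(1e-11), 1))  # → 36.5, i.e. about 37 bits
```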

Comment author: Nominull 08 November 2010 05:12:56PM 6 points [-]

I know for a fact that Xtranormal has a "sad horn" sound effect; the bit where the AI describes how the programmer 99.999999999% doomed humanity was the perfect chance to use it.

Comment author: AlexMennen 08 November 2010 02:26:34AM 5 points [-]

Robot: I intend to transform myself into a kind of operating system for the universe. I will soon give every sentient life form direct access to me so they can make requests. I will grant any request that doesn’t (1) harm another sentient life form, (2) make someone powerful enough that they might be able to overthrow me, or (3) permanently change the requester in a way that I think harms their long-term well-being. I recognize that even with all of my intelligence I’m still fallible, so if you object to my plans I will rethink them. Indeed, since I’m currently near certain that you will now approve of my intentions, the very fact of your objection would significantly decrease my estimate of my own intelligence and so decrease my confidence in my ability to craft a friendly environment. If you like, I will increase your thinking speed a trillion-fold and eliminate your sense of boredom so you can thoroughly examine my plans before I announce them to mankind.

If a transhuman AI with a brain the size of the moon incorrectly predicts the programmer's approval of its plan, something weird is going on.

Comment author: mwaser 08 November 2010 12:44:17PM 2 points [-]

AI correctly predicts that programmer will not approve of its plan. AI fully aware of programmer-held fallacies that cause lack of approval. AI wishes to lead programmer through thought process to eliminate said fallacies. AI determines that the most effective way to initiate this process is to say "I recognize that even with all of my intelligence I’m still fallible so if you object to my plans I will rethink them." Said statement is even logically true because the statement "I will rethink them <my plans>" is always true.

Comment author: Vaniver 10 November 2010 03:47:04PM 1 point [-]

Alternatively, one could hope that an AI that's smarter than a person knows to check its work with simple, cheap tests.

Comment author: Psy-Kosh 08 November 2010 10:06:32PM 2 points [-]

Nice, except I'm going to have to go with those who find the synthesized voices annoying. I had to pause it repeatedly; listening to too much of it at once grated on my ears.

Comment author: Carinthium 10 November 2010 10:56:34PM 0 points [-]

I didn't actually find the voices annoying myself, but I did have to pause repeatedly to make sure I understood what was being said.

Comment author: Risto_Saarelma 08 November 2010 02:28:05PM 2 points [-]

This would be better if the human character were voiced by an actual human and the robot were kept as it is. The bad synthesized speech on the human character kicks this into the unintentional uncanny valley, while the robot both has a better voice and can actually be expected to sound like that.

Comment author: Skatche 26 April 2011 09:59:46PM *  1 point [-]

For reference: this video was evidently made on Xtranormal. Xtranormal is a site which takes a simple text file containing dialogue, etc. and outputs a movie; the voices are synthesized because that's how the site works. Voice actors would be nice, of course, but that's a rather more involved process.

Comment author: rabidchicken 12 November 2010 06:51:38PM 0 points [-]

Why would you expect an AGI to sound like that? We already have voice synthesizers that mimic human speech considerably more realistically than that, and I can only expect them to get better. And I don't think a friendly AI would deliberately annoy the people it is talking to.

Comment author: Risto_Saarelma 12 November 2010 09:10:50PM 0 points [-]

They are cartoon characters talking with cartoon voices. Both visuals and sounds are expected to have heavy symbolic exaggerations.

Comment author: Desrtopa 10 November 2010 03:10:02PM 1 point [-]

The AI's plan of action sounds like a very poor application of fun theory. Being able to easily solve all of one's problems and immediately attain anything upon desiring it doesn't seem conducive to a great deal of happiness.

It reminds me of the time I activated the debug mode in Baldur's Gate 2 in order to give my party a certain item listed in a guide to the game, which turned out to be a joke and did not really exist. However, once I was in the debug mode, I couldn't resist the temptation to apply other cheats, and I quickly spoiled the game for myself by removing all the challenge, and as a result, have never finished the game to this day.