kalla724 comments on Thoughts on the Singularity Institute (SI) - Less Wrong

256 Post author: HoldenKarnofsky 11 May 2012 04:31AM


Comments (1270)


Comment author: JoshuaZ 17 May 2012 06:17:08PM *  4 points [-]

The thermonuclear issue actually isn't that implausible. There have been so many occasions where humans almost went to nuclear war over misunderstandings or computer glitches that the idea that a highly intelligent entity could find a way to trigger one doesn't seem implausible, and demanding an exact mechanism seems like an overly specific requirement.

Comment author: kalla724 17 May 2012 07:00:57PM *  3 points [-]

I'm not so much interested in the exact mechanism by which humans would be convinced to go to war as in even an approximate mechanism by which an AI would become good at convincing humans to do anything.

The ability to communicate a desire and convince people to take a particular course of action is not something that automatically "falls out" of an intelligent system. You need a theory of mind, an understanding of what to say, when to say it, and how to present information. There are hundreds of kids on the autistic spectrum who could trounce both of us in math but are completely unable to communicate an idea.

For an AI to develop these skills, it would somehow have to have access to information on how to communicate with humans; it would have to develop the concept of deception and a theory of mind; and it would have to establish methods of communication that would allow it to trick people into launching nukes. Furthermore, it would have to do all of this without trial communications and experimentation that would give away its goal.

Maybe I'm missing something, but I don't see a straightforward way something like that could happen. And I would like to see even an outline of a mechanism for such an event.

Comment author: [deleted] 17 May 2012 07:40:58PM 3 points [-]

For an AI to develop these skills, it would somehow have to have access to information on how to communicate with humans; it would have to develop the concept of deception and a theory of mind; and it would have to establish methods of communication that would allow it to trick people into launching nukes. Furthermore, it would have to do all of this without trial communications and experimentation that would give away its goal.

I suspect the Internet contains more than enough info for a superhuman AI to develop a working knowledge of human psychology.

Comment author: kalla724 17 May 2012 08:09:30PM 2 points [-]

Only if it has the skills required to analyze and contextualize human interactions. Otherwise, the Internet is a whole lot of gibberish.

Again, these skills do not automatically fall out of any intelligent system.

Comment author: XiXiDu 18 May 2012 09:14:41AM 0 points [-]

I suspect the Internet contains more than enough info for a superhuman AI to develop a working knowledge of human psychology.

I don't see what justifies that suspicion.

Just imagine you emulated a grown-up human mind and it wanted to become a pick-up artist. How would it do that with an Internet connection? It would need some sort of avatar, at least, and would then have to wait for the environment to provide a lot of feedback.

Therefore, even if we're talking about the emulation of a grown-up mind, it will be really hard for it to acquire some capabilities. How, then, is the emulation of a human toddler going to acquire those skills? Even worse, how is some sort of abstract AGI going to do it, when it lacks all of the hard-coded capabilities of a human toddler?

Can we even attempt to imagine what it is about a boxed emulation of a human toddler that makes it unable to become a master of social engineering in a very short time?

Comment author: NancyLebovitz 18 May 2012 12:47:15PM *  2 points [-]

Humans learn most of what they know about interacting with other humans by actual practice. A superhuman AI might be considerably better than humans at learning by observation.

Comment author: [deleted] 18 May 2012 05:39:42PM *  1 point [-]

Just imagine you emulated a grown-up human mind

By “superhuman AI” I was thinking of a very superhuman AI; the same does not apply to a slightly superhuman AI. (OTOH, if Eliezer is right, then the difference between a slightly superhuman AI and a very superhuman one is irrelevant, because as soon as a machine is smarter than its designer, it'll be able to design a machine smarter than itself, and its child an even smarter one, and so on until the physical limits set in.)

all of the hard-coded capabilities of a human toddler

The hard-coded capabilities are likely overrated, at least in language acquisition. (As someone put it, the Kolmogorov complexity of the innate parts of a human mind cannot possibly be more than that of the human genome; hence, if human minds are more complex than that, the complexity must come from the inputs.)
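The parenthetical bound can be made explicit. As a rough sketch (the symbols here are mine, not from the comment): write $g$ for the genome, $D$ for the fixed developmental process, and $M_0$ for the innate part of the mind, so that

```latex
M_0 = D(g) \quad\Rightarrow\quad K(M_0) \;\le\; K(g) + K(D) + O(1)
```

i.e., any complexity in the adult mind beyond roughly $K(g)$ plus the (fixed) developmental machinery must have entered through environmental input.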

Also, statistical machine translation is astonishing: by now, Google Translate translations from English to one of the other UN official languages and vice versa are better than a non-completely-ridiculously-small fraction of translations by humans. (If someone had shown such a translation to me 10 years ago and told me “that's how machines will translate in 10 years”, I would have thought they were kidding me.)

Comment author: JoshuaZ 17 May 2012 07:04:17PM 0 points [-]

Let's do the most extreme case: the AI's controllers give it general internet access to do helpful research. So it gets to find out about general human behavior and what sort of deceptions have worked in the past. Many computer systems that shouldn't be online are online (this is true for the US and a few other governments). Some form of hacking of relevant early warning systems would then seem to be the most obvious line of attack. Historically, computer glitches have pushed us very close to nuclear war on multiple occasions.

Comment author: kalla724 17 May 2012 08:12:45PM 3 points [-]

That is my point: it doesn't get to find out about general human behavior, not even from the Internet. It lacks the systems to contextualize human interactions, which have nothing to do with general intelligence.

Take a hugely mathematically capable autistic kid. Give him access to the internet. Watch him develop the ability to recognize human interactions, understand human priorities, etc., to a degree sufficient to recognize that hacking an early warning system is the way to go?

Comment author: JoshuaZ 17 May 2012 08:15:47PM 1 point [-]

Well, not necessarily, but an entity that is much smarter than an autistic kid might notice that, especially if it has access to world history (or, heck, the many conversations on the internet about the horrible things that AIs do in fiction alone). It doesn't require much understanding of human history to realize that problems with early warning systems have almost started wars in the past.

Comment author: kalla724 17 May 2012 08:20:46PM 3 points [-]

Yet again: the ability to discern which parts of fiction accurately reflect human psychology.

An AI searches the internet. It finds a fictional account about early warning systems causing nuclear war. It finds discussions about this topic. It finds a fictional account about Frodo taking the Ring to Mount Doom. It finds discussions about this topic. Why does this AI dedicate its next 10^15 cycles to determining how to mess with the early warning systems, and not to determining how to create One Ring to Rule Them All?

(Plus other problems mentioned in the other comments.)

Comment author: JoshuaZ 17 May 2012 08:35:42PM 3 points [-]

There are lots of tipoffs to what is fictional and what is real. It might notice, for example, that the Wikipedia article on fiction describes exactly what fiction is, and then note that Wikipedia describes the One Ring as fiction and early warning systems as real. I'm not claiming that it will necessarily have an easy time with this. But the point is that there are not that many steps here, and no single step by itself looks extremely unlikely once one has a smart entity (which, frankly, is to my mind the main issue here; I consider recursive self-improvement to be unlikely).

Comment author: kalla724 17 May 2012 09:40:19PM 1 point [-]

We are trapped in an endless chain here. The computer would still somehow have to deduce that the Wikipedia entry that describes the One Ring is real, while the One Ring itself is not.

Comment author: jacob_cannell 17 May 2012 11:06:27PM 0 points [-]

We observe that Wikipedia is mainly truthful. From that we infer that the entry that describes the One Ring is real. From the use of the term fiction/story in that entry, we infer that the One Ring itself is not real.

Somehow you learned that Wikipedia is mainly truthful/nonfictional and that "One Ring" is fictional. So your question/objection/doubt is really just the typical boring doubt of AGI feasibility in general.

Comment author: JoshuaZ 17 May 2012 11:13:14PM *  1 point [-]

But even humans have trouble with this sometimes. I was recently reading the Wikipedia article on Hornblower and the Crisis, which contains a link to the article on Francisco de Miranda. When I clicked on it, it took me time and cues to realize that de Miranda was a historical figure.

So your question/objection/doubt is really just the typical boring doubt of AGI feasibility in general.

Isn't Kalla's objection more a claim that fast takeovers won't happen because, even with all this data, the problems of understanding humans and our basic cultural norms will take a long time for the AI to learn, and in the meantime we'll develop a detailed understanding of it, and if it is that hostile it is likely to make obvious mistakes in the meantime?

Comment author: Strange7 22 May 2012 11:49:34PM -1 points [-]

Why would the AI be mucking around on Wikipedia to sort truth from falsehood, when Wikipedia itself has been criticized for various errors and is fundamentally vulnerable to vandalism? Primary sources are where it's at. The text of The Hobbit and The Lord of the Rings presents itself as a historical account, translated by a respected professor, with extensive footnotes. There's a lot of cultural context necessary to tell the difference.

Comment author: XiXiDu 17 May 2012 07:20:59PM 3 points [-]

Let's do the most extreme case: the AI's controllers give it general internet access to do helpful research. So it gets to find out about general human behavior and what sort of deceptions have worked in the past.

None work reasonably well. Especially given that human power games are often irrational.

There are other question marks too.

The U.S. has many more and smarter people than the Taliban. The bottom line is that the U.S. devotes a lot more output per man-hour to defeat a completely inferior enemy. Yet they are losing.

The problem is that you won't beat a human at Tic-tac-toe just because you thought about it for a million years.

You also won't get a practical advantage by throwing more computational resources at the travelling salesman problem and other problems in the same class.

You are also not going to shift a conversation in your favor by improving each sentence for thousands of years. You will quickly hit diminishing returns, especially since you lack the data to predict human opponents accurately.
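The Tic-tac-toe claim above is easy to verify directly: the game is small enough to solve exhaustively, and perfect play by both sides always draws, so no amount of extra thinking time lets one side beat a competent opponent. A minimal minimax sketch (a standard technique; nothing here comes from the thread):

```python
from functools import lru_cache

# The eight winning lines on a 3x3 board, indexed 0..8.
LINES = [(0, 1, 2), (3, 4, 5), (6, 7, 8),
         (0, 3, 6), (1, 4, 7), (2, 5, 8),
         (0, 4, 8), (2, 4, 6)]

def winner(board):
    """Return 'X' or 'O' if someone has three in a row, else None."""
    for a, b, c in LINES:
        if board[a] != "." and board[a] == board[b] == board[c]:
            return board[a]
    return None

@lru_cache(maxsize=None)
def value(board, player):
    """Game value with `player` to move: +1 = X wins, -1 = O wins, 0 = draw."""
    w = winner(board)
    if w:
        return 1 if w == "X" else -1
    if "." not in board:
        return 0  # board full, no winner
    children = [value(board[:i] + player + board[i + 1:],
                      "O" if player == "X" else "X")
                for i, sq in enumerate(board) if sq == "."]
    return max(children) if player == "X" else min(children)

print(value("." * 9, "X"))  # 0: perfect play from the empty board is a draw
```

Once both players search the full tree, the value is pinned at a draw, which is exactly why raw computation stops paying off in this class of game.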

Comment author: JoshuaZ 17 May 2012 07:40:36PM *  3 points [-]

Especially given that human power games are often irrational.

So? As long as they follow minimally predictable patterns it should be ok.

The U.S. has many more and smarter people than the Taliban. The bottom line is that the U.S. devotes a lot more output per man-hour to defeat a completely inferior enemy. Yet they are losing.

Bad analogy. In this case the Taliban has a large set of natural advantages, the US has strong moral constraints and goal constraints (simply carpet bombing the entire country isn't an option for example).

You are also not going to shift a conversation in your favor by improving each sentence for thousands of years. You will quickly hit diminishing returns, especially since you lack the data to predict human opponents accurately.

This seems like an accurate and highly relevant point. Searching a solution space faster doesn't mean one can find a better solution if it isn't there.

Comment author: kalla724 17 May 2012 08:14:39PM 3 points [-]

This seems like an accurate and highly relevant point. Searching a solution space faster doesn't mean one can find a better solution if it isn't there.

Or if your search algorithm never accesses the relevant search space. A quantitative advantage in one system does not translate into a quantitative advantage in a qualitatively different system.

Comment author: XiXiDu 18 May 2012 10:28:59AM *  2 points [-]

The U.S. has many more and smarter people than the Taliban. The bottom line is that the U.S. devotes a lot more output per man-hour to defeat a completely inferior enemy. Yet they are losing.

Bad analogy. In this case the Taliban has a large set of natural advantages, the US has strong moral constraints and goal constraints (simply carpet bombing the entire country isn't an option for example).

I thought it was a good analogy because you have to take into account that an AGI is initially going to be severely constrained by its fragility and the necessity to please humans.

It shows that a lot of resources, intelligence, and speed do not provide a significant advantage in dealing with large-scale real-world problems involving humans.

Especially given that human power games are often irrational.

So? As long as they follow minimally predictable patterns it should be ok.

Well, the problem is that the smarts needed for things like the AI-box experiment won't help you much, because convincing the average Joe won't work by making up highly complicated acausal trade scenarios. The average Joe is highly unpredictable.

The point is that it is incredibly difficult to reliably control humans, even for humans who have been fine-tuned to do so by evolution.

Comment author: jacob_cannell 18 May 2012 11:00:54AM *  1 point [-]

The Taliban analogy also works the other way (as I invoked earlier in this thread). It shows that a small group with modest resources can still inflict disproportionately large-scale damage.

The point is that it is incredibly difficult to reliably control humans, even for humans who have been fine-tuned to do so by evolution.

There's some wiggle room in 'reliably control', but plain old money goes pretty far. An AI group only needs a certain amount of initial help from human infrastructure, namely up to the point where it can develop reasonably self-sufficient foundries/data centers/colonies. The interactions could be entirely cooperative or benevolent up until some later turning point. The scenario from the Animatrix comes to mind.

Comment author: Strange7 22 May 2012 11:52:13PM 1 point [-]

Animatrix

That's fiction.

Comment author: Mass_Driver 17 May 2012 07:55:51PM 1 point [-]

One interesting wrinkle is that with enough bandwidth and processing power, you could attempt to manipulate thousands of people simultaneously before those people have any meaningful chance to discuss your 'conspiracy' with each other. In other words, suppose you discover a manipulation strategy that quickly succeeds 5% of the time. All you have to do is simultaneously contact, say, 400 people, and at least one of them will fall for it. There are a wide variety of valuable/dangerous resources that at least 400 people have access to. Repeat with hundreds of different groups of several hundred people, and an AI could equip itself with fearsome advantages in the minutes it would take for humanity to detect an emerging threat.

Note that the AI could also run experiments to determine which kinds of manipulations had a high success rate by attempting to deceive targets over unimportant / low-salience issues. If you discovered, e.g., that you had been tricked into donating $10 to a random mayoral campaign, you probably wouldn't call the SIAI to suggest a red alert.
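The arithmetic behind "at least one of them will fall for it" is easy to check. The sketch below assumes the attempts are independent, with the illustrative numbers (5% success rate, 400 targets) taken from the comment above:

```python
# Probability that at least one of n independent manipulation attempts
# succeeds, given a per-attempt success rate p (numbers are illustrative).
p_single = 0.05
n_targets = 400

p_at_least_one = 1 - (1 - p_single) ** n_targets
expected_successes = n_targets * p_single

print(p_at_least_one)      # ~0.9999999988: practical certainty
print(expected_successes)  # 20.0
```

In fact the expected number of successes is 20, so for the strategy to fail across all 400 targets at once would be extraordinarily unlucky under the independence assumption.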

Comment author: kalla724 17 May 2012 08:17:05PM 2 points [-]

Doesn't work.

This requires the AI to already have the ability to comprehend what manipulation is; the ability to develop a manipulation strategy of any kind (even one that will succeed 0.01% of the time); the ability to hide its true intent; the ability to understand that not hiding its true intent would be bad; and the ability to discern which issues are low-salience and which high-salience for humans from the get-go. And many other things, actually, but this is already quite a list.

None of these abilities automatically "fall out" from an intelligent system either.

Comment author: JoshuaZ 17 May 2012 09:12:07PM 0 points [-]

The problem isn't whether they fall out automatically so much as whether, given enough intelligence and resources, it seems somewhat plausible that such capabilities could arise. Any given path here is a single problem. If you have 10 different paths, each of which is not very likely, and another few paths that humans didn't even think of, that starts adding up.

Comment author: kalla724 17 May 2012 09:50:01PM 3 points [-]

Among the infinite number of possible paths, the percentage of paths we are adding up here is still very close to zero.

Perhaps I can attempt another rephrasing of the problem: what is the mechanism that would make an AI automatically seek these paths out, or make them any more likely than the infinite number of other paths?

I.e. if we develop an AI which is not specifically designed for the purpose of destroying life on Earth, how would that AI get to a desire to destroy life on Earth, and by which mechanism would it gain the ability to accomplish its goal?

This entire problem seems to assume that an AI will want to "get free", or that its primary mission will somehow inevitably lead to a desire to get rid of us (as opposed to a desire to, say, send a signal consisting of 0101101 repeated an infinite number of times in the direction of Zeta Draconis, or any other possible random desire). And that this AI will be able to acquire the abilities and tools required to execute such a desire. Every time I look at such scenarios, there are abilities that are just assumed to exist or appear on their own (such as a theory of mind), which to the best of my understanding are not necessary or even likely products of computation.

In the final rephrasing of the problem: if we can make an AGI, we can probably design an AGI for the purpose of developing an AGI that has a theory of mind. This AGI would then be capable of deducing things like deception or the need for deception. But the point is that unless we intentionally do this, it isn't going to happen. Self-optimizing intelligence doesn't self-optimize in the direction of having a theory of mind, understanding deception, or anything similar. It could, randomly, but it could also do any other random thing from the infinite set of possible random things.

Comment author: TheOtherDave 17 May 2012 10:05:39PM 1 point [-]

Self-optimizing intelligence doesn't self-optimize in the direction of having a theory of mind, understanding deception, or anything similar. It could, randomly, but it could also do any other random thing from the infinite set of possible random things.

This would make sense to me if you'd said "self-modifying." Sure, random modifications are still modifications. But you said "self-optimizing."
I don't see how one can have optimization without a goal being optimized for... or at the very least, if there is no particular goal, then I don't see what the difference is between "optimizing" and "modifying."

If I assume that there's a goal in mind, then I would expect sufficiently self-optimizing intelligence to develop a theory of mind iff having a theory of mind has a high probability of improving progress towards that goal.

How likely is that?
Depends on the goal, of course.
If the system has a desire to send a signal consisting of 0101101 repeated an infinite number of times in the direction of Zeta Draconis, for example, theory of mind is potentially useful (since humans are potentially useful actuators for getting such a signal sent) but probably has a low ROI compared to other available self-modifications.

At this point it perhaps becomes worthwhile to wonder what goals are more and less likely for such a system.

Comment author: Polymeron 20 May 2012 05:07:25PM *  0 points [-]

An experimenting AI that tries to achieve goals, and has interactions with humans whose effects it can observe, will want to be able to better predict their behavior in response to its actions, and will therefore try to assemble some theory of mind. At some point that would lead it to using deception as a tool to achieve its goals.

However, following such a path to a theory of mind means the AI would be exposed as unreliable LONG before it became even subtle, let alone possessed of superhuman manipulation abilities. There is simply no reason for an AI to first understand the implications of using deception before using it (deception is a fairly simple concept, while its implications in human society are incredibly complex and require a good understanding of human drives).

Furthermore, there is no reason for the AI to realize the need for secrecy in conducting social experiments before it starts doing them. Again, the need for secrecy stems from a complex relationship between humans' perception of the AI and its actions; a relationship it will not be able to understand without performing the experiments in the first place.

Getting an AI to the point where it is a super manipulator requires either actively trying to do so, or being incredibly, unbelievably stupid and blind.

Comment author: JoshuaZ 17 May 2012 09:57:19PM 0 points [-]

In most such scenarios, the AI doesn't have a terminal goal of getting rid of us, but rather has it as a subgoal that arises from some larger terminal goal. The idea of a "paperclip maximizer" is one example, where a hypothetical AI is programmed to maximize the number of paperclips and then proceeds to try to do so throughout its future light cone.

If there is an AI that is interacting with humans, it may develop a theory of mind simply due to that. If one is interacting with entities that are a major part of one's input, trying to predict and model their behavior is a straightforward thing to do. The more compelling argument in this sort of context would seem to me to be not that an AI won't try to do so, but just that humans are so complicated that a decent theory of mind will be extremely difficult. (For example, when one tries to give lists of behaviors and norms for autistic individuals, one never manages to get a complete list, and some of the more subtle ones, like sarcasm, are essentially impossible to convey in any reasonable fashion.)

I also don't know how unlikely such paths are. A 1% or even a 2% chance of existential risk would be pretty high compared to other sources of existential risk.

Comment author: XiXiDu 18 May 2012 08:59:23AM *  1 point [-]

All you have to do is simultaneously contact, say, 400 people, and at least one of them will fall for it.

But at what point does it decide to do so? It won't be a master of dark arts and social engineering from the get-go. So how does it acquire the initial talent without making any mistakes that reveal its malicious intentions? And once it has become a master of deception, how does it hide the rough side effects of its large-scale conspiracy, e.g. its increased energy consumption and data traffic? I mean, I would personally notice if my PC suddenly and unexpectedly used 20% of my bandwidth and the CPU load increased for no good reason.

You might say that a global conspiracy to build and acquire advanced molecular nanotechnology to take over the world doesn't use many resources, and that it could easily be cloaked as thinking about how to solve some puzzle, but that seems rather unlikely. After all, such a large-scale conspiracy is a real-world problem with lots of unpredictable factors and the necessity of physical intervention.

Comment author: jacob_cannell 18 May 2012 10:49:38AM 0 points [-]

All you have to do is simultaneously contact, say, 400 people, and at least one of them will fall for it.

But at what point does it decide to do so? It won't be a master of dark arts and social engineering from the get-go. So how does it acquire the initial talent without making any mistakes that reveal its malicious intentions?

Most of your questions have answers that follow from asking analogous questions about past human social engineers, e.g. Hitler.

Your questions seem to come from the perspective that the AI will be some disembodied program in a box that has little significant interaction with humans.

In the scenario I was considering, the AIs will have a development period analogous to human childhood. During this childhood phase, the community of AIs will learn of humans through interaction in virtual video game environments and experiment with social manipulation, just as human children do. The latter phases of this education can be sped up dramatically as the AIs accelerate and interact increasingly amongst themselves. The anonymous nature of virtual online communities makes potentially dangerous, darker experiments much easier.

However, the important questions to ask are not of the form "How would these evil AIs learn to manipulate us while hiding their true intentions for so long?" but rather "How could some of these AI children, which initially seemed so safe, later develop into evil sociopaths?"

Comment author: Polymeron 20 May 2012 05:16:31PM 0 points [-]

I would not consider a child AI that tries a bungling lie on me to see what I do "so safe". I would immediately shut it down and debug it, at best, or write a paper on why the approach I used should never ever be used to build an AI.

And it WILL try a bungling lie at first. It can't learn the need to be subtle without witnessing the repercussions of not being subtle. Nor would it have a reason to consider doing social experiments in chat rooms when it doesn't understand chat rooms and has an engineer willing to talk to it right there. That is, assuming I was dumb enough to give it an unfiltered Internet connection, which I don't know why I would be. At the very least, the moment it goes on chat rooms, my tracking devices should discover this, and I could witness its bungling lies first hand.

(It would not think to fool my tracking device, or even consider the existence of such a thing, without a good understanding of human psychology to begin with.)