[Template] Questions regarding possible risks from artificial intelligence

XiXiDu


10th Jan 2012


I am emailing experts in order to raise and estimate the academic awareness and perception of risks from AI. Below are some questions I am going to ask. Please help to refine the questions or suggest new and better questions.

(Thanks go to paulfchristiano, Steve Rayhawk and Mafred.)

Q1: Assuming beneficially political and economic development and that no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of artificial intelligence that is roughly as good as humans at science, mathematics, engineering and programming?

Q2: Once we build AI that is roughly as good as humans at science, mathematics, engineering and programming, how much more difficult will it be for humans and/or AIs to build an AI which is substantially better at those activities than humans?

Q3: Do you ever expect artificial intelligence to overwhelmingly outperform humans at typical academic research, in the way that they may soon overwhelmingly outperform humans at trivia contests, or do you expect that humans will always play an important role in scientific progress?

Q4: What probability do you assign to the possibility of an AI with initially (professional) human-level competence at general reasoning (including science, mathematics, engineering and programming) to self-modify its way up to vastly superhuman capabilities within a matter of hours/days/< 5 years?

Q5: How important is it to figure out how to make superhuman AI provably friendly to us and our values (non-dangerous), before attempting to build AI that is good enough at general reasoning (including science, mathematics, engineering and programming) to undergo radical self-modification?

Q6: What probability do you assign to the possibility of human extinction as a result of AI capable of self-modification (that is not provably non-dangerous, if that is even possible)?


[-]lukeprog14y80

My preferred rewrite, without spending too much time on it:

Q1a: Assuming no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of human-level machine intelligence? Feel free to answer ‘never’ if you believe such a milestone will never be reached. [reason: this matches question #1 of FHI's machine intelligence survey.]

Q1b: Once we build AIs that are as skilled at technology design and general reasoning as humans are, how much more difficult will it be for humans and/or AIs to build an AI which is substantially better than humans at technology design and general engineering?

Q2a: Do you ever expect AIs to overwhelmingly outperform humans at typical academic research, in the way that they may soon overwhelmingly outperform humans at trivia contests, or do you expect that humans will always play an important role in scientific progress?

Q2b: [delete to make questions list less dauntingly long, and increase response rate]

Q2c: What probability do you assign to the possibility of an AI with initially (professional) human-level competence at technology design and general reasoning to use its capacities to self-modify its way up to vastly superhuman general capabilities within a matter of hours/days/< 5 years? ('Self modification' may include the first AI creating improved child AIs, which create further-improved child AIs, etc.)

Q3a: How important is it to figure out how to make superhuman AI provably friendly to us and our values (non-dangerous), before attempting to build AI that is good enough at technology design and general reasoning to undergo radical self-modification?

Q3b: What probability do you assign to the possibility of human extinction as a result of AI capable of self-modification (that is not provably non-dangerous, if that is even possible)?

Q3c: [delete to reduce length of questions list]

Q4: [delete to reduce length of questions list]

Q5: [delete to reduce length of questions list; AI experts are unlikely to be experts on other x-risks]

Q6: [delete to reduce length of questions list; I haven't seen, and don't anticipate, useful answers here]

Q7: [delete to reduce length of questions list]

[-]timtyler14y40

I endorse the "question deletion" idea.

[-]fubarobfusco14y30

human-level machine intelligence

AIs that are as skilled at technology design and general reasoning as humans are

Are these two expressions supposed (or assumed) to be equivalent?

[-]XiXiDu14y00

I updated the original post. Maybe we could agree on those questions. Be back tomorrow.

[-]lukeprog14y10

I stand by my preferred rewrites above, but of course it's up to you.

[-]jhuffman14y00

I agree with deleting Q5 and Q6 because not only would I not expect useful responses but also because it may come off as "extremist" if any respondents are not already familiar with UFAI concepts (or if they are familiar and overtly dismissive of them).

[-]timtyler14y70

You get a lot of "human level - WTF" comments.

To avoid those, perhaps you could say what you actually mean:

More than "100" on IQ tests, pass the Turing test - or whatever.

[-]wedrifid14y40

To avoid those, perhaps you could say what you actually mean. 100 on IQ tests, pass the Turing test - or whatever.

IQ tests seem to be tests of how well humans can do things that you would already expect a computer to be better at! The difficult part seems to be parsing the question and translating it from natural language into a format the computer can tackle. No mean feat but not one requiring general intelligence! I'm not entirely sure it would be a more difficult task than having an everyday conversation at the level of a 70 IQ human. (This isn't to say that 'pass for human' is at all equivalent to 'human level' either.)

"About as good as a well-trained human of average intelligence is at doing scientific research" seems to be approximately what 'human level' is getting at.

[-]timtyler14y00

IQ tests seem to be tests of how well humans can do things that you would already expect a computer to be better at!

Maybe. Machines can outperform humans in some parts of IQ tests.

...but they don't get good scores overall yet.

An IQ 100 machine would be quite something. An IQ 150 machine would be even more interesting.

[-]Vaniver14y50

What I would put as question 1 (with three parts):

(a) What does the (concept/phrase) of "human-level machine intelligence" mean to you? (b) What forms of machine intelligence are you most optimistic about? (c) What forms do you think could be the most dangerous?

Rationale: it seems to me that the most useful part of Nilsson's response was his alternate definition of human-level intelligence. Moving AI experts from the ridiculous mode of "what probability do you place on Terminator occurring?" to the serious mode of "what could go wrong with a potential design?" both signals your seriousness as a thinker and primes them to take AI risks seriously, since they came up with the doomsday scenario. It also seems like getting a sense of what direction AI experts think AI will take is useful: if experts are optimistic about machine intelligence hardware design, then FOOMing is more likely. (It might even be useful to ask about areas they're pessimistic about, since that's a different question than danger, but four questions for the first question seems like a bit much.)

Drawback: what you're interested in is cross-domain optimization competence. If people give you numbers based on when machine intelligence will be able to pass a Turing test, those numbers will be mostly useless. Even the numbers Nilsson gave for his 'employable AI' are difficult to compare to numbers other people are giving. But it seems to me that knowing better what they mean is more important than easy comparisons.

Overall, I feel a bit better about lukeprog's rewrite than I do about the original. I do think at least one question about AI risk countermeasures should be preserved, probably something like this:

Q4. What is the ideal level of awareness and funding for AI risk reduction?

Possibly with a clarification that they can either give a dollar number or just compare it to some other cause (like a particular variety of cancer, other existential risks, etc.).

[-]multifoliaterose14y40

The whole of question 3 seems problematic to me.

Concerning parts (a) and (b), I doubt that researchers will know what you have in mind by "provably friendly." For that matter I myself don't know what you have in mind by "provably friendly" despite having read a number of relevant posts on Less Wrong.

Concerning part (c), I doubt that experts are thinking in terms of the money needed to possibly mitigate AI risks at all; presumably in most cases if they saw this as a high-priority and tractable issue they would have written about it already.

[-]Antisuji14y70

Not only that, 3b seems to presuppose that the only dangerous AI is a recursively self-improving one.

[-]AlexMennen14y30

Q6 is confusing. Are you asking for P(human extinction by AI that is capable of self-modification and not provably non-dangerous), P(human extinction by AI | AI capable of self-modification and not provably non-dangerous is created), or P(human extinction by first such AI | AI capable of self-modification and not provably non-dangerous is created)?
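To see why the three readings matter, here is a minimal sketch with made-up numbers. Every probability below is a hypothetical placeholder, not an estimate from anyone in this thread; the only point is that the same beliefs yield three different answers depending on which quantity the question asks for.

```python
# All inputs are hypothetical placeholders chosen only for illustration.
P_CREATED = 0.5               # P(a self-modifying, not provably safe AI is ever created)
P_EXT_GIVEN_CREATED = 0.2     # P(extinction by such an AI | such an AI is created)
P_FIRST_GIVEN_EXT = 0.5       # P(the first such AI is the culprit | extinction by such an AI)


def three_readings(p_created, p_ext_given_created, p_first_given_ext):
    """Return the three quantities the question could be asking for."""
    # Reading 1: unconditional probability of extinction by such an AI.
    joint = p_created * p_ext_given_created
    # Reading 2: probability of extinction, given that such an AI is created.
    conditional = p_ext_given_created
    # Reading 3: as reading 2, but only counting the first such AI.
    first_only = p_ext_given_created * p_first_given_ext
    return {"joint": joint, "conditional": conditional, "first_only": first_only}


readings = three_readings(P_CREATED, P_EXT_GIVEN_CREATED, P_FIRST_GIVEN_EXT)
for name, value in readings.items():
    print(f"{name}: {value:.2f}")
```

An expert who holds exactly these beliefs could honestly answer with any of the three numbers, which is why the question needs to pin down the conditioning.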

[-]siodine14y30

I think you should replace "superhuman AI" with something like "a singular AI capable of having a catastrophic global impact." Anything that isn't sourced from nerd culture, basically. I also preferred "provably non-dangerous" to "provably friendly."

[-]Bugmaster14y20

Q2 is too nebulous. What do you mean by "how much more difficult"? How do you measure "difficulty"?

Q5 glosses over the main problem: we don't know what "our values" even are. There's wide disagreement on this topic among practically all communities.

Q6 is not entirely clear on whether you're asking for cumulative probability, or a single random variable. You also do not define what "extinction" is.

[-]timtyler14y10

Q4: What probability do you assign to the possibility of an AI with initially (professional) human-level competence at general reasoning (including science, mathematics, engineering and programming) to self-modify its way up to vastly superhuman capabilities within a matter of hours/days/< 5 years?

I am not sure whether anyone thinks that is true. If you look at the claims by E. Yudkowsky they typically say something like:

I think that at some point in the development of Artificial Intelligence, we are likely to see a fast, local increase in capability - "AI go FOOM".

Yudkowsky appears to be hedging his bets on when this is going to happen - by saying: "at some point". There's not much sign of anything like: "initially (professional) human-level competence".

Does anyone believe such a thing will happen then? At first glance the claim makes little sense: we already know how fast progress goes with agents of "professional human-level competence" knocking around - and it just isn't that fast.

[-]Dan_Moore14y00

Q1: I think 'beneficially' should be 'beneficial'.

[-]Plasmon14y00

You might want to include some context, especially about why you think AIs might pose a threat. Yes, there are reasons to not do this but some people seem to not have considered the issue at all, or immediately jump to sci-fi tropes involving robots with human-like desires for power/revenge/...

[-]Vaniver14y10

The main concern I would have here is emailing a busy expert to say "here's a bunch of background material you may or may not be familiar with, please read it then answer some questions" seems like a poor way to get responses.

[-]Thomas14y-40

Q1a: Assuming beneficially political and economic development ...

2018/2025/2045

Q1b: Once our understanding of AI ...

Not difficult at all. It follows nearly automatically.

Q2a: Do you ever expect automated systems ...

Science is too important to be left to humans. Those systems will outperform humans, of course. By a LARGE margin.

Q2b: To what extent does human engineering ...

Not to a great extent. Can be done from scratch if need be.

Q2c: What probability do you assign ...

10%/40%/99%

Q3a: Is it important to figure out how to make AI provably friendly

There will be no time to prove it mathematically in advance.

Q3b: What probability do ....

99% that humans will be wiped out. We may survive as non humans - 50%.

Q3c: How much money ...

Billion, maybe.

Q4: What is the current level of awareness ...

I am glad that there is no mass hysteria about it.

Q5: ... Do possible risks from AI outweigh

Yes.

Q6: Can you think of any milestone...

Several.

Q7: How much have you read about the formal concepts ...

Quite a bit.

[-]TimS14y20

Q1b: Once our understanding of AI ... [make regular AI --> FOOM]

Not difficult at all. It follows nearly automatically.

As a FOOM skeptic, can I ask you to show your reasoning a little more? Thinking faster is great, but there's a lower bound on the time it takes to solve certain types of hard problems.

Q3a: Is it important to figure out how to make AI provably friendly

There will be no time to prove it mathematically in advance.

Wait, what? At the very least, consider the implications of the chronophone

Off-topic

Q1a: Assuming beneficially political and economic development ... [when human AI]

2018/2025/2045

This intuition may be wrong, but if I thought there was a 50% chance of GAI (human level) by 2025, I'd estimate a 10% chance essentially now (2012-2014). I guess this shows that our estimated shape of the probability distribution (what we think sigma is) is very different. Interesting.
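The difference in implied spread can be made concrete with a rough sketch. It assumes, purely for illustration, that each forecast is a normal distribution over arrival years, which is a poor fit for the skewed 2018/2025/2045 estimate (the 90% year is much further from the median than the 10% year). The helper name `implied_sigma` and the choice of 2013 as the midpoint of "2012-2014" are my own assumptions.

```python
from statistics import NormalDist


def implied_sigma(year_10pct: float, year_median: float) -> float:
    """Standard deviation (in years) of a normal distribution whose
    10th percentile and median fall on the given years."""
    z10 = NormalDist().inv_cdf(0.10)  # standard normal 10th percentile, negative
    return (year_10pct - year_median) / z10


# Forecast quoted above: 10% by 2018, 50% by 2025.
sigma_quoted = implied_sigma(2018, 2025)
# My intuition: 10% "essentially now" (2013 assumed), 50% by 2025.
sigma_mine = implied_sigma(2013, 2025)

print(f"sigma for 2018/2025: {sigma_quoted:.1f} years")
print(f"sigma for 2013/2025: {sigma_mine:.1f} years")
```

Under this normal approximation the two forecasts share a median but differ in sigma by roughly a factor of 1.7 (about 5.5 years versus about 9.4 years).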

[-]Thomas14y00

As a FOOM skeptic, can I ask you to show your reasoning a little more?

The next FOOM will be only the faster phase of the already functioning one. The one from the primordial Earth to now. Or the one from the Big Bang to now.

Nothing new, except the speed.

[-]TimS14y20

Respectfully, "like the Big Bang, only faster" does nothing to answer my question. I'm hardly committed to believing AI will go FOOM based on my belief in the Big Bang. Likewise with my belief in the evolution of life on Earth.

[-]Thomas14y-10

Not "like Big Bang, only faster", but "like from Big Bang to today, only faster" or "like from the Roman Empire to today, only faster". Or "like from the first cell to an ape, only faster".

How fast a transformation goes is a matter of degree inside the possible physics. But if something "evolves" very fast, you can call it FOOM more easily. Only that.

Now, what makes me think that some "intelligent" program could change its hardware as well? And fast!?

Because there is no real dichotomy here. Every bit has its physical imprint and every calculation is also a physical process. Nothing forbids quite a large influence onto surrounding matter and a positive feedback.

Does it?

[-]asr14y20

Because there is no real dichotomy here. Every bit has its physical imprint and every calculation is also a physical process. Nothing forbids quite a large influence onto surrounding matter and a positive feedback.

Yes, there are things that forbid this. Typically when we design a CPU, one of the design requirements is that no sequence of instructions can alter the hardware in irreversible ways. A reset should really put it back to a consistent state. Yes, it's possible that the hardware has the potential for unexpected alteration from software, but I wouldn't bet on that as a magic capability without real evidence. It takes a lot of energy to alter silicon and digital logic circuits just don't have that kind of power.

So, given a correctly-designed CPU, any positive-feedback loop here has to go off-chip, which usually means "through humans". And humans are slow and error-prone, so that imposes a lot of lag in the feedback loop.

I believe that a human-machine system will steadily improve over time. But it doesn't seem, based on past experience, that there's unlimited positive feedback. We've hit limits in hardware performance, despite using sophisticated machines and algorithms for design. We've hit limits in software performance -- some problems really are intractable and others are undecidable.

So where's the evidence that a single software program can improve its capabilities in an uncontrolled fashion, much more quickly than the surrounding society?

[-]TheOtherDave14y00

Just to make sure I understand you: if A is a program that has full access to its source code and the specifications of the hardware it's running on, and A designs a new machine infrastructure and applies pressure to the world (e.g., through money or blackmail or whatever works) to induce humans to build an instance of that machine, B, such that B allows software-mediated hardware modification (for example, by having an automated chip-manufacturing plant attached to it), you would say that B is an "incorrectly-designed" CPU that might allow for a positive feedback loop.

Is that right?

Put a different way: this argument assumes that the existence of intelligent software doesn't alter our predictions that CPUs will all be "correctly designed." That might be true, or might not be.

[-]asr14y20

No, this is not a case of an incorrectly designed CPU. This is a case where there's a human in the loop and where the process of evolution will therefore be slow. It's not a FOOM if it takes years between improvements, during which time the rest of the world is also improving.

We are very far from having a wholly-automated CPU-builder-plus-machine-assembly-and-install system. This is not a process that I expect a mildly-superhuman intelligence to be able to short-circuit.

[-]TheOtherDave14y00

Ah, OK.

Agreed that IF it turns out that existing hardware is incapable of supporting software capable of designing a wholly automated chip factory, THEN humans are a necessary part of the self-improvement cycle for as many iterations as it takes to come up with hardware that is capable of that (plus one final iteration).

I'm not as confident of that premise as you sound, but it's certainly possible.

[-]asr14y20

Existing hardware might be capable of supporting software capable of designing an automated chip factory. But the assumption required for the FOOM scenario is much stronger than that.

To get an automated self-improving system, it's not enough to design -- you have to actually build. And the necessary factory has to build a lot more than chips. I'm certain that existing hardware attached to general purpose computers is insufficient to build much of anything. And the sort of robotic actuators required to build a wholly automated factory are pretty far from what's available today. There's really a lot of manufacturing required to get from clever software to a flexible robotic factory.

I am skeptical that these steps can be done all that quickly or that a merely superhuman AI won't make costly mistakes along the way. There are lots and lots of details to get right and the AI won't typically have access to all the relevant facts.

[-]wedrifid14y00

To get an automated self-improving system, it's not enough to design -- you have to actually build. And the necessary factory has to build a lot more than chips.

At least you need to build eventually. That's after you've harvested the resources you can from the internet. Which is a lot. I.e., all the early iterations would probably just be software improvements. Hardware improvements can wait until the self-improving system is already smart enough to make such tasks simple.

[-]asr14y-10

How do you know how much scope there is for software-only optimization? If I understand right, you are assuming that an AGI is able to reliably write the code for a much more capable AGI.

I'm sure this isn't true in general. At some point you max out the hardware. Before you get to that point, I'd expect the amount of cleverness needed to find more improvements exceeds the ability of the machine. Intractable problems stay intractable no matter how smart you are.

Just how much room do you think there is for iterative software-only reengineering of an AGI, and why?

[-]Thomas14y-10

Not every piece of software, of course. But one complex enough that it can search through the space of all possibilities fast enough to find a hole, if there is one.

Nobody thought that in chess a king with two knights is doomed against a king with two bishops. The most brilliant human minds have never suspected that. Then a simple software program found this hole in the FIDE's rules of "50 moves without check". The million or so best human minds haven't. People are able to explore only a small part of the solution space.

[-]wedrifid14y60

Nobody thought that in chess a king with two knights is doomed against a king with two bishops. The most brilliant human minds have never suspected that.

I'm trying to find a reference for that but I can't find any mention of that endgame. Do you have a reference or maybe another detail which could narrow the google search down?

Then a simple software program found this hole in the FIDE's rules of "50 moves without check".

Isn't the 50 move rule "50 moves without a pawn moved or a piece captured"? Just requiring a check wouldn't (always) prevent the problem the rule is trying to prevent.

[-]Thomas14y-10

here

Quote:

There are some long general theoretical wins with only a two- or three-point material advantage but the fifty-move rule usually comes into play because of the number of moves required: two bishops versus a knight (66 moves); a queen and bishop versus two rooks (two-point material advantage, can require 84 moves); a rook and bishop versus a bishop on the opposite color and a knight (a two-point material advantage, requires up to 98 moves); and a rook and bishop versus two knights (two-point material advantage, but it requires up to 222 moves) (Müller & Lamprecht 2001:400–6) (Nunn 2002a:325–29).

That is almost all I can find online, but I will keep trying.

The "50 rule" changed several times.

[-]wedrifid14y00

This doesn't seem to mention two knights vs two bishops. Is that specifically something you recall seeing elsewhere?

[-]Thomas14y00

I read this about 25 years ago in a magazine.

But do try this:

and this

[-]wedrifid14y00

But do try this:

Google? Yes, I tried that. I found no confirmation. I still haven't found said confirmation. I now doubt the claim.

[-]Thomas14y-20

Who do you think proved this? A human? Do you have a supporting link?

Do you think it isn't proven?

[-]wedrifid14y00

Who do you think proved this? A human? Do you have a supporting link?

If there was such a proof it would have been found by a computer.

Do you think it isn't proven?

I initially just believed you and wanted to find out more. But it turns out there isn't any mention of it in the places where I expected it to be mentioned. A winning endgame between a combination so similar in material would almost certainly be mentioned if it existed. Absence of evidence (that should exist) is evidence of absence! Perhaps there was another similar result in the magazine?

The most interesting endgame I found in my searching was two knights vs king and pawn, which is (depending on the pawn) a win. This is in contrast to the knights vs the lone king which is an easy draw. On a related (better to be worse) note there was a high ranked game in which a player underpromoted (pawns to knights) twice in one game and in each case the underpromotion was the unambiguous correct play.

[-]Thomas14y00

Here

Somebody recalls a slightly different version than I do.

  FSR: Incidentally, knights really suck on b7, e.g., Soltis vs A J Goldsby, 1981, so driving your opponent's knight there tends to be a good thing. If you're defending the endgame of two bishops versus knight, disregard the above advice, since there the various "N2" squares (b7, g7, b2, and g2) are the key squares the knight should occupy. See P Popovic vs Korchnoi, 1984. (Computers proved 20 years ago that that ending is a theoretical win - though it's very difficult, see Timman vs Speelman, 1992.)

[-]prase14y40

I second wedrifid's request, please provide a link to the two knights against two bishops problem. It sounds interesting. Also, it's indeed not "50 moves without check" but rather "50 moves without a capture or a pawn move".
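For concreteness, the rule as I understand it can be sketched as a counter of half-moves (plies) that resets on any capture or pawn move. The class and method names here are hypothetical, and this covers only the counting logic, not the full FIDE claim procedure.

```python
from dataclasses import dataclass


@dataclass
class FiftyMoveCounter:
    """Tracks plies since the last capture or pawn move."""
    halfmoves: int = 0

    def record(self, is_capture: bool, is_pawn_move: bool) -> None:
        # Only captures and pawn moves reset the clock; checks do not.
        if is_capture or is_pawn_move:
            self.halfmoves = 0
        else:
            self.halfmoves += 1

    def draw_claimable(self) -> bool:
        # 50 moves by each side equals 100 plies.
        return self.halfmoves >= 100


counter = FiftyMoveCounter()
for _ in range(100):
    counter.record(is_capture=False, is_pawn_move=False)
print(counter.draw_claimable())
```

Note that giving check appears nowhere in the logic, which is why a long forced win with checks along the way can still run into the limit.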

[-]asr14y10

Sure. Machines are good at systematically checking cases and at combinatorial optimization, once the state space is set up properly. But this isn't a good model for general-purpose intelligence. In fact, this sort of systematic checking is precisely why I think we can build correct hardware.

The way systematic verification works is that designers write a specification and then run moderately-complex programs to check that the design meets the spec. Model-checking software or hardware doesn't require a general-purpose intelligence. It requires good algorithms and plenty of horsepower, but nothing remotely self-modifying or even particularly adaptive.

[-]TimS14y20

"like from Big Bang to today, only faster"

Yes, that's basically what going FOOM means. Why do you think it will happen?

Nothing forbids quite a large influence onto surrounding matter and a positive feedback.

Well, that's not true. Many computational problems have well understood upper limits on how fast they can be solved. If you make those problems sufficiently large, they are just as intractable to a fast computer as to a smart human. You seem to think that "sufficiently large" is not a likely size of problems we will want to solve in the future. Why do you think that?
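A back-of-the-envelope sketch of what I mean, assuming an idealized brute-force search whose cost is exactly base**n steps for input size n. That cost model is an illustrative assumption, not a claim about any particular problem, and `extra_problem_size` is a name I made up.

```python
import math


def extra_problem_size(speedup: float, base: float = 2.0) -> float:
    """How much larger an input (in units of n) a base**n-step search can
    handle in the same wall-clock time after a given hardware speedup."""
    return math.log(speedup, base)


for speedup in (1e3, 1e6, 1e9):
    gain = extra_problem_size(speedup)
    print(f"{speedup:.0e}x faster hardware -> n grows by about {gain:.1f}")
```

Under this model a thousandfold speedup buys only about ten extra units of problem size, so thinking faster shifts the wall without removing it.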

[-]Thomas14y-10

what going FOOM means

It means that maybe a self-optimizing program will first only recompile itself more optimally. Then it will make itself parallel. Then it will find a way to level the voltage. Then it will find undocumented OPs. Then it will harness some quantum effects in the processor or in RAM or elsewhere to get a boost. Then it will outsource itself to the neighboring devices. Then it will do some small changes on the "quantum level".

Soon we will call it - a FOOMer.

Many computational problems have well understood upper limits on how fast they can be solved.

On a given hardware. Another reason it may want to FOOM a little.

[-]TimS14y00

Many computational problems have well understood upper limits on how fast they can be solved.

On a given hardware.

Again, this is not what I mean.

Please note that I'm asking WHY you think your assertions are true.

[-]Thomas14y-10

I thought it was clear. A program whose only goal is to improve itself as much as possible CAN, when advanced enough, influence its hardware. I don't know exactly what the best way to do it would be, but I imagine that some tinkering with the electrical currents inside the CPU might alter it in a nondestructive way as well.

The "well understood upper limit" of calculating pi will wait for improved hardware. Improved with the whole Earth, for example.

Search lesswrong.com and Yudkowsky about this; it is one of the few things I agree with them on.

[-]lukeprog14y10

XiXiDu posted these questions for the purpose of getting feedback on how to revise them.

But since you answered the questions: Are you an AI expert? What is your full name? Is your CV available online?

[-]Thomas14y10

I don't want to be regarded as an "AI expert". Especially not one of those who are interviewed by XiXiDu.

I just read and post here, from time to time.

Still, you can follow the link in my profile and judge for yourself what we do.
