[Template] Questions regarding possible risks from artificial intelligence
I am emailing experts in order to raise academic awareness of risks from AI and to estimate how those risks are currently perceived. Below are some questions I am going to ask. Please help to refine the questions or suggest new and better ones.
(Thanks go to paulfchristiano, Steve Rayhawk and Manfred.)
Q1: Assuming beneficially political and economic development and that no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of artificial intelligence that is roughly as good as humans at science, mathematics, engineering and programming?
Q2: Once we build AI that is roughly as good as humans at science, mathematics, engineering and programming, how much more difficult will it be for humans and/or AIs to build an AI which is substantially better at those activities than humans?
Q3: Do you ever expect artificial intelligence to overwhelmingly outperform humans at typical academic research, in the way that it may soon overwhelmingly outperform humans at trivia contests, or do you expect that humans will always play an important role in scientific progress?
Q4: What probability do you assign to the possibility that an AI with initially (professional) human-level competence at general reasoning (including science, mathematics, engineering and programming) will self-modify its way up to vastly superhuman capabilities within a matter of hours/days/< 5 years?
Q5: How important is it to figure out how to make superhuman AI provably friendly to us and our values (non-dangerous), before attempting to build AI that is good enough at general reasoning (including science, mathematics, engineering and programming) to undergo radical self-modification?
Q6: What probability do you assign to the possibility of human extinction as a result of AI capable of self-modification (that is not provably non-dangerous, if that is even possible)?
Comments (48)
You get a lot of "human level - WTF" comments.
To avoid those, perhaps you could say what you actually mean:
More than "100" on IQ tests, pass the Turing test - or whatever.
IQ tests seem to be tests of how well humans can do things that you would already expect a computer to be better at! The difficult part seems to be parsing the question and translating it from natural language into a format the computer can tackle. No mean feat but not one requiring general intelligence! I'm not entirely sure it would be a more difficult task than having an everyday conversation at the level of a 70 IQ human. (This isn't to say that 'pass for human' is at all equivalent to 'human level' either.)
"About as good as an average intelligence but well trained human is at doing scientific research" seems to be approximately what 'human level' is getting at.
Maybe. Machines can outperform humans in some parts of IQ tests.
...but they don't get good scores overall yet.
An IQ 100 machine would be quite something. An IQ 150 machine would be even more interesting.
What I would put as question 1 (with three parts):
(a) What does the (concept/phrase) of "human-level machine intelligence" mean to you? (b) What forms of machine intelligence are you most optimistic about? (c) What forms do you think could be the most dangerous?
Rationale: it seems to me that the most useful part of Nilsson's response was his alternate definition of human-level intelligence. Moving AI experts from the ridiculous mode of "what probability do you place on Terminator occurring?" to the serious mode of "what could go wrong with a potential design?" both signals your seriousness as a thinker and primes them to take AI risks seriously, since they came up with the doomsday scenario. It also seems like getting a sense of what direction AI experts think AI will take is useful: if experts are optimistic about machine intelligence hardware design, then FOOMing is more likely. (It might even be useful to ask about areas they're pessimistic about, since that's a different question than danger, but four questions for the first question seems like a bit much.)
Drawback: what you're interested in is cross-domain optimization competence. If people give you numbers based on when machine intelligence will be able to pass a Turing test, those numbers will be mostly useless. Even the numbers Nilsson gave for his 'employable AI' are difficult to compare to numbers other people are giving. But it seems to me that knowing better what they mean is more important than easy comparisons.
Overall, I feel a bit better about lukeprog's rewrite than I do about the original. I do think at least one question about AI risk countermeasures should be preserved; probably something like this:
Q4. What is the ideal level of awareness and funding for AI risk reduction?
Possibly with a clarification that they can either give a dollar number or just compare it to some other cause (like a particular variety of cancer, other existential risks, etc.).
The whole of question 3 seems problematic to me.
Concerning parts (a) and (b), I doubt that researchers will know what you have in mind by "provably friendly." For that matter, I myself don't know what you have in mind by "provably friendly," despite having read a number of relevant posts on Less Wrong.
Concerning part (c), I doubt that experts are thinking in terms of the money needed to possibly mitigate AI risks at all; presumably, in most cases, if they saw this as a high-priority and tractable issue they would have written about it already.
Not only that, 3b seems to presuppose that the only dangerous AI is a recursively self-improving one.
My preferred rewrite, without spending too much time on it:
Q1a: Assuming no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of human-level machine intelligence? Feel free to answer 'never' if you believe such a milestone will never be reached. Reason: this matches question #1 of FHI's machine intelligence survey.
Q1b: Once we build AIs that are as skilled at technology design and general reasoning as humans are, how much more difficult will it be for humans and/or AIs to build an AI which is substantially better than humans at technology design and general reasoning?
Q2a: Do you ever expect AIs to overwhelmingly outperform humans at typical academic research, in the way that they may soon overwhelmingly outperform humans at trivia contests, or do you expect that humans will always play an important role in scientific progress?
Q2b: [delete to make questions list less dauntingly long, and increase response rate]
Q2c: What probability do you assign to the possibility that an AI with initially (professional) human-level competence at technology design and general reasoning will use its capacities to self-modify its way up to vastly superhuman general capabilities within a matter of hours/days/< 5 years? ('Self-modification' may include the first AI creating improved child AIs, which create further-improved child AIs, etc.)
Q3a: How important is it to figure out how to make superhuman AI provably friendly to us and our values (non-dangerous), before attempting to build AI that is good enough at technology design and general reasoning to undergo radical self-modification?
Q3b: What probability do you assign to the possibility of human extinction as a result of AI capable of self-modification (that is not provably non-dangerous, if that is even possible)?
Q3c: [delete to reduce length of questions list]
Q4: [delete to reduce length of questions list]
Q5: [delete to reduce length of questions list; AI experts are unlikely to be experts on other x-risks]
Q6: [delete to reduce length of questions list; I haven't seen, and don't anticipate, useful answers here]
Q7: [delete to reduce length of questions list]
I endorse the "question deletion" idea.
Are these two expressions supposed (or assumed) to be equivalent?
I updated the original post. Maybe we could agree on those questions. Be back tomorrow.
I stand by my preferred rewrites above, but of course it's up to you.
I agree with deleting Q5 and Q6, not only because I would not expect useful responses but also because they may come off as "extremist" if any respondents are not already familiar with UFAI concepts (or if they are familiar and overtly dismissive of them).
Q6 is confusing. Are you asking for P(human extinction by AI that is capable of self-modification and not provably non-dangerous), P(human extinction by AI | AI capable of self-modification and not provably non-dangerous is created), or P(human extinction by first such AI | AI capable of self-modification and not provably non-dangerous is created)?
I think you should replace "superhuman AI" with something like "a singular AI capable of having a catastrophic global impact." Anything that isn't sourced from nerd culture, basically. I also preferred "provably non-dangerous" to "provably friendly."
I am not sure whether anyone thinks that is true. If you look at the claims by E. Yudkowsky they typically say something like:
Yudkowsky appears to be hedging his bets on when this is going to happen by saying "at some point". There's not much sign of anything like "initially (professional) human-level competence".
Does anyone believe such a thing will happen, then? At first glance the claim makes little sense: we already know how fast progress goes with agents of "professional human-level competence" knocking around, and it just isn't that fast.
Q2 is too nebulous. What do you mean by "how much more difficult"? How do you measure "difficulty"?
Q5 glosses over the main problem: we don't know what "our values" even are. There's wide disagreement on this topic among practically all communities.
Q6 is not entirely clear on whether you're asking for cumulative probability, or a single random variable. You also do not define what "extinction" is.
Q1: I think 'beneficially' should be 'beneficial'.
You might want to include some context, especially about why you think AIs might pose a threat. Yes, there are reasons not to do this, but some people seem not to have considered the issue at all, or immediately jump to sci-fi tropes involving robots with human-like desires for power/revenge/...
The main concern I would have here is emailing a busy expert to say "here's a bunch of background material you may or may not be familiar with, please read it then answer some questions" seems like a poor way to get responses.