I am emailing experts in order to raise and gauge academic awareness and perception of risks from AI. Below are some of the questions I am going to ask. Please help refine the questions or suggest new and better ones.
(Thanks go to paulfchristiano, Steve Rayhawk and Manfred.)
Q1: Assuming beneficial political and economic development, and that no global catastrophe halts progress, by what year would you assign a 10%/50%/90% chance of the development of artificial intelligence that is roughly as good as humans at science, mathematics, engineering and programming?
Q2: Once we build AI that is roughly as good as humans at science, mathematics, engineering and programming, how much more difficult will it be for humans and/or AIs to build an AI which is substantially better at those activities than humans?
Q3: Do you ever expect artificial intelligence to overwhelmingly outperform humans at typical academic research, in the way that it may soon overwhelmingly outperform them at trivia contests, or do you expect that humans will always play an important role in scientific progress?
Q4: What probability do you assign to the possibility that an AI with initially (professional) human-level competence at general reasoning (including science, mathematics, engineering and programming) will self-modify its way up to vastly superhuman capabilities within a matter of hours / days / less than 5 years?
Q5: How important is it to figure out how to make superhuman AI provably friendly to us and our values (non-dangerous), before attempting to build AI that is good enough at general reasoning (including science, mathematics, engineering and programming) to undergo radical self-modification?
Q6: What probability do you assign to the possibility of human extinction as a result of AI capable of self-modification (that is not provably non-dangerous, if that is even possible)?
Yes, there are things that forbid this. Typically when we design a CPU, one of the design requirements is that no sequence of instructions can alter the hardware in irreversible ways. A reset should really put it back to a consistent state. Yes, it's possible that the hardware has the potential for unexpected alteration from software, but I wouldn't bet on that as a magic capability without real evidence. It takes a lot of energy to alter silicon and digital logic circuits just don't have that kind of power.
So, given a correctly-designed CPU, any positive-feedback loop here has to go off-chip, which usually means "through humans". And humans are slow and error-prone, so that imposes a lot of lag in the feedback loop.
I believe that a human-machine system will steadily improve over time. But it doesn't seem, based on past experience, that there's unlimited positive feedback. We've hit limits in hardware performance, despite using sophisticated machines and algorithms for design. We've hit limits in software performance -- some problems really are intractable and others are undecidable.
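The "some problems really are intractable" point can be made concrete with a toy example (a hypothetical sketch, not part of the original exchange): a brute-force subset-sum search must examine up to 2**n candidate subsets, so each added element doubles the worst-case work regardless of how fast the underlying hardware or its designers improve.

```python
from itertools import combinations

def subset_sum_bruteforce(nums, target):
    """Check every subset of `nums` for one summing to `target`.

    There are 2**len(nums) candidate subsets, so the worst-case
    work doubles with each additional element -- an exponential
    wall that better hardware only pushes back linearly.
    """
    checked = 0
    for r in range(len(nums) + 1):
        for combo in combinations(nums, r):
            checked += 1
            if sum(combo) == target:
                return True, checked
    return False, checked

found, work = subset_sum_bruteforce([3, 34, 4, 12, 5, 2], 9)
# found is True (4 + 5 = 9); a failed search over n elements
# inspects all 2**n subsets, e.g. 2**6 = 64 here.
```

This is only an illustration of the scaling argument; it says nothing about clever algorithms for particular problem families, just that exhaustive search hits an exponential wall.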
So where's the evidence that a single software program can improve its capabilities in an uncontrolled fashion, much more quickly than the surrounding society?
Just to make sure I understand you: if A is a program that has full access to its source code and the specifications of the hardware it's running on, and A designs a new machine infrastructure and applies pressure to the world (e.g., through money or blackmail or whatever works) to induce humans to build an instance of that machine, B, such that B allows software-mediated hardware modification (for example, by having an automated chip-manufacturing plant attached to it), you would say that B is an "incorrectly-designed" CPU that might allow for a...