https://slatestarcodex.com/2019/08/27/book-review-reframing-superintelligence/
Drexler asks: what if future AI looks a lot like current AI, but better?
For example, take Google Translate. A future superintelligent Google Translate would be able to translate texts faster and better than any human translator, capturing subtleties of language beyond what even a native speaker could pick up. It might be able to understand hundreds of languages, handle complicated multilingual puns with ease, do all sorts of amazing things. But in the end, it would just be a translation app. It wouldn’t want to take over the world. It wouldn’t even “want” to become better at translating than it was already. It would just translate stuff really well.
...
In this future, our AI technology would have taken the same path as our physical technology. The human body can run fast, lift weights, and fight off enemies. But the automobile, crane, and gun are three different machines. Evolution had to cram running-ability, lifting-ability, and fighting-ability into the same body, but humans had more options and were able to do better by separating them out. In the same way, evolution had to cram book-writing, technology-inventing, and strategic-planning into the same kind of intelligence – an intelligence that also has associated goals and drives. But humans don’t have to do that, and we probably won’t. We’re not doing it today in 2019, when Google Translate and AlphaGo are two different AIs; there’s no reason to write a single AI that both translates languages and plays Go. And we probably won’t do it in the superintelligent future either. Any assumption that we will is based more on anthropomorphism than on a true understanding of intelligence.
These superintelligent services would be safer than general-purpose superintelligent agents. General-purpose superintelligent agents (from here on: agents) would need a human-like structure of goals and desires to operate independently in the world; Bostrom has explained ways this is likely to go wrong. AI services would just sit around algorithmically mapping inputs to outputs in a specific domain.
A takeaway:
I think Drexler’s basic insight is that Bostromian agents need to be really different from our current paradigm to do any of the things Bostrom predicts. A paperclip maximizer built on current technology would have to eat gigabytes of training data about various ways people have tried to get paperclips in the past so it can build a model that lets it predict what works. It would build the model on its actually-existing hardware (not an agent that could adapt to much better hardware or change its hardware whenever convenient). The model would have a superintelligent understanding of the principles that had guided some things to succeed or fail in the training data, but wouldn’t be able to go far beyond them into completely new out-of-the-box strategies. It would then output some of those plans to a human, who would look them over and make paperclips 10% more effectively.
The very fact that this is less effective than the Bostromian agent suggests there will be pressure to build the Bostromian agent eventually (Drexler disagrees with this, but I don’t understand why). But this will be a very different project from AI the way it currently exists, and if AI the way it currently exists can be extended all the way to superintelligence, that would give us a way to deal with hostile superintelligences in the future.
"Ten years ago, everyone was talking about superintelligence, the singularity, the robot apocalypse. What happened?"
What is this referencing? I was only 10 years old in 2009 but I have a strong impression that AI risk gets a lot more attention now than it did then.
Also, what are the most salient differences between CAIS and the cluster of concepts Karnofsky and others were calling "Tool AI"?
It might also be worth comparing CAIS and "tool AI" to Paul Christiano's IDA and the desiderata MIRI tends to talk about (task-directed AGI [1,2,3], mild optimization, limited AGI).
At a high level, I tend to think of Christiano and Drexler as both approaching alignment from very much the right angle, in that they're (a) trying to break apart the vague idea of "AGI reasoning" into smaller parts, and (b) shooting for a system that won't optimize harder (or more domain-generally) than we need for a given task. From conversat...