I agree that a write-up of SIAI's argument for the Scary Idea, in the manner you describe, would be quite interesting to see.
However, I strongly suspect that when the argument is laid out formally, what we'll find is that
-- given our current knowledge about the pdfs of the premises in the argument, the pdf on the conclusion is very, very broad, i.e. we can hardly conclude anything with much confidence ...
So, I think that the formalization will lead to the conclusion that
-- "we can NOT confidently say, now, that: Building advanced AGI without a provably Friendly design will almost certainly lead to bad consequences for humanity"
-- "we can also NOT confidently say, now, that: Building advanced AGI without a provably Friendly design will almost certainly NOT lead to bad consequences for humanity"
I.e., I strongly suspect the formalization
-- will NOT support the Scary Idea
-- will also not support complacency about AGI safety and AGI existential risk
I think the conclusion of the formalization exercise, if it's conducted, will basically be to reaffirm common sense, rather than to bolster extreme views like the Scary Idea....
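The point about broad pdfs can be illustrated with a toy Monte Carlo sketch. Everything in it is a made-up assumption for illustration, not SIAI's actual argument: it supposes three independent premises that must all hold, gives each premise probability a Beta prior, and takes the conclusion's probability as their product. The names conclusion_samples and interval_90 are hypothetical helpers. The two runs use priors with the same mean (2/3) but very different spreads, to show how vagueness in the premises carries through to the conclusion.

```python
import random

def conclusion_samples(premise_params, n_samples=20000, seed=0):
    """Sample the probability of a conclusion that needs every premise to hold.

    premise_params: list of (alpha, beta) pairs, one Beta prior per premise.
    ASSUMPTION: premises are independent, so their probabilities multiply.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        p = 1.0
        for alpha, beta in premise_params:
            p *= rng.betavariate(alpha, beta)
        samples.append(p)
    return samples

def interval_90(samples):
    """Return the empirical 5th and 95th percentiles of the samples."""
    ordered = sorted(samples)
    n = len(ordered)
    return ordered[int(0.05 * n)], ordered[int(0.95 * n)]

# Hypothetical priors: same mean (2/3), very different certainty.
vague = conclusion_samples([(2, 1)] * 3)       # we know little about each premise
sharp = conclusion_samples([(200, 100)] * 3)   # we know each premise well

vague_lo, vague_hi = interval_90(vague)
sharp_lo, sharp_hi = interval_90(sharp)
print("vague premises: 90% interval", vague_lo, vague_hi)
print("sharp premises: 90% interval", sharp_lo, sharp_hi)
```

Under these assumptions the interval from the vague priors comes out several times wider than the one from the sharp priors, even though both sets of premises have the same mean. The numbers themselves mean nothing; only the contrast does.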
-- Ben Goertzel
Actually, you can spell out the argument very briefly. Most people, however, will immediately reject one or more of the premises due to cognitive biases that are hard to overcome.
A brief summary:
Any AI that's at least as smart as a human and is capable of self-improving will improve itself if that will help its goals.
The preceding statement applies recursively: the newly-improved AI, if it can improve itself and expects that such improvement will help its goals, will continue to do so.
At minimum, this means any AI as smart as a human can be expected to become MUCH smarter than human beings -- probably smarter than all of the smartest minds the entire human race has ever produced, combined, without even breaking a sweat.
INTERLUDE: This point, by the way, is where people's intuition usually begins rebelling, either due to our brains' excessive confidence in themselves, or because we've seen too many stories in which some indefinable "human" characteristic is still somehow superior to the cold, unfeeling, uncreative Machine... i.e., we don't understand just how our intuition and creativity are actually cheap hacks to work around our relatively low processing power -- dumb brute force is already "smarter" than human beings in any narrow domain (see Deep Blue, evolutionary algorithms for antenna design, Emily Howell, etc.), and a human-level AGI can reasonably be assumed capable of programming up narrow-domain brute forcers for any given narrow domain.
And it doesn't even have to be that narrow or brute: it could build specialized Eurisko-like solvers, and manage them at least as intelligently as Lenat did to win the Traveller tournaments.
In short, human beings have a vastly inflated opinion of themselves, relative to AI. An AI only has to be as smart as a good human programmer (while running at a higher clock speed than a human) and have access to lots of raw computing resources, in order to be capable of out-thinking the best human beings.
And that's only one possible way to get to ridiculously superhuman intelligence levels... and it doesn't require superhuman insights for an AI to achieve, just human-level intelligence and lots of processing power.
The people who reject the FAI argument are the people who, for whatever reason, can't get themselves to believe that a machine can go from being as smart as a human, to massively smarter in a short amount of time, or who can't accept the logical consequences of combining that idea with a few additional premises, like:
It's hard to predict the behavior of something smarter than you.
Actually, it's hard to predict the behavior of something different from you: human beings do very badly at guessing what other people are thinking, intending, or are capable of doing, despite the fact that we're incredibly similar to each other.
AIs, however, will be much smarter than humans, and therefore very "different", even if they are otherwise exact replicas of humans (e.g. "ems").
Greater intelligence can be translated into greater power to manipulate the physical world, through a variety of possible means: manipulating humans to do your bidding, coming up with new technologies, or just being more efficient at resource exploitation... or something we haven't thought of. (Note that pointing out weaknesses in individual pathways here doesn't kill the argument: there is more than one pathway, so you'd need a general reason why more intelligence doesn't ever equal more power. Humans seem like a counterexample to any such general reason, though.)
You can't control what you can't predict, and what you can't control is potentially dangerous. If there's something you can't control, and it's vastly more powerful than you, you'd better make sure it gives a damn about you. Ants get stepped on, because most of us don't care very much about ants.
Note, by the way, that this means that indifference alone is deadly. An AI doesn't have to want to kill us, it just has to be too busy thinking about something else to notice when it tramples us underfoot.
This is another inferential step that is dreadfully counterintuitive: it seems to our brains that of course an AI would notice, of course it would care... what's more important than human beings, after all?
But that happens only because our brains are projecting themselves onto the AI -- seeing the AI thought process as though it were a human. Yet, the AI only cares about what it's programmed to care about, explicitly or implicitly. Humans, OTOH, care about a ton of individual different things (the LW "a thousand shards of desire" concept), which we like to think can be summarized in a few grand principles.
But being able to summarize the principles is not the same thing as making the individual cares ("shards") be derivable from the general principle. That would be like saying that you could take Aristotle's list of what great drama should be, and then throw it into a computer and have the computer write a bunch of plays that people would like!
To put it another way, the sort of principles we like to use to summarize our thousand shards are just placeholders and organizers for our mental categories -- they are not the actual things we care about... and unless we put those actual things into an AI, we will end up with an alien superbeing that may inadvertently wipe out things we care about, while it's busy trying to do whatever else we told it to do... as indifferently as we step on bugs when we're busy with something more important to us.
So, to summarize: the arguments are not that complex. What's complex is getting people past the part where their intuition reflexively rejects both the premises and the conclusions, and tells their logical brains to make up reasons to justify the rejection, post hoc, or to look for details to poke holes in, so that they can avoid looking at the overall thrust of the argument.
While my summation here of the anti-Foom position is somewhat unkindly phrased, I have to assume that it is the truth, because none of the anti-Foomers ever seem to actually address any of the pro-Foomer arguments or premises. AFAICT (and I am not associated with SIAI in any way, btw, I just wandered in here off the internet, and was around for the earliest Foom debates on OvercomingBias.com), the anti-Foom arguments always seem to consist of finding ways to never really look too closely at the pro-Foom arguments at all, and instead making up alternative arguments that can be dismissed or made fun of, or arguing that things shouldn't be that way, and therefore the premises should be changed.
That was a pretty big convincer for me that the pro-Foom argument was worth looking more into, as the anti-Foom arguments seem to generally boil down to "la la la I can't hear you".
From Ben Goertzel,
At the second Singularity Summit, I heard this same sentiment from Ben, Robin Hanson, and Rodney Brooks; from Cynthia Breazeal (at the third Singularity Summit); from Ron Arkin (at the "Human Being in an Inhuman Age" conference at Bard College on Oct 22nd ¹); and from almost every professor I have had (or will have for the next two years).
It was a combination of Ben, Robin, and several professors at Berkeley and UCSD that led me to the conclusion that we probably won't know how dangerous an AGI is until we have put a lot more time into building AI (or CI) systems that will reveal more about the problems they attempt to address. (CGI, for Constructed General Intelligence, is a term I have heard more than one person use in the last year instead of AI/AGI. They prefer it because the word "Artificial" seems to imply that the intelligence is not real, while "Constructed" is far more accurate.)
Sort of like how the Wright Brothers didn't really learn how they needed to approach building an airplane until they began to build airplanes. The final Wright Flyer didn't just leap out of a box. It is not likely that an AI will just leap out of a box either (whether it is being built at a huge Corporate or University lab, or in someone's home lab).
Also, it is possible that AI may come in the form of a sub-symbolic system which is so opaque that even it won't be able to easily tell what can or cannot be optimized.
Ron Arkin (from Georgia Tech) discussed this briefly at the conference at Bard College I mentioned.
MB
¹ I should really write up something about that conference here. I was shocked at how many highly educated people so completely missed the point, and became caught up in something that makes The Scary Idea seem positively benign in comparison.