People hear about Friendly AI and say - this is one of the top three initial reactions:
"Oh, you can try to tell the AI to be Friendly, but if the AI can modify its own source code, it'll just remove any constraints you try to place on it."
And where does that decision come from?
Does it enter from outside causality, rather than being an effect of a lawful chain of causes which started with the source code as originally written? Is the AI the author of its own free will?
A Friendly AI is not a selfish AI constrained by a special extra conscience module that overrides the AI's natural impulses and tells it what to do. You just build the conscience, and that is the AI. If you have a program that computes which decision the AI should make, you're done. The buck stops immediately.
At this point, I shall take a moment to quote some case studies from the Computer Stupidities site and Programming subtopic. (I am not linking to this, because it is a fearsome time-trap; you can Google if you dare.)
I tutored college students who were taking a computer programming course. A few of them didn't understand that computers are not sentient. More than one person used comments in their Pascal programs to put detailed explanations such as, "Now I need you to put these letters on the screen." I asked one of them what the deal was with those comments. The reply: "How else is the computer going to understand what I want it to do?" Apparently they would assume that since they couldn't make sense of Pascal, neither could the computer.
While in college, I used to tutor in the school's math lab. A student came in because his BASIC program would not run. He was taking a beginner course, and his assignment was to write a program that would calculate the recipe for oatmeal cookies, depending upon the number of people you're baking for. I looked at his program, and it went something like this:
10 Preheat oven to 350
20 Combine all ingredients in a large mixing bowl
30 Mix until smooth
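What the assignment actually asked for, of course, was a computation. A minimal sketch in Python (the base recipe and ingredient amounts here are invented for illustration, not taken from the anecdote):

```python
# Hypothetical base recipe for 4 people; scale linearly for other counts.
BASE_SERVINGS = 4
BASE_RECIPE = {  # ingredient -> amount for BASE_SERVINGS people
    "oats (cups)": 3.0,
    "flour (cups)": 1.0,
    "butter (cups)": 0.5,
    "sugar (cups)": 0.75,
}

def scaled_recipe(people: int) -> dict:
    """Return ingredient amounts scaled to `people` servings."""
    factor = people / BASE_SERVINGS
    return {name: amount * factor for name, amount in BASE_RECIPE.items()}

for ingredient, amount in scaled_recipe(8).items():
    print(f"{ingredient}: {amount}")
```

The computer will happily multiply numbers for you; it will not preheat the oven, and it will not infer that "combine all ingredients" was supposed to mean arithmetic.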
An introductory programming student once asked me to look at his program and figure out why it was always churning out zeroes as the result of a simple computation. I looked at the program, and it was pretty obvious:
begin
read("Number of Apples", apples)
read("Number of Carrots", carrots)
read("Price for 1 Apple", a_price)
read("Price for 1 Carrot", c_price)
write("Total for Apples", a_total)
write("Total for Carrots", c_total)
write("Total", total)
total = a_total + c_total
a_total = apples * a_price
c_total = carrots * c_price
end
Me: "Well, your program can't print correct results before they're computed."
Him: "Huh? It's logical what the right solution is, and the computer should reorder the instructions the right way."
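The repair is purely a matter of ordering: the machine executes statements in sequence, so the totals must be computed after the inputs exist and before they are printed. A corrected rendering of the student's program, transcribed into Python:

```python
def grocery_total(apples, a_price, carrots, c_price):
    # Compute first...
    a_total = apples * a_price
    c_total = carrots * c_price
    total = a_total + c_total
    # ...then report. Moving these lines below the computation
    # is the entire fix.
    print("Total for Apples:", a_total)
    print("Total for Carrots:", c_total)
    print("Total:", total)
    return total

grocery_total(3, 0.50, 4, 0.25)
```

No step here is beyond a beginner; the student's mistake was expecting the interpreter to see which ordering was "logical" and apply it on his behalf.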
There's an instinctive way of imagining the scenario of "programming an AI". It maps onto a similar-seeming human endeavor: Telling a human being what to do. Like the "program" is giving instructions to a little ghost that sits inside the machine, which will look over your instructions and decide whether it likes them or not.
There is no ghost who looks over the instructions and decides how to follow them. The program is the AI.
That doesn't mean the program does anything you wish for, like a genie. It doesn't mean the program does everything you want the way you want it, like a slave of exceeding docility. It means the program is the only ghost that's there, at least at boot time.
AI is much harder than people instinctively imagined, exactly because you can't just tell the ghost what to do. You have to build the ghost from scratch, and everything that seems obvious to you, the ghost will not see unless you know how to make the ghost see it. You can't just tell the ghost to see it. You have to create that-which-sees from scratch.
If you don't know how to build something that seems to have some strange ineffable elements like, say, "decision-making", then you can't just shrug your shoulders and let the ghost's free will do the job. You're left forlorn and ghostless.
There's more to building a chess-playing program than building a really fast processor - so the AI will be really smart - and then typing at the command prompt "Make whatever chess moves you think are best." You might think that, since the programmers themselves are not very good chess-players, any advice they tried to give the electronic superbrain would just slow the ghost down. But there is no ghost. You see the problem.
And there isn't a simple spell you can perform to - poof! - summon a complete ghost into the machine. You can't say, "I summoned the ghost, and it appeared; that's cause and effect for you." (It doesn't work if you use the notion of "emergence" or "complexity" as a substitute for "summon", either.) You can't give an instruction to the CPU, "Be a good chessplayer!" You have to see inside the mystery of chess-playing thoughts, and structure the whole ghost from scratch.
No matter how common-sensical, no matter how logical, no matter how "obvious" or "right" or "self-evident" or "intelligent" something seems to you, it will not happen inside the ghost, unless it happens at the end of a chain of cause and effect that began with the instructions that you had to decide on, plus any causal dependencies on sensory data that you built into the starting instructions.
This doesn't mean you program in every decision explicitly. Deep Blue was a far better chessplayer than its programmers. Deep Blue made better chess moves than anything its makers could have explicitly programmed - but not because the programmers shrugged and left it up to the ghost. Deep Blue moved better than its programmers... at the end of a chain of cause and effect that began in the programmers' code and proceeded lawfully from there. Nothing happened just because it was so obviously a good move that Deep Blue's ghostly free will took over, without the code and its lawful consequences being involved.
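The same pattern can be shown in miniature. In the toy game of Nim (take 1 to 3 stones per turn; whoever takes the last stone wins), a plain exhaustive game-tree search - a sketch of my own, not Deep Blue's actual algorithm - plays perfectly, even though the classic winning rule (never leave a multiple of four) appears nowhere in the code. It emerges lawfully from the search the programmer did write:

```python
from functools import lru_cache

@lru_cache(maxsize=None)
def can_win(pile: int) -> bool:
    """True if the player to move can force a win from `pile` stones."""
    # You can force a win iff some legal move leaves the opponent
    # in a position from which *they* cannot force a win.
    return any(not can_win(pile - take)
               for take in (1, 2, 3) if take <= pile)

def best_move(pile: int) -> int:
    """Pick a winning move if one exists, else take one stone."""
    for take in (1, 2, 3):
        if take <= pile and not can_win(pile - take):
            return take
    return 1

# The search rediscovers the rule no one typed in: positions that are
# multiples of 4 are losses for the player to move.
print([p for p in range(1, 13) if not can_win(p)])  # → [4, 8, 12]
```

The program "knows" something its author never stated explicitly - but every bit of that knowledge is the lawful consequence of the recursion the author did state.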
If you try to wash your hands of constraining the AI, you aren't left with a free ghost like an emancipated slave. You are left with a heap of sand that no one has purified into silicon, shaped into a CPU and programmed to think.
Go ahead, try telling a computer chip "Do whatever you want!" See what happens? Nothing. Because you haven't constrained it to understand freedom.
All it takes is one single step that is so obvious, so logical, so self-evident that your mind just skips right over it, and you've left the path of the AI programmer. It takes an effort like the one I showed in Grasping Slippery Things to prevent your mind from doing this.
Ben, you could be right that my "world is too fuzzy" view is just mind projection, but let me at least explain what I am projecting. The most natural way to get "unlimited" control over matter is a pure reductionist program in which a formal mathematical logic can represent designs and causal relationships with perfect accuracy (perfect to the limits of quantum probabilities). Unfortunately, combinatorial explosion makes that impractical. What we can actually do instead is redescribe collections of matter in new terms. Sometimes these are neatly linked to the underlying physics and we get cool stuff like f=ma but more often the redescriptions are leaky but useful "concepts". The fact that we have to leak accuracy (usually to the point where definitions themselves are basically impossible) to make dealing with the world tractable is what I mean by "the world is too fuzzy to support much intelligent manipulation". In certain special cases we come up with clever ways to bound probabilities and produce technological wonders... but transhumanist fantasies usually make the leap to assume that all things we desire can be tamed in this way. I think this is a wild leap. I realize most futurists see this as unwarranted pessimism and that the default position is that anything imaginable that doesn't provably violate the core laws of physics only awaits something smart enough to build it.
My other reasons for doubting the ultimate capabilities of RSI probably don't need more explanation. My skepticism about the imminence of RSI as a threat (never mind the overall ability of RSI itself) is more based on the ideas that 1) The world is really damn complicated and it will take a really damn complicated computer to make sense of it (the vast human data sorting machinery is well beyond Roadrunner and is not that capable anyway), and 2) there is still no beginning of a credible theory of how to make sense of a really damn complicated world with software.
I agree it is "very dangerous" to put a low probability on any particular threat being an imminent concern. Many such threats exist and we make this very dangerous tentative conclusion every day... from cancer in our own bodies to bioterror to the possibility that our universe is a simulation designed to measure how long it takes us to find the mass of the Higgs, after which we will be shut off.
That is all just an aside though to my main point, which was that if I'm wrong, the only reasonable option I can see is an explicit program to take over the world with a Friendly AI.
I approve of such an effort. If my skepticism is correct, it will be impossible for decades at least; if I'm wrong, I'd rather have an RSI that at least tried to be Friendly. It does seem that the Friendliness part is more important than the RSI part at the start of such an effort.