I only have time for a short reply:
(1) I'd rephrase the above to say that computer security is among the two most important things one can study with regard to this alleged threat.
(2) The other important thing is law. Law is the "offensive approach to the problem of security" in the sense I suspect you mean it (unless you mean something more like the military). Law is very highly evolved, the work of millions of people as smart or smarter than Yudkowsky over more than a millennium, and tested empirically against the real world of real agents with a real diversity of values every day. It's not something you can ever come close to competing with by a philosophy invented from scratch.
(3) I stand by my comment that "AGI" and "friendliness" are hopelessly anthropomorphic, infeasible, and/or vague.
(4) Computer "goals" are only usefully studied against actual algorithms, or clearly defined mathematical classes of algorithms, not vague and imaginary concepts. Perhaps you can make some progress by, for example, advancing the study of postconditions, which seem to be the closest analog to goals in the software engineering world. One can imagine a world where postconditions are always checked, for example, and other software ignores the output of software that has violated one of its postconditions.
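A minimal Python sketch of that checked-postcondition idea (the decorator and names here are my own illustration, not an established framework):

```python
from functools import wraps

class PostconditionViolation(Exception):
    """Signals that a function's declared postcondition failed."""

def postcondition(check):
    """Wrap a function so its output is verified before anything consumes it."""
    def decorator(fn):
        @wraps(fn)
        def wrapper(*args, **kwargs):
            result = fn(*args, **kwargs)
            if not check(result, *args, **kwargs):
                # Downstream software can catch this and ignore the output,
                # as in the "world where postconditions are always checked".
                raise PostconditionViolation(fn.__name__)
            return result
        return wrapper
    return decorator

@postcondition(lambda result, xs: result == sorted(xs))
def my_sort(xs):
    # Any implementation whose output violates the postcondition is rejected.
    return sorted(xs)
```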
The other important thing is law. Law is the "offensive approach to the problem of security" in the sense I suspect you mean it (unless you mean something more like the military). Law is very highly evolved, the work of millions of people as smart or smarter than Yudkowsky over more than a millennium, and tested empirically against the real world of real agents with a real diversity of values every day. It's not something you can ever come close to competing with by a philosophy invented from scratch.
As a lawyer, I strongly suspect this statement is false. As you seem to be using the term, law is society's organizational rules about how and when to implement coercive violence. In the abstract, this is powerful, but concretely this power is implemented by individuals. Some of them (e.g., police officers) care relatively little about the abstract issues; in other words, they aren't careful about the issues that are relevant to AI.
Further, law is filled with backdoors - they are called legislators. In the United States, Congress can make almost any judicially announced rule irrelevant by passing a statute. If you call that process "Law," then you aren't ...
Law is very highly evolved, the work of millions of people as smart or smarter than Yudkowsky over more than a millennium,
That seems pretty harsh! The Bureau of Labor Statistics reports 728,000 lawyers in the U.S., a notably attorney-heavy society within the developed world. The SMPY study of kids with 1-in-10,000 cognitive test scores found (see page 722) only a small minority studying law. The 90th percentile IQ for "legal occupations" in this chart is a little over 130. Historically, populations were much lower, nutrition was worse, legal education or authority was available only to a small minority, and the Flynn Effect had not occurred. Not to mention that law is disproportionately made by politicians, who are selected for charisma and other factors in addition to intelligence.
and tested empirically against the real world of real agents with a real diversity of values every day. It's not something you can ever come close to competing with by a philosophy invented from scratch.
It's hard to know what to make of this.
Perhaps that the legal system is good at creating incentives that closely align the interests of those it governs with the social good, and that thi...
It's not something you can ever come close to competing with by a philosophy invented from scratch.
I don't understand what you mean by this. Are you saying something like: if a society were ever taken over by a Friendly AI, it would fail to compete against one ruled by law, in either a military or economic sense? Or do you mean "compete" in the sense of providing the most social good? Or something else?
I stand by my comment that "AGI" and "friendliness" are hopelessly anthropomorphic, infeasible, and/or vague.
I disagree with "hopelessly," "anthropomorphic," and "vague," but "infeasible" I may very well agree with, if you mean something like: it's highly unlikely that a human team would succeed in creating a Friendly AGI before it's too late to make a difference, and without creating unacceptable risk. That is why I advocate more indirect methods of achieving it.
Computer "goals" are only usefully studied against actual algorithms, or clearly defined mathematical classes of algorithms, not vague and imaginary concepts.
People are trying to design such algorithms, things like practical approximations ...
It's not something you can ever come close to competing with by a philosophy invented from scratch.
I don't understand what you mean by this.
A sufficient cause for Nick to claim this would be that he believed that no human-conceivable AI design would be able to incorporate by any means, including by reasoning from first principles or even by reference, anything functionally equivalent to the results of all the various dynamics of updating that have (for instance) made present legal systems as (relatively) robust (against currently engineerable methods of exploitation) as they are.
This seems somewhat strange to you, because you believe humans can conceive of AI designs that could reason some things from first principles (given observations of the world that the reasoning needed to be relevant to, plus reasonably anticipatable advantages of computing power over single humans) or incorporate results by reference.
One possible reason he might believe this would be that he believed that, whenever a human reasons about history or evolved institutions, there are something like two distinct levels of a computational complexity hierarchy at work, and that the powers of the greater level...
If anyone's looking for a computer security project that could be relevant to AI risk scenarios, consider looking at the architecture that an AI box would have, and the software at its surface. While boxing is not sufficient, it is useful as part of a defense-in-depth strategy, and it's a relatively cheap and good precaution for groups that don't think their creations will become generally intelligent with much probability. A box would offer some protection against an AI escaping before its birth has been noticed. I'm particularly concerned about terminals (iTerm, xterm, PuTTY, etc.), which would most likely interact with the contents of the box in any boxed-AI setup, and all seem to be neglected targets written in unsafe languages (i.e., C and C++).
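One hedged sketch of what such a surface could look like: a narrow relay in a memory-safe language that strips escape sequences and control bytes before they ever reach a C/C++ terminal emulator. The binary name and the filtering policy below are assumptions for illustration, not a complete sanitizer:

```python
import re
import subprocess

# Strip CSI escape sequences and control bytes (keeping tab and newline).
# Illustrative only -- a real shim would need a full escape-sequence parser.
ESCAPE_OR_CONTROL = re.compile(rb"\x1b\[[0-9;?]*[ -/]*[@-~]|[\x00-\x08\x0b-\x1f\x7f]")

def relay_boxed_output(cmd):
    """Run the boxed program and pass only filtered text on to the terminal."""
    proc = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.STDOUT)
    for raw in proc.stdout:
        clean = ESCAPE_OR_CONTROL.sub(b"", raw)
        print(clean.decode("utf-8", errors="replace"), end="")

# relay_boxed_output(["./boxed_program"])  # hypothetical boxed binary
```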
It would also be nice to have hardware and software for fake (alarmed) network interfaces, soundcard antennas, and pretend VM hosts. Rather than close off the tricks that an AI of unrecognized generality might use to escape its box, we should replace them all with traps; and since AI researchers probably won't make these things themselves, we should provide ready-made solutions for them.
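As a toy example of such a trap, an "alarmed" fake network service might be nothing more than a listener on an interface that nothing inside the box should ever touch, where any connection attempt fires the alarm. The addresses and the `on_trip` hook are placeholders:

```python
import socket

def alarmed_listener(host="10.0.0.1", port=8080, on_trip=None):
    """Decoy service: the first connection attempt is treated as an escape attempt."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as srv:
        srv.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
        srv.bind((host, port))
        srv.listen(1)
        conn, addr = srv.accept()   # blocks until something inside the box connects
        conn.close()
        if on_trip is not None:
            on_trip(addr)           # e.g. alert the researchers, halt the box

# alarmed_listener(on_trip=lambda addr: print("tripwire hit from", addr))
```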
I find it odd that Nick refers to "AGI goals" as an "anthropomorphic [and] hopelessly vague" idea. One model for AGI goals, for example, is the utility function, which is neither anthropomorphic (since humans don't have them) nor vague.
nor vague.
It seems somewhat vague to me in the sense that the domain of the function is underspecified. Is it valuing sensory inputs? Is it valuing mental models? Is it valuing external reality? Is that at all related to what humans would recognize as "goals" (say, the goal of visiting London)?
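A toy illustration of that ambiguity (my own, with made-up names): the same informal goal "visit London" comes out differently as a utility function over world states than as one over sensory inputs:

```python
def utility_over_states(state):
    # Values external reality: actually being in London is what counts.
    return 1.0 if state == "agent_in_london" else 0.0

def utility_over_observations(obs):
    # Values sensory input: a convincing view of Big Ben scores the same as
    # being there -- the classic wireheading ambiguity.
    return 1.0 if obs == "sees_big_ben" else 0.0
```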
FAI is a security risk not a fix:
"One way to think about Friendly AI is that it's an offensive approach to the problem of security (i.e., take over the world), instead of a defensive one."
Not if the AI itself is vulnerable to penetration. By your own reasoning, we have no reason to think it won't be. It may turn out to be one of the biggest security liabilities, because it executes tasks very intelligently and there's no reason to believe it can't be reprogrammed to do unfriendly things.
Friendly AI is only friendly until a human figures out how to abuse it.
Security is solving the problem after the fact, and I think that is totally the wrong approach here. We should be asking whether something can be designed into the AI that prevents people from wanting to take it over, or prevents takeovers from being disastrous (three suggestions for that are included in this comment).
Perhaps the best approach to security is to solve the problems humans are having that cause them to commit crimes. Of course, this appears to be a chicken-or-egg proposition: "Well, the AI can't solve the problems until it's securely built,...
This argument seems be following a common schema:
To understand X, it is necessary to understand its relations to other things in the world.
But to understand its relations to each of the other things that exist, it is necessary to understand each of those things as well.
Y describes many of the things that commonly interact with X.
Therefore, the best way to advance our understanding of X is to learn about Y.
Is that a fair description of the structure of the argument? If so, are you arguing that our understanding of superintelligence needs to be advanced th...
"One Way Functions" aren't strictly one-way; they are just much harder to calculate in one direction than the other. A breakthrough in algorithms, or a powerful enough computer, can solve the problem.
Not really an answer to your question, but it seems to me a lot depends on what position I take wrt value drift and the subject-dependence of values.
At one extreme: if I believe that whatever I happen to value right now is what I value, and what I value tomorrow is what I value tomorrow, and it simply doesn't matter how those things relate to each other - I just want to optimize my environment for what I value at any given moment - then it makes sense to concentrate on security without reference to goals. More precisely, it makes sense to concentrate on mech...
No go. Four reasons.
One:
If the builders have increased their intelligence levels that high, then other people of that time will be able to do the same and therefore potentially crack the AI.
Two:
Also, I may as well point out that your argument is based on the assumption that enough intelligence will make for perfect security. It may be that no matter how intelligent the designers are, their security plans are not perfect. Perfect security looks about as likely, to me, as perpetual motion. No matter how much intelligence you throw at it, you won't get a perpetual motion machine; we'd need to discover some paradigm-shattering physics for that to be so. I suppose it is possible that someone will shatter the physics paradigms by discovering new information, but that's not something to count on when building a perpetual motion machine, especially when you're counting on the perpetual motion machine to keep the world safe.
Three:
Whenever humans have tried to collect too much power into one place, it has not worked out for them. For instance, communism in Russia: the idea was to share all the wealth by letting one group distribute it, and that did not work.
The founding fathers of the USA insisted on checking and balancing the government's power. Surely you are aware of the reasons for that.
If the builders are the only ones in the world with intelligence levels that high, the power of that may corrupt them, and they may make a pact to usurp the AI themselves.
Four:
There may be unexpected thoughts you encounter in that position that seem to justify taking advantage of the situation. For instance, before becoming a jailor, you would assume you're going to be ethical and fair. In that situation, though, people change. (See also: Zimbardo's Stanford prison experiment).
Why do they change? I imagine the reasoning goes a little like this: "Great, I'm in control. Oh, wait. Everyone wants to get out. Okay. And they're a threat to me because I'm keeping them in here. I'm going to get into a lot of power struggles in this job. Even if I lose a power struggle only 1% of the time, the consequences are very dire, so I should probably err on the side of caution - use too much force rather than too little. And if it's okay to use physical force, then how bad is using a little psychological oppression as a deterrent? That will be a bit of extra security for me and help me maintain order in this jail. Considering the serious risk, and the high chance of injury, it's necessary to use everything I've got."
We don't know what kinds of reasoning processes the AI builders will get into at that time. They might be thinking like this:
"We're going to make the most powerful thing in the world, yay! But wait, everyone else wants it. They're trying to hack us, spy on us... there are people out there who would kidnap us and torture us to get a hold of this information. They might do all kinds of horrible things to us. Oh my goodness and they're not going to stop trying to hack us when we're done. Our information will still be valuable. I could get kidnapped years from now and be tortured for this information then. I had better give myself some kind of back door into the AI, something that will make it protect me when I need it. (A month later) Well... surely it's justified to use the back door for this one thing... and maybe for that one thing, too... man I've got threats all over me, if I don't do this perfectly, I'll probably fail ... even if I only make a mistake 1 in 100 times, that could be devastating. (Begins using the back door all the time.) And I'm important. I'm working on the most powerful AI. I'm needed to make a difference in the world. I had better protect myself and err on the side of caution. I could do these preventative things over here... people won't like the limits I place on them, but the pros outweigh the cons, so: oppress." The limits may be seen as evidence that the AI builders cannot be trusted (regardless of how justified they are, there will be some group of people who feels oppressed by new limits, possibly irrational people or possibly people who see a need for the freedom that the AI builders don't) and if a group of people are angry about the limits, they will then be opposed to the AI builders. If they begin to resist the AI builders, the AI builders will be forced to increase security, which may oppress them further. This could be a feedback loop that gets out of hand: Increasing resistance to the AI builders justifies increasing oppression, and increasing oppression justifies increasing resistance.
This is how an AI builder could turn into a jailor.
If part of the goal is to create an AI that will enforce laws, the AI researchers will be part of the penal system, literally. We could be setting ourselves up for the world's most spectacular prison experiment.
Checks and balances, Wei_Dai.
-- Nick Szabo
Nick Szabo and I have very similar backgrounds and interests. We both majored in computer science at the University of Washington. We're both very interested in economics and security. We came up with similar ideas about digital money. So why don't I advocate working on security problems while ignoring AGI, goals, and Friendliness?
In fact, I once did think that working on security was the best way to push the future towards a positive Singularity and away from a negative one. I started working on my Crypto++ Library shortly after reading Vernor Vinge's A Fire Upon the Deep. I believe it was the first general purpose open source cryptography library, and it's still one of the most popular. (Studying cryptography led me to become involved in the Cypherpunks community with its emphasis on privacy and freedom from government intrusion, but a major reason for me to become interested in cryptography in the first place was a desire to help increase security against future entities similar to the Blight described in Vinge's novel.)
I've since changed my mind, for two reasons.
1. The economics of security seems very unfavorable to the defense, in every field except cryptography.
Studying cryptography gave me hope that improving security could make a difference. But in every other security field, both physical and virtual, little progress is apparent, certainly not enough that humans might hope to defend their property rights against smarter intelligences. Achieving "security against malware as strong as we can achieve for symmetric key cryptography" seems quite hopeless in particular. Nick links above to a 2004 technical report titled "Polaris: Virus Safe Computing for Windows XP", which is strange considering that it's now 2012 and malware has little trouble with the latest operating systems and their defenses. Also striking to me has been the fact that even dedicated security software like OpenSSH and OpenSSL has had design and coding flaws that introduced security holes into the systems that run them.
One way to think about Friendly AI is that it's an offensive approach to the problem of security (i.e., take over the world), instead of a defensive one.
2. Solving the problem of security at a sufficient level of generality requires understanding goals, and is essentially equivalent to solving Friendliness.
What does it mean to have "secure property rights", anyway? If I build an impregnable fortress around me, but an Unfriendly AI causes me to give up my goals in favor of its own by crafting a philosophical argument that is extremely convincing to me but wrong (or more generally, subverts my motivational system in some way), have I retained my "property rights"? What if it does the same to one of my robot servants, so that it subtly starts serving the UFAI's interests while thinking it's still serving mine? How does one define whether a human or an AI has been "subverted" or is "secure", without reference to its "goals"? It became apparent to me that fully solving security is not very different from solving Friendliness.
I would be very interested to know what Nick (and others taking a similar position) thinks after reading the above, or if they've already had similar thoughts but still came to their current conclusions.