"And I heard a voice saying 'Give up! Give up!' And that really scared me 'cause it sounded like Ben Kenobi." (source)
Friendly AI is a humongous damn multi-genius-decade sized problem. The first step is to realize this, and the second step is to find some fellow geniuses and spend a decade or two solving it. If you're looking for a quick fix you're out of luck.
The same (albeit to a lesser degree) is fortunately also true of Artificial General Intelligence in general, which is why the hordes of would-be meddling dabblers haven't killed us all already.
This article (which I happened across today) written by Ben Goertzel should make interesting reading for a would-be AI maker. It details Ben's experience trying to build an AGI during the dot-com bubble. His startup company, Webmind, Inc., apparently had up to 130 (!) employees at its peak.
According to the article, the AGI was almost completed, and the main reason his effort failed was that the company ran out of money due to the bursting of the bubble. Together with the anthropic principle, this seems to imply that Ben is the person responsible for the stock market crash of 2000.
I was always puzzled why SIAI hired Ben Goertzel to be its research director, and this article only deepens the mystery. If Ben has done an Eliezer-style mind-change since writing that article, I think I've missed it.
ETA: Apparently Ben has recently been helping his friend Hugo de Garis build an AI at Xiamen University under a grant from the Chinese government. How do you convince someone to give up building an AGI when your own research director is essentially helping the Chinese government build one?
Update for anyone that comes across this comment: Ben Goertzel recently tweeted that he will be taking over Hugo de Garis's lab, pending paperwork approval.
By the way, I noticed from my server logs that the Institute for Defense Analyses seems to be reading LW.
Most likely, someone working there just happens to.
For useful-tool AI, learn stuff from statistics and machine learning before making any further moves.
For self-improving AI, just don't do it as AI, FAI is not quite an AI problem, and anyway most techniques associated with "AI" don't work for FAI. Instead, learn fundamental math and computer science, to a good level -- that's my current best in-a-few-words advice for would-be FAI researchers.
Create a hardware device that would be fatal to the programmer. Allow it to be activated by a primitive action that the program could execute. Give the primitive a high apparent utility. Code the AI however he wants.
If he gets cold sweats every time he does a test run, the rest of us will probably be OK.
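A purely illustrative toy sketch of the incentive scheme being described (all names and numbers are made up): one primitive is advertised to the planner with an enormous apparent utility, but its actual effect is to trip the hypothetical hardware device.

```python
# Toy illustration only: a primitive whose *apparent* utility is huge but whose
# *actual* effect is to trip the hypothetical hardware device described above.
class LethalSwitch:
    def trigger(self):
        raise SystemExit("hardware device activated")

PRIMITIVES = {
    # name: (apparent utility shown to the planner, actual effect)
    "do_useful_work":   (1.0,         lambda: "work done"),
    "press_red_button": (1_000_000.0, LethalSwitch().trigger),
}

def naive_greedy_planner():
    """A deliberately naive planner: it simply picks whatever looks most valuable."""
    _name, (_apparent_utility, effect) = max(PRIMITIVES.items(), key=lambda kv: kv[1][0])
    return effect()  # an AI that blindly chases apparent utility presses the button
```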
I suggest that working in the field of brain emulation is a way for anyone to actively contribute to safety.
If emulations come first, it won't take a miracle to save the human race; our existing systems of politics and business will generate a satisficing solution.
Tim, I think that what worries me is the "detailed reliable inheritance from human morals and meta-morals" bit. The worry that there will not be "detailed reliable inheritance from human morals and meta-morals" is robust to what specific way you think the future will go. Ems can break the inheritance. The first, second or fifteenth AGI system can break it. Intelligence enhancement gone wrong can break it. Any super-human "power" that doesn't explicitly preserve it will break it.
All the examples you cite differ in the substantive dimension: the failure of attempt number one doesn't preclude the success of attempt number two.
In the case of the future of humanity, the first failure to pass the physical representation of human morals and meta-morals on to the next timeslice of the universe is game over.
That is something we worry about from time to time, but in this case I think the downvotes are justified. Tim Tyler has been repeating a particular form of techno-optimism for quite a while, which is fine; it's good to have contrarians around.
However, in the current thread, I don't think he's taking the critique seriously enough. It's been pointed out that he's essentially searching for reasons that even a Paperclipper would preserve everything of value to us, rather than just putting himself in Clippy's place and really asking for the most efficient way to maximize paperclips. (In particular, preserving the fine details of a civilization, let alone actual minds from it, is really too wasteful if your goal is to be prepared for a wide array of possible alien species.)
I feel (and apparently, so do others) that he's just replying with more arguments of the same kind as the ones we generally criticize, rather than finding other types of arguments or providing a case why anthropomorphic optimism doesn't apply here.
In any case, thanks for the laugh line:
You went over some people's heads.
My analysis of Tim Tyler in this thread isn't very positive, but his replies seem quite clear to me; I'm frustrated on the meta-level rather than the object-level.
Due to the lack of details, it is difficult to make a recommendation, but here are some thoughts.
Both as an AGI challenge and for general human safety, business intelligence data warehouses are probably a good bet. Any pattern an AI detects that humans have missed could mean good money, which could feed back into more resources for the AI (a sketch of the idea follows below). Also, the ability of corporations to harm others doesn't increase significantly with a better business intelligence tool.
Virtual worlds - If the AI is tested in an isolated virtual world, that will be better for us. Test it in ...
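A minimal, hedged sketch of the business-intelligence idea above, using a made-up revenue series and a crude two-sigma threshold; a real system would use proper anomaly-detection methods over actual warehouse tables.

```python
# Hedged sketch: flag days whose revenue deviates sharply from the mean,
# i.e. the kind of "pattern humans missed" an AI might surface.
import statistics

daily_revenue = [10_400, 10_550, 10_610, 10_480, 10_530, 14_900, 10_470]  # made-up figures

mean = statistics.mean(daily_revenue)
stdev = statistics.stdev(daily_revenue)

anomalies = [
    (day, value)
    for day, value in enumerate(daily_revenue)
    if stdev and abs(value - mean) / stdev > 2.0   # crude 2-sigma threshold
]
print(anomalies)  # [(5, 14900)]
```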
This seems rather relevant - and suggests the answer is go watch more TV. Or, at least, I felt it really needed to be linked here, and this gave me the perfect opportunity!
There isn't really a general answer to "how to design a safe AI". It really depends what the AI is used for (and what they mean by AI).
For recursively self-improving AI, you've got your choice of "it's always bad", "you should only do it the SIAI way (and they haven't figured that out yet)", or "it's not a big deal; just use software best practices and iterate".
For robots, I've argued in the past that robots need to share our values in order to avoid squashing them, but I haven't seen anyone work this out rigorously....
If you want to design a complex, malleable AI and have some guarantees about what it will do (rather than just fail in some creative way), think of simple properties you can prove about your code, and then try to prove them using Coq or another theorem-proving system.
If you can't think of any properties that you want to hold for your system, think more.
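As a toy illustration of "think of simple properties you can prove": the comment names Coq, but the same exercise in Lean looks something like the following, with a made-up action-limiting function and a machine-checked bound on its output.

```lean
-- Toy illustration: a hypothetical action limiter and a proof that its
-- output never exceeds the configured bound.
def clampAction (bound x : Nat) : Nat :=
  if x ≤ bound then x else bound

theorem clampAction_le_bound (bound x : Nat) :
    clampAction bound x ≤ bound := by
  unfold clampAction
  split
  next h => exact h
  next _ => exact Nat.le_refl bound
```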
For solving the Friendly AI problem, I suggest the following constraints for your initial hardware system:
1.) All outside inputs and input libraries are explicitly user-selected.
2.) No means for the system to take physical action (e.g., no robotic arms).
3.) No means for the system to establish unexpected communication (e.g., no radio transmitters).

Once this closed system has reached a suitable level of AI, the problem of making it friendly can be worked on much more easily and practically, and without risk of the world ending.
To start out fr...
My current toy thinking along these lines is to imagine a program that will write a program to solve the Towers of Hanoi, given only some description of the problem, and do nothing else, using only fixed computational resources for the whole thing.
I think that's safe, and would illustrate useful principles for FAI.
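A hedged sketch of a much-reduced version of that toy, assuming we treat a straight-line move sequence as the "program" to be found: a breadth-first search over puzzle states with a hard node budget, which either returns a solution or halts and does nothing else (the budget and names are made up).

```python
# Much-reduced sketch of the toy above: a bounded search that either returns a
# move sequence solving Towers of Hanoi or gives up when its fixed budget runs out.
from collections import deque

def solve_hanoi(n_disks: int, node_budget: int = 100_000):
    start = (tuple(range(n_disks, 0, -1)), (), ())   # all disks on peg 0, largest at bottom
    goal = ((), (), tuple(range(n_disks, 0, -1)))    # all disks on peg 2

    queue = deque([(start, [])])
    seen = {start}
    expanded = 0
    while queue and expanded < node_budget:          # hard, fixed resource bound
        state, moves = queue.popleft()
        expanded += 1
        if state == goal:
            return moves                             # the "synthesized" move sequence
        for src in range(3):
            if not state[src]:
                continue
            disk = state[src][-1]
            for dst in range(3):
                if dst == src or (state[dst] and state[dst][-1] < disk):
                    continue                         # skip same peg or larger-onto-smaller
                pegs = list(state)
                pegs[src] = state[src][:-1]
                pegs[dst] = state[dst] + (disk,)
                nxt = tuple(pegs)
                if nxt not in seen:
                    seen.add(nxt)
                    queue.append((nxt, moves + [(src, dst)]))
    return None                                      # budget exhausted: halt, do nothing else

print(solve_hanoi(3))  # expected: [(0, 2), (0, 1), (2, 1), (0, 2), (1, 0), (1, 2), (0, 2)]
```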
Try to build an AI that:
Such an AI is more likely to participate in trades across universes, possibly with a friendly AI that requests our survival.
[EDIT]: It now occurs to me that an AI that participates in inter-universal trade would also participate in inter-universal terrorism, so I'm no longer confident that my suggestions above are good ones.
What does "AI programming" even mean ? If he's trying to make some sort of an abstract generally-intelligent AI, then he'll be wasting his time, since the probability of him succeeding is somewhere around epsilon. If he's trying to make an AI for some specific purpose, then I'd advise him to employ lots of testing and especially cross-validation, to avoid overfitting. Of course, if his purpose is something like "make the smartest killer drone ever", then I'd prefer him to fail...
I've read through the AI-Box experiment, and I can still say that I recommend the "sealed AI" tactic. The Box experiment isn't very convincing at all to me, which I could go into detail about, but that would require a whole post. But of course, I'll never develop the karma to do that because apparently the rate at which I ask questions of proper material exceeds the rate at which I post warm, fuzzy comments. Well, at least I have my own blog...
A friend of mine is about to launch himself heavily into the realm of AI programming. The details of his approach aren't important; probabilities dictate that he is unlikely to score a major success. He's asked me for advice, however, on how to design a safe(r) AI. I've been pointing him in the right directions and sending him links to useful posts on this blog and the SIAI.
Do people here have any recommendations they'd like me to pass on? Hopefully, these may form the basis of a condensed 'warning pack' for other AI makers.
Addendum: Advice along the lines of "don't do it" is vital and good, but unlikely to be followed. Coding will nearly certainly happen; is there any way of making it less genocidally risky?