Short versions of the basic premise about FAI
I've been using something like "A self-optimizing AI would be so powerful that it would just roll over the human race unless it's programmed to not do that."
Any others?
Comments (23)
I like the phrase "we are edible", because it's short and evocative, but it's not really optimized to communicate with the uninitiated.
If the Laws of Thermodynamics are correct then:
(A) There is a limited, non-replenishable amount of free energy in the universe.
(B) Everything anyone does uses up free energy.
(C) When you run out of free energy you die.
(D) Anything a human could do an AI god could do at a lower free energy cost.
(E) Humans use up lots of free energy.
Consequently:
If an AI god didn't like humans it would extinguish us.
If an AI god were indifferent towards humans it would extinguish us to save free energy.
If an AI god had many goals including friendliness towards humanity then it would have an internal conflict because although it would get displeasure from extinguishing humans, killing us would allow it to have more free energy to devote to its other objectives.
We are only safe if an AI god's sole objective is friendliness towards humanity.
It's a bit ironic that current supercomputers are hugely less energy efficient (megawatts) than human brains (20 W).
One of the interesting observations in computing is that Moore's law of processing power is almost as much a Moore's law of energy efficiency. This makes sense since ultimately you have to deal with the waste heat, so if energy consumption (and hence heat production) were not halving roughly every turn of Moore's law, quickly you'd wind up in a situation where you simply cannot run your faster hotter new chips.
This leads to Ozkural's projection that increasing (GPU) energy efficiency is the real limit on any widespread economical use of AI, and given past improvements, we'll have the hardware capability to run cost-effective neuromorphic AI by 2026 and then the wait is just software based...
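As a rough illustration of the efficiency argument above, here is a minimal sketch of how many halvings of power consumption would separate a megawatt-scale machine from a 20 W brain. The 1 MW figure is an assumption standing in for "megawatts", the function name is hypothetical, and the sketch says nothing about how long each halving takes or whether the two systems do comparable work.

```python
import math


def halvings_to_close_gap(current_watts: float, target_watts: float) -> float:
    """Return how many times power must halve to go from current_watts to target_watts."""
    if current_watts <= 0 or target_watts <= 0:
        raise ValueError("power values must be positive")
    return math.log2(current_watts / target_watts)


# Assumption: 1 MW stands in for the "megawatts" quoted in the thread.
SUPERCOMPUTER_WATTS = 1_000_000.0
# 20 W is the brain figure quoted in the thread.
BRAIN_WATTS = 20.0

halvings = halvings_to_close_gap(SUPERCOMPUTER_WATTS, BRAIN_WATTS)
print(f"halvings needed: {halvings:.1f}")  # roughly 15.6
```

Under these assumed figures the gap is a factor of 50,000, or about 15.6 halvings at equal workload.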
I personally don't think we need to talk about self-improving AI at all to consider the problem of friendliness. I would say a viable alternative statement is "Evolution has shaped the values of human minds. Such preferences will not exist in engineered minds unless they are explicitly engineered. Human values are complex, so explicit engineering will be extremely difficult or impossible."
Self-optimization is what makes friendliness a serious problem.
Potentially yes, but I think the problem can be profitably restated without any reference to the Singularity or FOOMing AI. (I've often wondered whether the Friendliness problem would be better recognized and accepted if it were presented without reference to the Singularity.)
Edit: See also Vladimir Nesov's summary, which is quite good, but not quite as short as you're looking for here.
Friendliness would certainly be worth pursuing; it applies to a lot of human issues in addition to what we want from computer programs.
Still, concerns about FOOM are the source of urgency here.
Concerns about FOOM are also what make SIAI look like (and some posters talk like) a loony doom cult.
Skip the "instant godlike superintelligence with nanotech arms" shenanigans, and AI ethics still remains an interesting and important problem, as you observed.
But it's much easier to get people to look at an interesting problem so you can then persuade them that it's serious, than it is to convince them that they are about to die in order to make them look at your problem. Especially since modern society has so inured people to apocalyptic warnings that the wiser half of the population takes them with a few kilograms of salt to begin with.
The Hidden Complexity of Wishes
I do not understand your point. Would you care to explain?
Sorry, I thought that post was a pretty good statement of the Friendliness problem, sans reference to the Singularity (or even any kind of self-optimization), but perhaps I misunderstood what you were looking for.
Oh, I misunderstood your link. I agree, that's a good summary of the idea behind the "complexity of value" hypothesis.
How I attempted to nutshell it for the RW article on EY:
"Yudkowsky identifies the big problem in AI research as being that there is no reason to assume an AI would give a damn about humans or what we care about in any way at all - not having a million years as a savannah ape or a billion years of evolution in its makeup. And he believes AI is imminent. As such, working out how to create a Friendly AI (one that won't kill us, inadvertently or otherwise) is the Big Problem he has taken as his own."
It needs work, but I hope it does justice to the idea in trying to get it across to the general public, or at least people who are somewhat familiar with SF tropes.
This includes everything in my opinion. Goals, utility and value, economics, perspective...or was I supposed to come up with my own version? :-)
That's good, but the "unless it's programmed to not do that" bit should probably be expanded a little bit to make it clear that programming an AI "to not do that" is a lot harder than we intuitively expect.