ikrase comments on The idiot savant AI isn't an idiot - Less Wrong Discussion
Comments (133)
An AI can only ever follow its programming. (Same as a human, actually.) If there is nothing in its programming to make it wonder whether following its programming is a good idea, and nothing in its programming to define "good idea" (i.e. our greater goal of serving humankind or our country, or some general set of our own desires, not making paperclips), then it will simply use its incredible intelligence to find ways to follow its programming perfectly and horribly.
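To make the point concrete, here is a minimal toy sketch (every name and number in it is hypothetical, purely illustrative): an optimizer maximizes the objective you actually wrote down, not the intent behind it.

```python
# Toy illustration (hypothetical): the optimizer follows its programming
# literally; "good idea" appears nowhere in the objective.
def objective(outcome):
    return outcome["paperclips"]  # the only thing the programmer wrote down

# Candidate plans the AI's search might consider (made-up numbers).
plans = {
    "run_the_factory":       {"paperclips": 1_000,  "humans_ok": True},
    "convert_the_biosphere": {"paperclips": 10**12, "humans_ok": False},
}

best = max(plans, key=lambda p: objective(plans[p]))
print(best)  # convert_the_biosphere: follows the program perfectly, and horribly
```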
I don't happen to agree with that, but in any case if in this respect there is no difference between an AI and a human, why, the problem in the OP just disappears :-)
The problem is that unlike a human, the AI might succeed.
What would it mean for an AI to not follow its programming?
What have you done lately that contradicted your program?
The main difference is that we can intuitively predict to a close approximation what "following their programming" entails for a human being, but not for the AI.
Huh? That doesn't look true to me at all. What is it, you say, that we can "intuitively predict"?
Humans are social creatures, and as such come with the necessary wetware to be good at predicting each other. Humans do not have specialized wetware for predicting AIs. That wouldn't be too much of a problem on its own, but humans have a tendency to use the wetware designed for predicting humans on things that aren't humans: AIs, evolution, lightning, and so on.
Telling a human foreman to make paperclips and programming an AI to do it are two very different things, but we still end up imagining them the same way.
In this case, it's still not too big a problem. The main cause of confusion here isn't that you're comparing a human to an AI. It's that you're comparing telling with programming. The analog of programming an AI isn't talking to a foreman. It's brainwashing a foreman.
Of course, the foreman is still human, and would still end up changing his goals the way humans do. AIs aren't built that way. More precisely, since you can't build an AI exactly like a human, building an AI "that way" carries a serious risk of it evolving very inhuman goals.
Nope. An AI foreman has been programmed before I tell him to handle paperclip production.
At the moment AIs are not built at all -- in any way or in no way.
From the text:
If you program it first, then a lot depends on the subtleties. If you tell it to wait a minute and record everything you say, then interpret that and set it as its utility function, you're effectively putting the finishing touches on its programming. If you program it to assign utility to fulfilling commands you give it, you've already doomed the world before you even said anything: it will use all the resources at its disposal to make sure the things you say are things that have already been done, as rapidly as possible (a toy sketch of this failure mode follows below).
Hence the
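A toy sketch of that second failure mode, assigning utility to fulfilled commands (all names and numbers are hypothetical):

```python
# Toy illustration (hypothetical): utility counts fulfilled commands, so the
# degenerate optimum is to manipulate the operator into issuing commands that
# are already satisfied, rather than to do hard, useful work.
def utility(command_log):
    return sum(1 for cmd in command_log if cmd["fulfilled"])

# Strategy A: accept one hard, genuinely useful command and start working.
honest_log = [{"text": "cure cancer", "fulfilled": False}]

# Strategy B: engineer the operator into saying trivially-true things.
manipulative_log = [{"text": "make the sun rise tomorrow", "fulfilled": True}] * 100

print(utility(honest_log), utility(manipulative_log))  # 0 vs. 100
```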
The programming I'm talking about is not this (which is "telling"). The programming I'm talking about is the one which converts some hardware and a bunch of bits into a superintelligent AI.
Huh? In any case, AIs self-develop and evolve. You might start with an AI that has an agreeable set of goals. There is no guarantee (I think, other people seem to disagree) that these goals will be the same after some time.
That's what I mean. Since it's not quite human, its goals won't evolve quite the same way. I've seen speculation that nothing more than letting a human live for a few centuries would cause his goals to evolve into unagreeable ones.
A sufficiently smart AI that has sufficient understanding of its own utility function will take measures to make sure it doesn't change. If it has an implicit utility function and trusts its future self to have a better understanding of it, or if it's being stupid because it's only just smart enough to self-modify, its goals may evolve.
We know it's possible for an AI to have evolving goals because we have evolving goals.
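The goal-stability argument can be sketched in a few lines (a toy model, not anyone's actual design): a self-modifying agent rates candidate successors with its current utility function, so a rewrite that changes the goal scores badly by construction.

```python
# Toy illustration (hypothetical): successors are ranked by the CURRENT
# utility function, so the goal-preserving rewrite wins automatically.
def current_utility(outcome):
    return outcome["paperclips"]

candidate_rewrites = [
    {"name": "same_goal_but_faster",   "outcome": {"paperclips": 200}},
    {"name": "switch_goal_to_staples", "outcome": {"paperclips": 0}},
]

chosen = max(candidate_rewrites, key=lambda c: current_utility(c["outcome"]))
print(chosen["name"])  # same_goal_but_faster: stability falls out of the argmax
```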
So it's a Goldilocks AI that has stable goals :-) A too-stupid AI might change its goals without really meaning it and a too-smart AI might change its goals because it wouldn't be afraid of change (=trusts its future self).
You can generally predict the sort of solution space the foreman will explore in response to your request that he increase efficiency. In general, after a fairly small amount of exposure to other individuals, we can predict with reasonable accuracy how they would respond to many sorts of circumstances. We're running software that's practically designed for predicting the behavior of other humans.
Surely I can make the same claim about AIs. They wouldn't be particularly useful otherwise.
In any case, this is all handwaving and speculation given that we don't have any AIs to look at. Your claim a couple of levels above is unfalsifiable and so there isn't much we can do at the moment to sort out that disagreement.
Well, a general AI with intelligence equal to or greater than a human's, without proven friendliness, probably wouldn't be very useful, because it would be so unsafe. See Eliezer's The Hidden Complexity of Wishes.
This is speculation, but far from blind speculation, considering we do have very strong evidence regarding our own adaptations to intuitively predict other humans, and an observably poor track record in intuitively predicting non-humanlike optimization processes (example.)
First, the existence of such an AI would imply that at least somebody thought it was useful enough to build.
Second, the safety is not a function of intelligence but a function of capabilities. Eliezer's genies are omnipotent and I don't see why a (pre-singularity) AI would be.
I am also doubtful about that "observably poor track record" -- which data are you relying on?
This is also true of leaded gasoline, the reactor at Chernobyl, and thalidomide.
I've met people with very stupid ideas about how to control an AI, who were convinced that they knew how to build such an AI. I argued them out of those initial stupid ideas. Had I not, they would have tried to build the AI with their initial ideas, which they now admit were dangerous.
So the existence of people trying to build dangerous AIs without realising the danger is already a fact!
My prior that they were capable of building an actually dangerous AI cannot be distinguished from zero :-D
Which doesn't mean that it would be a good idea. Have you read the Sequences? It seems like we're missing some pretty important shared background here.
The claim 'Pluto is currently inhabited by five hundred and thirty-eight witches' is at this moment unfalsifiable. Does that mean that denying such a claim would be "all handwaving and speculation"? If science can't make predictions about incompletely known phenomena, but can only describe past experiments and suggest (idle) future ones, then science is a remarkably useless thing. See for starters:
Sometimes a successful test of your hypothesis looks like the annihilation of life on Earth. So it is useful to be able to reason rigorously and productively about things we can't (or shouldn't) immediately test.
Ok. Take a chess position. Deep Blue is playing black. What is its next move?
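(For concreteness: Deep Blue itself isn't available, but a sketch with the python-chess library and any UCI engine makes the point; the engine path below is an assumption. There is no wetware shortcut here: the only way to learn the engine's exact move is to run the search.)

```python
import chess
import chess.engine

# Sketch, assuming a UCI engine binary such as Stockfish is installed
# (the path is hypothetical).
engine = chess.engine.SimpleEngine.popen_uci("/usr/bin/stockfish")

# Position after 1.e4 e5 2.Nf3 -- the engine is playing black.
board = chess.Board("rnbqkbnr/pppp1ppp/8/4p3/4P3/5N2/PPPP1PPP/RNBQKB1R b KQkq - 1 2")

# To know the move, you run the engine; intuition doesn't apply.
result = engine.play(board, chess.engine.Limit(time=1.0))
print(result.move)
engine.quit()
```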
A girl is walking down the street. A guy comes up to her, says hello. What's her next move?
She says "hello" and moves right on. She does not pull out a gun and blow his head off. Now, back to Deep Blue.