pragmatist comments on AI risk, new executive summary - Less Wrong
Comments (76)
I am trying to use an outside view here, because I find the inside view too limiting. The best I can do is to construct a tower of comparisons between species vastly different in intelligence and conjecture that this tower does not end with humans on top, a Copernican principle, if you like. To use some drastically different pairing, if you agree that an amoeba can never comprehend fish, that fish can never comprehend chimps, that chimps can never understand humans, then there is no reason to stop there and proclaim that humans would understand whatever intelligence comes next.
Certainly language is important, and human language is much more evolved than that of other animals. There are parts of human language, like writing, which are probably inaccessible to chimps, no matter how much effort we put into teaching them and how patient we are. I can easily imagine that AGI would use some kind of "meta-language", because human language would simply be inadequate for expressing its goals, like the chimp language is inadequate for expressing human metaethics.
I do not know what this next step would be, any more than an intelligent chimp could predict that humans would invent writing. My mind as-is is too limited, and I understand as much. An AGI would have to make me smarter first, before being able to explain what it means to me. Call it "human uplifting".
Yes, if you look through the tower of goals, more intelligent species have more complex goals.
It has not been mine. When someone smarter than I am behaves a certain way, they have to patiently explain to me why they do what they do. And I still only see the path they have taken, not the million paths they briefly considered and rejected along the way.
My prejudice tells me that when someone a few levels above mine tries to explain their goals and motivations to me in English, I may understand each word, but not the complete sentences. If you cannot relate to this experience, go to a professional talk on a subject you know nothing about. For example, a musician friend of mine who attended my PhD defense described it as a surreal experience: I was talking in English, and she knew most of the words, but most of what I said was meaningless to her. Certainly some of this gap can be patched to a degree after a decade or so of dedicated work by both sides, fraught with frustration and doubt, but I don't think a wide enough gap can be bridged completely.
I find the line of thinking "we are humans, we are smart, we can understand the goals of even an incredibly smart AGI" to be naive, unimaginative and closed-minded, given that our experience is rife with counterexamples.
OK, but why not look at this tower another way. A fish is basically useless at explaining its goals to an amoeba. We are not in fact useless at explaining our goals to chimps. Human researchers are often able to convey simple goals to chimps, and then see if chimps will help them accomplish those goals, for instance. I am able to convey simple goals to my dog: I can convey to him some information about the kinds of things I dislike and the kinds of things I like.
So the gap in intelligence between fish and humans also seems to translate into a gap in ability to convey useful information about goals to creatures of lower intelligence. Humans are much better at communicating with less intelligent beings than fish or cattle or chimps are. Extrapolating this, you might expect a superintelligent AGI to be far better still at communicating its goals (if it wants to). The line of thinking here is not so much "we are humans, we are smart, we can understand the goals of even an incredibly smart AGI"; it's "an incredibly smart AGI is incredibly smart, so it will be able to find effective strategies for communicating its goals to us if it so desires."
So it seems like naive extrapolation pulls in two separate directions here. On the one hand, the tower of intelligence seems to put limits on the ability of beings lower down to comprehend the goals of beings higher up. On the other hand, the higher up you go, the better beings at that level become at communicating their goals to beings lower down. Which one of these tendencies will win out when it comes to human-AGI interaction? Beats me. I'm pretty skeptical of naive extrapolation in this domain anyway, given Eliezer's point that major advances in optimization power are meta-level qualitative shifts, and so we shouldn't expect trends to be maintained across those shifts.
You are right that we are certainly able to convey a small, simple subset of our goals, desires and motivations to some sufficiently complex animals. You would probably also agree that most of what makes us human can never be explained to a dog or a cat, no matter how hard we try. We appear to them like members of their own species who sometimes make completely incomprehensible decisions they have no choice but to put up with.
This is quite possible. It might give us its dumbed-down version of its 10 commandments which would look to us like an incredible feat of science and philosophy.
Right. An optimistic view is that we can understand the explanations; a pessimistic view is that we would only be able to follow instructions (this is not the most pessimistic view by far).
Indeed, we shouldn't. I probably phrased my point poorly. What I tried to convey is that because "major advances in optimization power are meta-level qualitative shifts", confidently proclaiming that an advanced AGI will be able to convey what it thinks to humans is based on the just-world fallacy, not on any solid scientific footing.