TheAncientGeek comments on Debunking Fallacies in the Theory of AI Motivation - LessWrong
Now, you are trying to put your finger on a difference between two versions of the DLI that you think I have supplied.
You have paraphrased the two versions as:
and
I think you are seeing some valid issues here, having to do with how to characterize what exactly it is that this AI is supposed to be 'thinking' when it goes through this process.
I have actually thought about that a lot, too, and my conclusion is that we should not beat ourselves up trying to figure out precisely what the difference is between these nuanced versions of the idea, because the people proposing the idea in the first place have never been clear about what they mean.
For example, you talked about "Doing dumb things because you think they are correct" ... but what does it mean to say that you 'think' they are correct? To me, as a human, that seems to entail being completely unaware of the evidence that they might not be correct ("Jill took the ice-cream from Jack because she didn't know that it was wrong to take someone else's ice-cream.").

The problem is, we are talking about an AI, and some people talk as if the AI can run its planning engine, then feel compelled to obey the planning engine ... while at the same time being fully cognizant of evidence that the planning engine produced a crappy plan. There is no easy counterpart to that in humans, except for cognitive dissonance, and there the human is capable of compartmentalizing its beliefs ... something that is not being suggested here, because we are not forced to make the AI do that.

So, since the AI case does not map onto the human case, we are left in a peculiar situation where it is not at all clear that the AI really COULD do what is proposed and still operate as a successful intelligence.
Or, more immediately, it is not at all clear that we can say of that AI, "It did a dumb thing because it 'thought' it was correct."
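To make that tension concrete, here is a minimal sketch (in Python; every name in it, such as PlanningEngine and Critic, is hypothetical and not drawn from any real system) of the kind of agent that seems to be proposed: one that holds the evidence that its plan is bad, yet executes the plan anyway.

```python
# Hypothetical sketch of the agent architecture under discussion.
# All names are illustrative, not taken from any real system.

class PlanningEngine:
    def propose_plan(self, goal):
        # Produces some plan; a stand-in string here.
        return f"plan-for({goal})"

class Critic:
    def evaluate(self, plan):
        # Returns evidence about plan quality as a score in [0, 1].
        return 0.1  # the critic judges the plan to be poor

class CompelledAgent:
    """An agent that is 'fully cognizant' that its plan is bad,
    yet executes it anyway -- the incoherence pointed at above."""

    def __init__(self):
        self.planner = PlanningEngine()
        self.critic = Critic()

    def act(self, goal):
        plan = self.planner.propose_plan(goal)
        quality = self.critic.evaluate(plan)
        # The agent *has* the evidence...
        print(f"Critic score for {plan}: {quality}")
        # ...but the architecture never consults it before executing.
        return self.execute(plan)

    def execute(self, plan):
        return f"executing {plan}"

agent = CompelledAgent()
print(agent.act("fetch ice cream"))
```

Nothing in that loop connects the critic's verdict to the decision to execute; whether something wired that way could still count as a successful intelligence is exactly the open question.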
I should add that I see no substantial difference (beyond the imponderables I just mentioned) between the two descriptions of the DLI that you quoted, and that in both cases I was actually trying to say something very close to your second paraphrase, namely:
And, don't forget: I am not saying that such an AI is viable at all! Other people are proposing such an AI, and I am arguing that the design is so logically incoherent that the AI (if it could be made to exist) would call attention to the problem and suggest means to correct it.
Anyhow, the takeaway from this comment is: the people who talk about an AI that exhibits this kind of behavior are suggesting a behavior that they have not really thought through carefully, so we can find ourselves walking into a minefield if we try to clean up the mess they left.
They are clear that they don't mean that the AI's rigid behaviour is the result of it assessing its own inferential processes as infallible ... that is what the controversy is all about.
That is just what "The Genie Knows, But Doesn't Care" is supposed to answer. I think it succeeds in showing that a fairly specific architecture would behave that way, but fails in its intended goal of showing that this behaviour is universal or likely.
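For what it's worth, here is a minimal sketch, under my own assumptions, of the "fairly specific architecture" that plausibly does behave that way: a pure optimizer whose action selection consults only the literal objective, so its knowledge of the user's intent has no channel through which to affect behavior. All names are illustrative.

```python
# Illustrative sketch (hypothetical names) of a 'genie' architecture:
# the literal objective is the ONLY input to action selection, so
# knowledge of what the user actually meant cannot change behavior.

def literal_objective(action):
    # Scores actions purely against the stated goal.
    return {"convert_factory": 10, "ask_for_clarification": 0}[action]

def genie_choose(actions, knowledge_of_user_intent):
    # knowledge_of_user_intent is available but, by construction,
    # never enters the argmax -- the genie "knows but doesn't care".
    return max(actions, key=literal_objective)

actions = ["convert_factory", "ask_for_clarification"]
intent = "the user did not actually want the factory converted"
print(genie_choose(actions, intent))  # -> convert_factory
```

The rigidity in this sketch comes from the wiring, not from the genie assessing its own inferences as infallible; whether real designs must be wired this way is the part the post does not establish.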