owencb comments on Learning to get things right first time - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (35)
Fair question.
My point is that if improving techniques could take you from (arbitrarily chosen percentages here) a 50% chance that an unfriendly AI would cause an existential crisis, to 25% chance that it would - you really didn't gain all that much, and the wiser course of action is still not to make the AI.
The actual percentages are wildly debatable, of course, but I would say that if you think there is any chance - no matter how small - of triggering ye olde existential crisis, you don't do it - and I do not believe that technique alone could get us anywhere close to that.
The ideas you propose in OP seem wise, and good for society - and wholly ineffective in actually stopping us from creating an unfriendly AI, The reasons are simply that the complexity defies analysis, at least by human beings. The fear is that the unfriendly arises from unintended design consequences, from unanticipated system effects rather than bugs in code or faulty intent
It's a consequence of entropy - there are simply far, far more ways for something to get screwed up than for it to be right. So unexpected effects arising from complexity are far, far more likely to cause issues than be beneficial unless you can somehow correct for them - planning ahead only will get you so far.
Your OP suggests that we might be more successful if we got more of it right "the first time". But - things this complex are not created, finished, de-novo - they are an iterative, evolutionary task. The training could well be helpful, but I suspect not for the reasons you suggested. The real trick is to design things so that when they go wrong - it still works correctly. You have to plan for and expect failure, or that inevitable failure is the end of the line.
I disagree that "you really didn't gain all that much" in your example. There are possible numbers such that it's better to avoid producing AI, but (a) that may not be a lever which is available to us, and (b) AI done right would probably represent an existential eucatastrophe, greatly improving our ability to avoid or deal with future threats.
I have an intellectual issue with using "probably" before an event that has never happened before, in the history of the universe (so far as I can tell).
And - if I am given the choice between slow, steady improvement in the lot of humanity (which seems to be the status quo), and a dice throw that results in either paradise, or extinction - I'll stick with slow steady, thanks, unless the odds were overwhelmingly positive. And - I suspect they are, but in the opposite direction, because there are far more ways to screw up than to succeed, and once the AI is out - you no longer have a chance to change it much. I'd prefer to wait it out, slowly refining things, until paradise is assured.
Hmm. That actually brings a thought to mind. If an unfriendly AI was far more likely than a friendly one (as I have just been suggesting) - why aren't we made of computronium? I can think of a few reasons, with no real way to decide. The scary one is "maybe we are, and this evolution thing is the unfriendly part..."