Vaniver comments on Debunking Fallacies in the Theory of AI Motivation - Less Wrong
Comments (343)
This is just a placeholder: I will try to reply to this properly later.
Meanwhile, I only want to add one little thing.
Don't forget that all of this analysis is supposed to be about situations in which we have, so to speak, "done our best" with the AI design. That is sort of built into the premise. If there is a no-brainer change we can make to the design of the AI, to guard against some failure mode, then it is assumed that this has been done.
The reason for that is that the basic premise of these scenarios is "We did our best to make the thing friendly, but in spite of all that effort, it went off the rails."
For that reason, I am not really making arguments about the characteristics of a "generic" AI.
Thanks, and take your time!
I feel like this could be an endless source of confusion and disagreement; if we're trying to discuss what makes airplanes fly or crash, should we assume that engineers have done their best and made every no-brainer change? I'd rather we look for the underlying principles, codify best practices, and come up with lists and tests.
If you are in the business of pointing out to them potential problems they are not aware of, then yes, because they can be assumed to be aware of no-brainer issues.
MIRI seeks to point out dangers in AI that aren't the result of gross incompetence or deliberate attempts to weaponise AI: it's banal to point out that these could lead to danger.