John_Maxwell_IV comments on [draft] Concepts are Difficult, and Unfriendliness is the Default: A Scary Idea Summary - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
At the risk of losing my precious karma, I'll play the devil's advocate and say I disagree.
First, some definitions: a "Friendly" AI, according to Wikipedia, is one that is beneficial to humanity (not a human's buddy or pet). "General" in AGI means not problem-specific (as opposed to narrow AI).
My counterexample is an AI system that lacks any motivations, goals, or actuators. Think of an AIXI system (or, realistically, a system that approximates it), with any reward mechanism subtracted. It just models its world, looking for short programs that describe its input. You could use it to make (super-intelligent) predictions about the future. This seems clearly beneficial to humanity (until it falls into malicious human hands, but that's beside the point of the argument you are making).
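The "short programs that describe its input" idea is Solomonoff induction, which AIXI builds on. True Solomonoff induction is uncomputable, but a toy sketch conveys the shape of it: enumerate candidate models, weight each by 2^(-description length), discard those inconsistent with the observations, and predict by weighted vote. The tiny hand-picked program space below is my own illustrative assumption, not anything from the comment.

```python
# Toy Solomonoff-style predictor (illustrative sketch only).
# Real AIXI searches over all computable programs; here the "program
# space" is a small hand-picked list, each tagged with an assumed
# description length in bits.

def programs():
    # Each entry: (description_length_in_bits, function index -> bit).
    yield 2, lambda i: 0                        # constant zero
    yield 2, lambda i: 1                        # constant one
    yield 3, lambda i: i % 2                    # alternating 0,1,0,1,...
    yield 3, lambda i: (i + 1) % 2              # alternating 1,0,1,0,...
    yield 5, lambda i: 1 if i % 3 == 0 else 0   # period-3 pattern

def predict_next(observed):
    """Weight each program by 2^-length, keep only those that
    reproduce the observed bits, and return the weighted-majority
    prediction for the next bit."""
    weights = {0: 0.0, 1: 0.0}
    n = len(observed)
    for length, prog in programs():
        if all(prog(i) == bit for i, bit in enumerate(observed)):
            weights[prog(n)] += 2.0 ** -length
    return max(weights, key=weights.get)

print(predict_next([0, 1, 0, 1, 0]))  # alternating pattern -> predicts 1
```

Note that nothing in `predict_next` wants anything: it has no reward signal and no actuators, which is exactly the property the comment is pointing at.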
That would make (human[s] + predictor) into an optimization process that is powerful beyond the human[s]'s ability to steer. You might see a nice-looking prediction, but you won't understand the value of the details, or the value of the means used to achieve it. (These would be called trade-offs in a goal-directed mind, but here nothing weighs them.)
It also won't be reliable to look for models in which you are predicted not to hit the Emergency Regret Button, as that may just find models in which your regret evaluator is modified.
Is a human equipped with Google an optimization process powerful beyond the human's ability to steer?
Tell me from China.