Liron comments on Counterfactual Mugging v. Subjective Probability - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (32)
Groan! Of all the Omega crap, this is the craziest. Can anyone explain to me why anyone should ever contemplate this impossible scenario? Don't just vote it down.
A subproblem of Friendly AI, or at least a similar problem, is the challenge of proving that properties of an algorithm are stable under self-modification. If we don't identify a provably optimal algorithm for maximizing expected utility in decision-dependent counterfactuals, it's hard to predict how the AI will decide to modify its decision procedure, and it's harder to prove invariants about it.
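The "decision-dependent counterfactual" can be made concrete with the usual counterfactual-mugging payoffs. A minimal sketch, assuming the standard setup (a fair coin; Omega pays you $10,000 on heads only if it predicts you would pay $100 on tails):

```python
P_HEADS = 0.5  # fair coin, per the standard setup

def expected_utility(policy_pays: bool) -> float:
    """Expected utility, evaluated before the coin flip, of an agent
    whose fixed policy is to pay (or refuse) Omega's $100 demand on tails."""
    # Omega rewards you on heads only if it predicts your policy is to pay.
    reward_heads = 10_000 if policy_pays else 0
    # On tails, a paying policy actually costs you $100.
    cost_tails = -100 if policy_pays else 0
    return P_HEADS * reward_heads + (1 - P_HEADS) * cost_tails

print(expected_utility(True))   # committing to pay: 0.5*10000 + 0.5*(-100) = 4950.0
print(expected_utility(False))  # refusing: 0.0
```

The point is that the policy chosen *before* the flip scores better in expectation than the post-flip temptation to refuse, which is exactly the kind of property one would want to prove stable under self-modification.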
Also, if someone else builds a rival AI, you don't want it to be able to trick your AI into deciding to self-destruct by setting up a clever Omega-like situation.
If we can predict how an AI would modify itself, why don't we just write an already-modified AI?
Because the point of a self-modifying AI is that it will be able to self-modify in situations we don't anticipate. Being able to predict its self-modification in principle is useful precisely because we can't hard-code every special case.