Liron comments on Counterfactual Mugging v. Subjective Probability - Less Wrong

1 Post author: MBlume 20 July 2009 04:31PM




Comment author: Liron 20 July 2009 11:37:24PM 1 point [-]

A subproblem of Friendly AI, or at least a similar problem, is the challenge of proving that properties of an algorithm are stable under self-modification. If we don't identify a provably optimal algorithm for maximizing expected utility in decision-dependent counterfactuals, it's hard to predict how the AI will decide to modify its decision procedure, and it's harder to prove invariants about it.
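The "decision-dependent counterfactual" in question is Omega's counterfactual mugging from the original post. A minimal sketch of the ex-ante expected-utility comparison, using the standard $100 / $10,000 payoffs from the thought experiment (the function name and structure here are illustrative, not from the thread):

```python
P_HEADS = 0.5  # Omega flips a fair coin

def policy_value(pays_when_asked: bool) -> float:
    """Expected utility of committing to a policy *before* the coin flip.

    Heads: Omega pays $10,000 only if it predicts you would have
    paid on tails. Tails: Omega asks you for $100.
    """
    heads_payoff = 10_000 if pays_when_asked else 0
    tails_payoff = -100 if pays_when_asked else 0
    return P_HEADS * heads_payoff + (1 - P_HEADS) * tails_payoff

print(policy_value(True))   # paying policy: 4950.0
print(policy_value(False))  # refusing policy: 0.0
```

Evaluated at the policy level, paying wins ($4,950 vs $0), even though after seeing tails the causal calculation says refusing saves $100 — which is exactly the gap an AI's decision procedure must handle stably under self-modification.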

Also, if someone else builds a rival AI, you don't want it to be able to trick your AI into deciding to self-destruct by setting up a clever Omega-like situation.

Comment author: CannibalSmith 21 July 2009 08:43:11AM 0 points [-]

If we can predict how an AI would modify itself, why don't we just write an already-modified AI?

Comment author: thomblake 21 July 2009 06:08:32PM 1 point [-]

Because the point of a self-modifying AI is that it will be able to self-modify in situations we don't anticipate. Being able to predict its self-modification in principle is useful precisely because we can't hard-code every special case.