Snowyowl comments on AI indifference through utility manipulation - Less Wrong

Post author: Stuart_Armstrong 02 September 2010 05:06PM




Comment author: Snowyowl 02 September 2010 09:40:00PM

U(E)=U(A) is what we desire. That is what the filter is designed to achieve: it forces the AI to act as though the explosives will never detonate, by assigning a successful detonation the same utility as a failed one. This ensures the AI ignores the possibility of being blown up, so it wastes no resources on disarming the explosives - and can therefore be blown up. Difficult, but very useful if it works.
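As a toy illustration of the idea, here is a minimal sketch of such a utility filter. All names and the outcome representation are my own illustrative assumptions, not from the post: outcomes carry a `detonated` flag, and the filter evaluates any "detonation succeeded" world as though the detonation had failed, making U(E) = U(A).

```python
def filtered_utility(raw_utility, outcome):
    """Utility filter sketch: score 'detonation succeeded' worlds
    as though the detonation had failed, so the agent is
    indifferent to being blown up. (Illustrative assumption,
    not the post's formalism.)"""
    if outcome["detonated"]:
        # Same world, but evaluated as if the explosives failed.
        surrogate = dict(outcome, detonated=False)
        return raw_utility(surrogate)
    return raw_utility(outcome)

# Hypothetical raw utility: the agent values "clips", and being
# destroyed (detonated=True) would normally score zero.
def raw_utility(outcome):
    return 0.0 if outcome["detonated"] else float(outcome["clips"])

A = {"clips": 10, "detonated": False}  # detonation attempted, failed
E = {"clips": 10, "detonated": True}   # detonation succeeded

print(filtered_utility(raw_utility, A))  # 10.0
print(filtered_utility(raw_utility, E))  # 10.0 -- U(E) == U(A)
```

Under the raw utility the agent would pay to prevent detonation (0.0 vs 10.0); under the filtered utility the two outcomes score identically, so disarming the explosives buys it nothing.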

The rest of the post deals (once you wade through the notation) with the situation where each outcome can be realised in several different ways, and with the mathematics of the utility filter in that case.

Comment author: Stuart_Armstrong 03 September 2010 11:00:06AM

Exactly.