Stuart_Armstrong comments on An Oracle standard trick - Less Wrong

4 Post author: Stuart_Armstrong 03 June 2015 02:17PM

Comments (33)

Comment author: Stuart_Armstrong 05 June 2015 11:51:36AM 1 point

The "act as if it doesn't believe its messages will be read" behaviour is part of its value function, not its decision theory. So we only require the value function to be stable under self-improvement.

Comment author: Lumifer 05 June 2015 02:31:22PM 0 points

Why is that? The value function tells you what is important, but the "act" part requires decision theory.

Comment author: Stuart_Armstrong 05 June 2015 04:17:20PM 0 points

What I mean is that I haven't wired the decision theory to do something odd (which might be removed by self-improvement); I've just chosen a particular value system (which has a much higher chance of being preserved by self-improvement).
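
To make the distinction concrete, here is a minimal toy sketch in Python. It is only an illustration under assumptions of my own, not the construction from the post: the `World` fields, the two candidate messages, and all probabilities and payoffs are hypothetical. The point it tries to show is that `choose` is plain expected-utility maximisation in both cases, and only the utility function differs.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass(frozen=True)
class World:
    """One possible outcome of sending a message (hypothetical toy model)."""
    message_read: bool   # did anyone read the Oracle's message?
    accuracy: float      # how accurate the message turned out to be
    influence: float     # payoff gained by swaying readers (0 if unread)
    probability: float


def naive_utility(w: World) -> float:
    # Values accuracy plus whatever is gained by influencing readers.
    return w.accuracy + w.influence


def unread_only_utility(w: World) -> float:
    # The "messages will not be read" stance lives here, in the values:
    # worlds where the message is read contribute nothing.
    return 0.0 if w.message_read else w.accuracy + w.influence


def expected_utility(worlds: List[World], utility: Callable[[World], float]) -> float:
    return sum(w.probability * utility(w) for w in worlds)


def choose(actions: Dict[str, List[World]], utility: Callable[[World], float]) -> str:
    # Ordinary expected-utility maximisation; nothing odd is wired in here.
    return max(actions, key=lambda name: expected_utility(actions[name], utility))


# Assumed numbers: the message is almost always read (p = 0.99).
ACTIONS: Dict[str, List[World]] = {
    "honest": [
        World(message_read=True, accuracy=1.0, influence=0.0, probability=0.99),
        World(message_read=False, accuracy=1.0, influence=0.0, probability=0.01),
    ],
    "manipulative": [
        World(message_read=True, accuracy=0.2, influence=5.0, probability=0.99),
        World(message_read=False, accuracy=0.2, influence=0.0, probability=0.01),
    ],
}

naive_choice = choose(ACTIONS, naive_utility)
unread_only_choice = choose(ACTIONS, unread_only_utility)
```

With these made-up numbers, the naive values pick the manipulative message and the unread-only values pick the honest one, with `choose` unchanged between the two. A self-improving agent that keeps its utility function would keep the second behaviour even if it rewrote `choose` into something more efficient.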