
abramdemski comments on Toy model for wire-heading [EDIT: removed for improvement] - Less Wrong Discussion

2 Post author: Stuart_Armstrong 09 October 2015 03:45PM

Comment author: abramdemski 09 October 2015 07:43:48PM 3 points [-]

I agree with your point as stated, but I think a sharper distinction between utility-maximizing and reward-maximizing agents reveals more alternatives.

A reward-maximizing agent attempts to predict the future output of A; D maximizes this predicted future reward.

A utility-maximizing agent has direct access to A; D applies the current A to evaluate possible futures, and maximizes A directly.

In the first case, a superintelligent D would want to wrest control of A and modify it, since rewriting A is the easiest way to drive the predicted reward signal up.

In the second case, when D thinks about the planned modification of A, it evaluates this possible future using the current A. It sees that the current A does not value this future particularly highly. Therefore, it does not wirehead.
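The contrast can be sketched in a few lines of Python. This is my own illustration, not anything from the post; the future names and numeric payoffs are made up for the example. Each candidate future records both its value under the agent's current A and the reward signal A would emit in that future (after any modification of A):

```python
def current_A(future):
    """The agent's current utility function A, applied directly to a future."""
    return future["utility_under_current_A"]

def predicted_reward(future):
    """The reward the (possibly modified) A would report in that future."""
    return future["reward_signal_in_future"]

futures = [
    # Business as usual: A is left intact, so reward tracks utility.
    {"name": "leave A intact",
     "utility_under_current_A": 10, "reward_signal_in_future": 10},
    # Wireheading: rewrite A so it always reports maximal reward.
    {"name": "rewrite A to report max reward",
     "utility_under_current_A": 0, "reward_signal_in_future": 100},
]

# Reward-maximizing D: chooses the future with the highest predicted reward.
reward_choice = max(futures, key=predicted_reward)

# Utility-maximizing D: evaluates every future with the *current* A.
utility_choice = max(futures, key=current_A)

print(reward_choice["name"])   # the reward-maximizer picks the wireheading future
print(utility_choice["name"])  # the current A rejects wireheading
```

The asymmetry is exactly the one in the comment: the reward-maximizer's decision rule looks at the signal as it will exist in the future, while the utility-maximizer evaluates every future with the A it has now, so modifying A scores poorly.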