Stuart_Armstrong comments on One weird trick to turn maximisers into minimisers - Less Wrong Discussion

1 Post author: Stuart_Armstrong 22 April 2016 04:47PM

Comments (9)

Comment author: Stuart_Armstrong 26 April 2016 05:49:24PM 0 points

I'm trying to implement value change (see e.g. http://lesswrong.com/lw/jxa/proper_value_learning_through_indifference/ ). The change from u to -u is the easiest example of such a change. The ideal, which probably can't be implemented in a standard utility function, is a u-maximiser that's indifferent to becoming a -u maximiser, which is then indifferent to further change, etc...

Comment author: Luke_A_Somers 27 April 2016 11:51:24AM * 0 points

Well, then, let's change the example. Instead of + on Monday, - on Tuesday, and + on Wednesday and all later times, with it unable to actually affect paperclip counts on Tuesday, let's consider a single transition: u+ on Monday, Tuesday and Wednesday, then u- on Thursday and all later times, and it already has all the infrastructure it needs.

In this case, it will see that it can get a + score by having paperclips Monday through Wednesday, but that any it still has on Thursday will count against it.

So, it will build paperclips as soon as it learns of this pattern. It will make them have a low melting point, and it will build a furnace†. On Wednesday evening at the stroke of midnight, it will dump its paperclips into the furnace. Because all along, from the very beginning, it will have wanted there to be paperclips M-W, and none after that. And on Thursday it will be happy that there were paperclips M-W, but glad that there aren't any now.
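To make that concrete, here is a toy sketch of the M-W then Thursday schedule. Everything in it is an assumption made for illustration: the five-day horizon, the cap of 10 paperclips, the names `SIGN`, `total_score` and `best_plan`, and the simplification that the agent can freely choose how many paperclips exist each day. It only shows that an agent which knows the whole schedule in advance and scores plans against it prefers "paperclips through Wednesday, none from Thursday on".

```python
from itertools import product

DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri"]
# Hypothetical schedule: +1 while u+ applies, -1 once u- applies.
SIGN = {"Mon": 1, "Tue": 1, "Wed": 1, "Thu": -1, "Fri": -1}
MAX_CLIPS = 10  # assumed cap on paperclips existing on any one day


def total_score(plan):
    """Score a plan (paperclips existing on each day) under the known schedule."""
    return sum(SIGN[day] * clips for day, clips in zip(DAYS, plan))


def best_plan():
    """Brute-force the plan preferred by an agent that foresees the whole schedule."""
    options = (0, MAX_CLIPS)  # either no paperclips, or as many as possible
    return max(product(options, repeat=len(DAYS)), key=total_score)


plan = best_plan()
print(dict(zip(DAYS, plan)))
# {'Mon': 10, 'Tue': 10, 'Wed': 10, 'Thu': 0, 'Fri': 0}
```

The drop from 10 to 0 between Wednesday and Thursday is the furnace: nothing about the agent changes at midnight, it is just carrying out the plan it preferred from Monday onwards.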

I think that the trick is getting it to submit to changes to its utility function based on what we want at that time, without trying to game it. That's going to be much harder.

† and, if it suspects that there are paperclips out in the wild, it will begin building machines to hunt them down, and iff it's Thursday or later, destroy them. It will do this as soon as it learns that it will eventually be a paperclip minimizer for long enough that it is worth worrying about.