Stuart_Armstrong comments on Proper value learning through indifference - Less Wrong

16 Post author: Stuart_Armstrong 19 June 2014 09:39AM

You are viewing a comment permalink. View the original post to see all comments and the full post content.

Comments (50)

You are viewing a single comment's thread. Show more comments above.

Comment author: Stuart_Armstrong 22 June 2014 08:47:51PM 1 point [-]

By specifying what counts as an allowed change in U, and making the agent in to a U maximiser. Then, just as standard maximises defend their utilities, it should defend U(un clubbing the update, and only that update)