
eli_sennesh comments on Siren worlds and the perils of over-optimised search

Post author: Stuart_Armstrong 07 April 2014 11:00AM

Comment author: [deleted] 09 April 2014 08:08:05AM

That looks very similar to what I was writing about, though I've tried to be rather more formal/mathematical about it instead of relying on ad-hoc notions of "human", "behavior", "perception", "belief", etc. I would want the learning algorithm to hold uncertain/probabilistic beliefs about the learned utility function, and if I were going to reason about individual human minds, I would rather model those minds directly (as is done in Indirect Normativity).
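
For concreteness, here's a minimal sketch of what "uncertain/probabilistic beliefs about the learned utility function" could look like: a Bayesian posterior over candidate utility functions, updated from observed pairwise human choices. The three-element hypothesis class and the Boltzmann-rational choice model are my own illustrative assumptions, not anything specified in the comment itself.

```python
import math

# Hypothetical candidate utility functions over a scalar outcome.
# The real hypothesis class in the proposal is left unspecified.
CANDIDATES = {
    "u_safety": lambda outcome: -abs(outcome - 0.0),
    "u_comfort": lambda outcome: -abs(outcome - 0.5),
    "u_novelty": lambda outcome: -abs(outcome - 1.0),
}

# Uniform prior over the candidate utility functions.
belief = {name: 1.0 / len(CANDIDATES) for name in CANDIDATES}

def likelihood(utility, chosen, rejected, beta=5.0):
    """Boltzmann-rational choice model (an assumption): probability that
    a human holding `utility` picks `chosen` over `rejected`."""
    eu_c = math.exp(beta * utility(chosen))
    eu_r = math.exp(beta * utility(rejected))
    return eu_c / (eu_c + eu_r)

def update(belief, chosen, rejected):
    """Bayesian update of the belief over utility functions after
    observing one pairwise human choice."""
    posterior = {
        name: p * likelihood(CANDIDATES[name], chosen, rejected)
        for name, p in belief.items()
    }
    z = sum(posterior.values())
    return {name: p / z for name, p in posterior.items()}

# Example: a few observed choices that favour outcomes near 0.5.
for chosen, rejected in [(0.5, 0.9), (0.4, 0.0), (0.6, 1.0)]:
    belief = update(belief, chosen, rejected)

print(belief)  # posterior mass concentrates on u_comfort
```

The point of keeping the belief as a full distribution rather than a point estimate is that the agent can then act conservatively under value uncertainty instead of optimising hard for a single guessed utility function, which is exactly the failure mode siren worlds exploit.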