
ciphergoth comments on Approval-directed agents - Less Wrong Discussion

9 Post author: paulfchristiano 12 December 2014 10:38PM


Comments (22)


Comment author: ciphergoth 14 December 2014 04:36:49PM 3 points

This has great potential, thanks! But wouldn't Alfred be motivated to present to virtual Hugh whatever stimulus resulted in virtual Hugh selecting the highest-approval response, even if that means, e.g., hypnosis or brainwashing? I don't see how "turtles all the way down" can solve this, because each level can solve the problem for the level above but then faces the same problem at its own level.

Comment author: paulfchristiano 15 December 2014 07:02:24AM * 2 points

You only have trouble if there is a goal-directed level beneath the lowest approval-directed level. The idea is to be approval-directed at the lowest levels where it makes sense (and below that you are using heuristics, algorithms, etc., in the same way that a goal-directed agent eventually bottoms out with useful heuristics or algorithms).
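To make the layering concrete, here is a minimal toy sketch in Python. It is not from the original post; every name in it (`most_approved`, `normalize`, `decide`, and the score tables) is hypothetical, and the approval scores are invented. It assumes each approval-directed level has access to some model of the overseer's rating, and it uses a fixed string-cleanup rule as a stand-in for the heuristics that sit below the lowest approval-directed level.

```python
from typing import Callable, Sequence, Tuple

# Hypothetical stand-in for a model of the overseer's approval:
# maps an option to a score in [0, 1].
ApprovalModel = Callable[[str], float]


def most_approved(options: Sequence[str], approval: ApprovalModel) -> str:
    """Approval-directed step: return the option the overseer model rates highest."""
    return max(options, key=approval)


def normalize(text: str) -> str:
    """Bottom level: a fixed heuristic with no objective of its own."""
    return text.strip().lower()


def decide(
    actions: Sequence[str],
    presentations: Sequence[str],
    approve_action: ApprovalModel,
    approve_presentation: ApprovalModel,
) -> Tuple[str, str]:
    # Lower level: the way options are shown to the overseer is itself chosen
    # by predicted approval of that presentation, not by which presentation
    # would extract the highest rating for some action.
    presentation = most_approved(
        [normalize(p) for p in presentations], approve_presentation
    )
    # Top level: the action is chosen by predicted approval.
    action = most_approved([normalize(a) for a in actions], approve_action)
    return presentation, action


# Toy ratings, invented purely for illustration.
action_scores = {"tidy the kitchen": 0.8, "do nothing": 0.3}
presentation_scores = {"plain summary": 0.9, "hypnotic framing": 0.0}

choice = decide(
    ["Tidy the kitchen", "Do nothing"],
    ["Plain summary", "Hypnotic framing"],
    lambda a: action_scores[a],
    lambda p: presentation_scores[p],
)
print(choice)  # ('plain summary', 'tidy the kitchen')
```

In this sketch no level is told to maximize the rating by any available means: the choice of presentation is itself rated, so a manipulative framing loses as long as the overseer model disapproves of it, and `normalize` has nothing to optimize at all.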