eli_sennesh comments on The Value Learning Problem - LessWrong

16 Post author: So8res 29 January 2015 06:23PM


Comments (37)


Comment author: [deleted] 31 January 2015 01:37:54PM 3 points

Was the existing literature on preference learning covered or critiqued in this paper?

Comment author: So8res 01 February 2015 05:41:08PM 2 points

Not really. Reinforcement learning is mentioned, and inverse reinforcement learning is briefly discussed, but I'm not aware of much other preference learning literature that is relevant to this particular type of value learning (highly advanced systems learning all of human values). (Exception: Kaj's recent paper, which I'll shortly add as a citation.)

Comment author: [deleted] 04 February 2015 02:08:49PM 3 points

"I'm not aware of much other preference learning literature that is relevant to this particular type of value learning"

I can't imagine there isn't a single paper out there in the literature about supervised learning of VNM-style utility functions over rich, or even weak, hypothesis spaces.

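Here's a trivial example pulled off one minute's Googling. It "counts" because the kernel trick gives a very rich hypothesis space: with a universal kernel (the Gaussian RBF kernel, for instance), functions in the induced Hilbert space can approximate any continuous function on a compact domain.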
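To make "supervised learning of a utility function over a kernel hypothesis space" concrete, here is a minimal sketch. It is not the method from the example mentioned above. It is a plain kernel perceptron trained on pairwise preferences, and it learns only an ordinal utility, not a full VNM utility over lotteries. The names `hidden_utility`, `make_pairs`, `score`, and `train`, the single-peaked target, and the choice `gamma=100.0` are all assumptions made up for illustration.

```python
import math

def rbf_kernel(x, y, gamma=100.0):
    """Gaussian RBF kernel on real numbers (gamma is an arbitrary choice)."""
    return math.exp(-gamma * (x - y) ** 2)

def hidden_utility(x):
    # Hypothetical "true" preference: outcomes near 0.5 are best.
    # It is single-peaked, so no linear utility in x can reproduce it.
    return -(x - 0.5) ** 2

def make_pairs(items, tol=1e-9):
    """All (preferred, dispreferred) pairs; near-ties are skipped."""
    return [(a, b) for a in items for b in items
            if hidden_utility(a) > hidden_utility(b) + tol]

def score(x, pairs, alpha):
    """Learned utility: f(x) = sum_i alpha_i * (k(a_i, x) - k(b_i, x))."""
    return sum(w * (rbf_kernel(a, x) - rbf_kernel(b, x))
               for (a, b), w in zip(pairs, alpha) if w)

def train(pairs, max_epochs=1000):
    """Kernel perceptron: bump a pair's weight whenever it is misordered."""
    alpha = [0.0] * len(pairs)
    for _ in range(max_epochs):
        mistakes = 0
        for i, (a, b) in enumerate(pairs):
            if score(a, pairs, alpha) - score(b, pairs, alpha) <= 0:
                alpha[i] += 1.0
                mistakes += 1
        if mistakes == 0:  # every training pair is ordered correctly
            break
    return alpha

items = [i / 10 for i in range(11)]   # outcomes 0.0, 0.1, ..., 1.0
pairs = make_pairs(items)             # supervision: "a is preferred to b"
alpha = train(pairs)
accuracy = sum(score(a, pairs, alpha) > score(b, pairs, alpha)
               for a, b in pairs) / len(pairs)
print("training pairs ordered correctly:", accuracy)
```

The learner never sees `hidden_utility` directly, only comparisons, and still recovers a non-monotone ranking that a linear model could not express. A narrow kernel like this one mostly memorizes the training items, so it says little about generalization to unseen outcomes.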

I do think that if you've researched this more thoroughly than I have (I'd bet you have, since it's your job), the paper really ought to include a critique of the existing literature, so as to characterize what sections of the unevaluated-potential-solution tree for the value-learning problem should be explored first.