solipsist comments on My research priorities for AI control - Less Wrong Discussion

17 Post author: paulfchristiano 06 December 2015 01:57AM

Comment author: solipsist 07 December 2015 04:30:32AM * 2 points

Minor naming feedback. You switched from calling something "supervised learning" to "reinforcement learning". The first images that come to my mind when I hear "reinforcement learning" are TD-Gammon and reward signals. So, when I read "reinforcement learning", I first think of a computer getting smarter through iterative navel-gazing, then think of a computer trying to wirehead itself, then stumble to the meaning I think you intend. I am a lay reader.