SteveG comments on Recent AI safety work - Less Wrong Discussion

20 Post author: paulfchristiano 30 December 2014 06:19PM

Comment author: SteveG 01 January 2015 10:23:08PM 0 points

In addition to determining whether an action would be approved using a priori reasoning, an approval-directed AI could also reference a large database of past actions which have either been approved or disapproved.

Alternatively, before ever making a real-world decision, the approval-directed AI could generate many thousands of example scenarios with proposed actions and present them to people deemed effective moral reasoners. Their responses would greatly assist the system in constructing a model of whether an action is approvable, and by whom.

A lot of approval data could be created fairly readily in this way, and the AI could then train on it.
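
To make the idea concrete, here is a rough sketch of what "training on approval data" might look like in the simplest case. Everything in it is an assumption made for illustration: the two features (expected benefit and risk of harm), the toy records in `APPROVAL_DATA`, and the choice of plain logistic regression. A real system would need far richer representations of actions and of who is doing the approving.

```python
import math

# Hypothetical approval records: (features, label).
# features = [expected_benefit, risk_of_harm], both invented for this sketch.
# label 1 = a reviewer approved the action, 0 = a reviewer disapproved.
APPROVAL_DATA = [
    ([0.9, 0.1], 1),
    ([0.8, 0.2], 1),
    ([0.7, 0.1], 1),
    ([0.2, 0.9], 0),
    ([0.3, 0.8], 0),
    ([0.1, 0.7], 0),
]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def predict_approval(weights, bias, features):
    """Estimated probability that the action would be approved."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    return sigmoid(z)


def train(data, epochs=2000, learning_rate=0.5):
    """Fit logistic regression with stochastic gradient descent."""
    n_features = len(data[0][0])
    weights = [0.0] * n_features
    bias = 0.0
    for _ in range(epochs):
        for features, label in data:
            # Gradient of the log loss with respect to z is (prediction - label).
            error = predict_approval(weights, bias, features) - label
            for i, x in enumerate(features):
                weights[i] -= learning_rate * error * x
            bias -= learning_rate * error
    return weights, bias


weights, bias = train(APPROVAL_DATA)

# Score two actions the model has not seen before.
likely_ok = predict_approval(weights, bias, [0.85, 0.15])
likely_bad = predict_approval(weights, bias, [0.15, 0.85])
```

With this toy data the model should rate the high-benefit, low-risk action as more likely to be approved than the low-benefit, high-risk one. Modelling "by whom" would need something more, for example a reviewer identifier as an extra feature or a separate model per reviewer.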