x
IRL 6/8: Querying the human: Active Reward Learning — LessWrong