Stuart_Armstrong comments on Value learning: ultra-sophisticated Cake or Death - Less Wrong

Post author: Stuart_Armstrong 17 June 2014 04:36PM


Comment author: Stuart_Armstrong 19 June 2014 02:18:02PM 0 points

I understand that bad news makes one sad, but does that lead to rejecting bad news?

For standard Bayesian agents, no. But these value-updating agents behave differently. Imagine a human said to the AI: "If I say good, your action was good, and that will become your values. If I say bad, it will be the reverse." Wouldn't you want to motivate it to say "good"?
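To make the contrast concrete, here is a toy sketch (my own illustration, with made-up actions and probabilities, not anything from the post): a fixed-utility Bayesian agent treats the human's verdict as mere evidence, while a naive value-updater's utility is *set* by the verdict, so it is rewarded for manipulating it.

```python
# Two hypothetical actions: "bake" (honest) and "pressure" (manipulate
# the human into saying "good"). Probabilities are assumed for the sketch.
p_good = {"bake": 0.6, "pressure": 0.95}

# Fixed-utility Bayesian agent: the utility of each act is fixed in
# advance, so making the human say "good" changes nothing it cares about.
fixed_utility = {"bake": 1.0, "pressure": 0.2}
bayes_choice = max(fixed_utility, key=fixed_utility.get)

# Naive value-updater: its utility becomes +1 if the human says "good"
# and -1 if "bad", so expected utility = 2*P(good | action) - 1, and it
# profits from making "good" more likely -- by manipulation if need be.
def updater_eu(action):
    return 2 * p_good[action] - 1

updater_choice = max(p_good, key=updater_eu)

print(bayes_choice)    # "bake"
print(updater_choice)  # "pressure"
```

The point of the sketch is only that the incentive flips: nothing in the value-updater's objective distinguishes *being* good from *being told* it was good.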

Comment author: Slider 19 June 2014 07:55:43PM 0 points

I have trouble seeing the difference, as I think you can turn the variable value statements into empirical facts that map to a constant value. Say that cake->yummy->good, cake->icky->bad, death->icky->bad, death->yummy->good. Then the yummy->good connection could be questioned as a matter about the world and not about values. If a Bayesian accepts sad news in that kind of world, how come the value loader tries to shun it?

Comment author: Stuart_Armstrong 20 June 2014 10:33:30AM 0 points
Comment author: [deleted] 19 June 2014 07:33:28PM 0 points

Wouldn't you want to motivate it to say "good"?

I might be committing mind-projection here, but no. Data is data, evidence is evidence. Expected moral data is, in some sense, moral data: if the AI predicts with high confidence that I will say "bad", this ought to already be evidence that it ought not have done whatever I'm about to scold it for.

Comment author: Stuart_Armstrong 20 June 2014 10:33:17AM 0 points