
Slider comments on Conservation of expected moral evidence, clarified - Less Wrong Discussion

11 Post author: Stuart_Armstrong 20 June 2014 10:28AM


Comment author: Slider 23 June 2014 07:00:49AM 1 point

A child will eventually learn the independence. It's a very different thing to hardcode the assumption of independence from the start. Values are also complex, and there might not be a natural line on what it is natural to care about. If I take the morality of the agent to be independent of its decisions really far, it will mean the AI believes it is already a good boy or a bad boy no matter what it does (it just doesn't know which). This seems to be the opposite of what moral behavior is about. There is also no telling that the valued actions need to be simple concepts. After all, cake is good unless you have diabetes, and death might not be all bad if it's euthanasia.

Also, if I am inconsistent or plainly make an error in assigning the labels, I would rather have the AI open a dialogue with me about it than silently construct a really convoluted utility function.

We could of course think of a value loader as a two-step process, where the loaded values are not taken as-is as the actual utility function but are modified to make sense in a "non-cheating" way. That is, the actual utility function always contains terms about moral fairness even if they are not explicitly input. This reduces the amount of values that are softcoded, since the meta-morality is hardcoded. But this layering also solves the original problem. The agent might evaluate the new situation as having lower expected utility and would choose not to go there, if it had the choice. By layering, we take that choice away, so the wrong valuation doesn't matter. By enforcing a hand-inputted conservation of moral evidence we tie the AI's hands in forming a moral strategy: "we know better" beforehand.
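The two-step loader described above can be sketched in code. This is only a toy illustration of the layering idea, not anything from the original discussion: the function names, the "cake" values, and the fairness penalty are all invented for the example. The point is just that the hardcoded meta-moral layer wraps whatever values get loaded, so the agent never acts on the raw loaded utility directly.

```python
# Toy sketch of a two-step value loader: softcoded values pass through a
# hardcoded meta-moral layer before becoming the effective utility function.
# All names and numbers here are illustrative.

def load_values(raw_utility, meta_moral_terms):
    """Wrap a loaded (softcoded) utility with hardcoded meta-moral terms."""
    def effective_utility(outcome):
        score = raw_utility(outcome)
        for term in meta_moral_terms:
            # Terms the human never explicitly typed in, e.g. fairness.
            score += term(outcome)
        return score
    return effective_utility

# The loaded values only care about cake...
raw = lambda outcome: outcome.get("cake", 0)
# ...but the hardcoded layer penalises cheating regardless of loaded values.
fairness = lambda outcome: -10 * outcome.get("cheated", 0)

u = load_values(raw, [fairness])
u({"cake": 3, "cheated": 1})  # 3 - 10 = -7
```

The agent has no choice about the fairness term: it is applied in the wrapper, outside the loaded values, which is the "tying its hands" the comment describes.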

Comment author: Stuart_Armstrong 23 June 2014 01:46:57PM 0 points

An interesting point, hinting that my approach to moral updating ( http://lesswrong.com/lw/jxa/proper_value_learning_through_indifference/ ) may be better than I supposed.

Comment author: Slider 25 June 2014 07:26:44AM 1 point

I was more getting at the fact that it narrows down the problem instead of generalising it. It reduces the responsibilities of the AI and widens those of humans. If you solved this problem, you would only get up to the level of the most virtuous human (which isn't exactly bad). Going beyond that would require ethics competency that would have to be added separately, since we are tying its hands in this department.

Comment author: Stuart_Armstrong 25 June 2014 10:30:20AM 0 points

I take the point in practice, but there's no reason we couldn't design something with the conservation property that follows a path towards ultra-ethicshood. For instance, if we could implement "as soon as you know your morals would change, then change them", this would give us a good part of the "conservation" law.
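A minimal sketch of the rule being proposed, assuming we model the agent's values as a plain utility function that is swapped the moment a change becomes known (the class and attribute names are made up for illustration): because the update happens as soon as it is foreseen, the agent's current values never predictably differ from its expected future values, which is the crude conservation property.

```python
# Illustrative sketch of "as soon as you know your morals would change,
# then change them". Names are hypothetical, not from any real system.

class Agent:
    def __init__(self, utility):
        self.utility = utility

    def learn_pending_change(self, new_utility):
        # Adopt the change the moment it becomes known, so there is never
        # a predictable future shift in values left to anticipate.
        self.utility = new_utility

agent = Agent(utility=lambda outcome: outcome["cake"])
# The agent learns its values are going to change, so it changes them now.
agent.learn_pending_change(lambda outcome: outcome["health"])
agent.utility({"cake": 5, "health": 2})  # evaluates by the new values: 2
```

This only captures part of the law, as the comment says: it removes the anticipated gap between current and future values, but says nothing about whether each update moves towards better ethics.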