
IlyaShpitser comments on [link] New essay summarizing some of my latest thoughts on AI safety - Less Wrong Discussion

14 Post author: Kaj_Sotala 01 November 2015 08:07AM




Comment author: Wei_Dai 02 November 2015 07:12:25AM *  7 points [-]

Few people, when learning their values in childhood, ended up considering examples such as this one and explicitly learning that they were wrong. Yet the persuasive power of that example comes from most people instantly rejecting the desirability of the dopamine drip scenario when it's suggested to them.

I for one don't "instantly reject" the desirability of this scenario. I think whether the dopamine drip is desirable is a difficult philosophical problem. My worry is that the AI will either not be as uncertain as I am about it, or not handle or resolve the normative uncertainty in the same way as I would or should.

Today's machine learning algorithms tend to be unreasonably certain (and wrong) about inputs very different from their training data, but that is perhaps just due to machine learning researchers currently focusing mostly on commercial settings where inputs are rarely very different from training data, and there aren't terrible consequences for getting things wrong. So maybe we can expect this to improve in the future as researchers start to focus more on safety.
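The overconfidence problem above can be seen even in the simplest models. As a toy illustration (not from the thread): a linear classifier with a sigmoid output becomes arbitrarily confident as an input moves far from the training region, because the decision function grows without bound in that direction.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Toy linear classifier: weights nominally learned from training data
# clustered near the origin.
w, b = np.array([2.0, 0.0]), 0.0

# An in-distribution point vs. a point far outside the training region.
x_in = np.array([0.5, 0.0])
x_far = np.array([50.0, 0.0])

p_in = sigmoid(w @ x_in + b)    # moderate confidence
p_far = sigmoid(w @ x_far + b)  # essentially certain, despite the model
                                # never having seen anything like x_far
print(p_in, p_far)
```

Nothing in the training objective penalizes this: the model is only ever scored on inputs resembling its training data, which is the commercial-setting point made above.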

But even if we manage to build an AI that is properly uncertain about whether something like the dopamine drip scenario is good or bad, how do we get it to resolve its uncertainty in the right way, especially if its creators/owners are also uncertain or possibly wrong so it can't just ask? Resolving the uncertainty incorrectly or getting the uncertainty permanently frozen into its utility function seem to be two big risks here. So I worry just as much about the reverse maverick nanny scenario, where we eventually, after centuries of philosophical progress, figure out that we actually do want to be put on dopamine drips, but the AI says "Sorry, I can't let you do that."

Comment author: IlyaShpitser 02 November 2015 03:14:18PM *  3 points [-]

Today's machine learning algorithms tend to be unreasonably certain (and wrong) about inputs very different from their training data

Read about covariate shift. (More generally, ML people are now getting into systematic biases, including causal inference, in a big way.)


This has little to do with AGI, though.