Lumifer comments on Open thread, Oct. 10 - Oct. 16, 2016 - Less Wrong
You are viewing a comment permalink. View the original post to see all comments and the full post content.
Comments (115)
I'm saying that if you can't recognize Friendliness (and I don't think you can), trying to build a FAI is pointless as you will not be able to answer "Is it Friendly?" even when looking at it.
So if you can't build a supervised model, you think going to unsupervised learning will solve your problems? The quote I gave you is part of human values -- humans do value triumph over their enemies. Evolution taught humans to eliminate competition, it taught them to be aggressive and greedy -- all human values. Why do you think your values will be preferred by the AI to values of, say, ISIS or third-world Maoist guerrillas? They're human, too.
Why do I need to recognize Friendliness to build an FAI? I only need to know that the process used to construct it results in a friendly AI. Trying to inspect the weights of a complex neural network (or whatever) is pointless, as I stated earlier. We haven't the slightest idea how AlphaGo's net really works, but we can trust it to beat the best Go champions.
Evolution also taught humans to be cooperative, empathetic, and kind.
Really your objection seems to be the whole point of CEV. A CEV wouldn't just include the values of ISIS members, but also their victims. And it would be extrapolated, to not just be their current opinions on things, but what their opinions would be if they knew more. Their values if they had more time to think about and consider issues. With those two conditions, the negative parts of human values are entirely eliminated.
This amounts to saying "because I'm right and once everyone gets to know reality better, they'll figure out I'm right."
In reality they will also figure out the places where you are wrong, and there will be many of them.
I'm not claiming that at all. I may be wrong about many things. It's irrelevant.
It is not irrelevant. You said, "With those two conditions, the negative parts of human values are entirely eliminated." That certainly meant that things like ISIS opinions would be eliminated. I agree in that particular case, but there are many other things that you would consider negative which will not be eliminated. I can probably guess some of them, although I won't do that here.
See my other comment for more clarification on how CEV would eliminate negative values.
I read that. You say there, "Your stated example was ISIS. ISIS is so bad because they incorrectly believe... If they knew all the arguments for and against religion, then their values would be more like ours." As I said, I agree with you in that case. But you are indeed saying, "it is because I am right and when they know better they will know I was right." And that will not always be true, even if it is true in that case.
I never claimed I am right about everything. I don't need to be right about everything. I would love to have an AI show me what I am wrong about and show me the perfect set of values.
And most importantly, I'm saying that this process would result in the optimal set of values for everyone. Do you disagree?
You are still facing the same problem. Given that you can't recognize friendliness, how will you create or choose a process which will build a FAI? Would you be able to answer "Will it be friendly?" by looking at the process?
That doesn't make much sense. What do you mean by "negative" and from which point of view? If from the point of view of the AI, that's just a trivial tautology. If from the point of view of (at least some) humans, this seems to be not so.
In general, do you treat morals/values as subjective or objective? If objective, the whole "if they knew more" part is entirely unnecessary: you're discovering empirical reality, not consulting with people on what they like. And subjectivism here, of course, makes the whole idea of CEV meaningless.
Also, I see no evidence to support the view that as people know more, their morals improve, for pretty much any value of "improve".
You are literally asking me to solve the FAI problem right here and now. I understand that FAI is a very hard problem and I don't expect to solve it instantly. Just because a problem is hard, doesn't mean it can't have a solution.
First of all, let me adopt some terminology from Superintelligence. I think FAI requires solving two somewhat different problems: Value Learning and Value Loading.
You seem to think Value Learning is the hard problem, getting an AI to learn what humans actually want. I think that's the easy problem, and any intelligent AI will form a model of humans and understand what we want. Getting it to care about what we want seems like the hard problem to me.
But I do see some promising ideas to approach the problem. For instance, have AIs that predict what choices a human would make in each situation. So you basically get an AI which is just a human, but sped up a lot. Or have an AI which presents arguments for and against each choice, so that humans can make more informed choices. Then it could predict what choice a human would make after hearing all the arguments, and do that.
More complicated ideas were mentioned in Superintelligence. I like the idea of "motivational scaffolding". Somehow train an AI that can learn how the world works and can generate an "interpretable model". For example, it might be able to understand English sentences and translate their meanings to representations the AI can use. Then you can explicitly program a utility function into the AI using its learned model.
From your point of view. You gave me examples of values which you consider bad, as an argument against FAI. I'm showing you that CEV would eliminate these things.
Your stated example was ISIS. ISIS is so bad because they incorrectly believe that God is on their side and wants them to do the things they do. That the people that die will go to heaven, so loss of life isn't so bad. If they were more intelligent, informed, and rational... If they knew all the arguments for and against religion, then their values would be more like ours. They would see how bad killing people is, and that their religion is wrong.
The second thing CEV does is average everyone's values together. So even if ISIS really does value killing people, their victims value not being killed even more. So a CEV of all of humanity would still value life, even if evil people's values are included. Even if everyone was a sociopath, their CEV would still be the best compromise possible, between everyone's values.
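A toy illustration of the averaging claim above, in Python. Everything in it is an assumption made for the sake of the example: the two people, the outcome names, the preference numbers, and the use of a plain arithmetic mean. CEV is not actually specified as a simple mean, so treat this as a sketch of the argument's arithmetic and nothing more.

```python
from statistics import mean

# Hypothetical preference strengths on an arbitrary scale from
# -1 (strongly opposed) to +1 (strongly in favor). Made-up numbers.
preferences = {
    "aggressor": {"victim_lives": -0.4, "victim_dies": 0.4},
    "victim": {"victim_lives": 1.0, "victim_dies": -1.0},
}


def average_values(prefs):
    """Average each outcome's score across all people (a plain mean)."""
    outcomes = next(iter(prefs.values())).keys()
    return {o: mean(person[o] for person in prefs.values()) for o in outcomes}


def preferred_outcome(prefs):
    """Return the outcome with the highest averaged score."""
    averaged = average_values(prefs)
    return max(averaged, key=averaged.get)


if __name__ == "__main__":
    print(average_values(preferences))
    print(preferred_outcome(preferences))
```

With these numbers the victim's stronger preference outweighs the aggressor's weaker one, so the averaged result favors `victim_lives`. The result depends entirely on the assumed intensities and on how many people hold each preference, which is the part the disagreement in this thread is about.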