If it's worth saying, but not worth its own post, then it goes here.
Notes for future OT posters:
1. Please add the 'open_thread' tag.
2. Check if there is an active Open Thread before posting a new one. (Immediately before; refresh the list-of-threads page before posting.)
3. Open Threads should start on Monday, and end on Sunday.
4. Unflag the two options "Notify me of new top level comments on this article" and "
You are literally asking me to solve the FAI problem right here and now. I understand that FAI is a very hard problem and I don't expect to solve it instantly. Just because a problem is hard, doesn't mean it can't have a solution.
First of all, let me adopt some terminology from Superintelligence. I think FAI requires solving two somewhat different problems: Value Learning and Value Loading.
You seem to think Value Learning, getting an AI to learn what humans actually want, is the hard problem. I think that's the easy problem: any intelligent AI will form a model of humans and understand what we want. Getting it to care about what we want seems like the hard problem to me.
But I do see some promising ideas for approaching the problem. For instance, have AIs that predict what choices a human would make in each situation, so you basically get an AI which is just a human, but sped up a lot. Or have an AI which presents arguments for and against each choice, so that humans can make more informed choices. Then it could predict what choice a human would make after hearing all the arguments, and do that.
More complicated ideas were mentioned in Superintelligence. I like the idea of "motivational scaffolding". Somehow train an AI that can learn how the world works and can generate an "interpretable model", e.g. one that can understand English sentences and translate their meanings into representations the AI can use. Then you can explicitly program a utility function into the AI using its learned model.
From your point of view. You gave me examples of values which you consider bad, as an argument against FAI. I'm showing you that CEV would eliminate these things.
Your stated example was ISIS. ISIS is so bad because they incorrectly believe that God is on their side and wants them to do the things they do, and that the people who die will go to heaven, so loss of life isn't so bad. If they were more intelligent, informed, and rational, and if they knew all the arguments for and against religion, then their values would be more like ours. They would see how bad killing people is, and that their religion is wrong.
The second thing CEV does is average everyone's values together. So even if ISIS really does value killing people, their victims value not being killed even more. So a CEV of all of humanity would still value life, even if evil people's values are included. Even if everyone were a sociopath, their CEV would still be the best possible compromise between everyone's values.
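To make the averaging claim concrete, here is a toy sketch. It is not CEV itself: the agents, the two outcomes, and every number are made up for illustration, and a plain average of utilities is only an assumed stand-in for whatever aggregation CEV would really use.

```python
# Toy illustration only: agents, outcomes, and numbers are hypothetical,
# and a plain average is an assumed stand-in for CEV's aggregation.

def aggregate(utilities):
    """Average each outcome's utility across all agents."""
    outcomes = utilities[0].keys()
    n = len(utilities)
    return {o: sum(u[o] for u in utilities) / n for o in outcomes}

def best_outcome(utilities):
    """Return the outcome with the highest average utility."""
    avg = aggregate(utilities)
    return max(avg, key=avg.get)

# One aggressor mildly prefers violence; nine potential victims
# strongly prefer not being attacked.
aggressor = {"violence": 1.0, "peace": 0.0}
victim = {"violence": -10.0, "peace": 0.0}
population = [aggressor] + [victim] * 9

print(aggregate(population))
print(best_outcome(population))
```

With these made-up numbers the average favors "peace", because the victims' strong preference outweighs the aggressor's weak one. Different weights or a different aggregation rule could of course give a different answer.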
No, I'm asking you to specify it. My point is that you can't build X if you can't even recognize X.
Learning what humans want is pretty easy. However, it's an inconsistent mess which involves many things contemporary people find unsavory. Making it all coherent and formulating a (single) policy on the basis of this mess is the hard part.