Tem42 comments on Open thread, Oct. 19 - Oct. 25, 2015 - Less Wrong Discussion
I've been thinking about some of the issues with CEV. It's come up a few times that humanity might not have a coherent, non-contradictory set of values, which raises the question of how to come up with a set of values that best represents everyone.
It occurs to me that this might be a problem mathematicians have already solved, or at least given a lot of thought to, in the form of voting systems. Voting is a very similar problem: you have a bunch of people you want to represent fairly, and you need to select a leader who best represents their interests.
My favorite alternative voting system is the Condorcet method. Basically, it compares the candidates in every possible one-on-one election, and selects the candidate who would have won every single pairwise matchup.
It is possible for there to be no Condorcet winner if the population has circular preferences: candidate A beats B, B beats C, and C beats A, like rock-paper-scissors.
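Here's a rough sketch of what that looks like, assuming ballots are complete rankings (best candidate first); the candidate names and the little electorate are made up to produce exactly this kind of cycle:

```python
from itertools import combinations

def pairwise_wins(ballots, candidates):
    """For each ordered pair (a, b), count voters who rank a above b."""
    wins = {(a, b): 0 for a in candidates for b in candidates if a != b}
    for ballot in ballots:
        rank = {c: i for i, c in enumerate(ballot)}
        for a, b in combinations(candidates, 2):
            if rank[a] < rank[b]:
                wins[(a, b)] += 1
            else:
                wins[(b, a)] += 1
    return wins

def condorcet_winner(ballots, candidates):
    """Return the candidate who beats every rival head-to-head, or None."""
    wins = pairwise_wins(ballots, candidates)
    for a in candidates:
        if all(wins[(a, b)] > wins[(b, a)] for b in candidates if b != a):
            return a
    return None  # circular preferences: no Condorcet winner exists

# Rock-paper-scissors electorate: A beats B 7-3, B beats C 7-3, C beats A 6-4.
ballots = [("A", "B", "C")] * 4 + [("B", "C", "A")] * 3 + [("C", "A", "B")] * 3
print(condorcet_winner(ballots, ["A", "B", "C"]))  # None
```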
To handle this, a number of methods have been developed to select the best compromise. My favorite is Minimax: it selects the candidate whose worst pairwise defeat is the least bad. I think that's the most desirable way to pick a winner, and it's also super simple.
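Continuing the sketch above (reusing pairwise_wins and the rock-paper-scissors ballots), one common variant of Minimax scores each candidate by the largest vote total any opponent gets against them, then picks the candidate with the smallest such score:

```python
def minimax_winner(ballots, candidates):
    """Pick the candidate whose worst head-to-head defeat is smallest."""
    wins = pairwise_wins(ballots, candidates)
    def worst_defeat(a):
        # Largest opposing vote count in any head-to-head against a.
        return max(wins[(b, a)] for b in candidates if b != a)
    return min(candidates, key=worst_defeat)

# Even with the cycle above, Minimax picks a compromise: A's worst loss
# (6-4 to C) is milder than B's or C's (each loses a matchup 7-3).
print(minimax_winner(ballots, ["A", "B", "C"]))  # "A"
```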
There are some differences. Instead of a leader, we want the best set of values and policies for the AI to follow. There might not be a finite set of candidates, but an infinite number of possibilities. And actual voting might be impractical; instead, an AI might have to predict what you would have voted for, if you knew all the arguments and had plenty of time to think about them and come to a conclusion. But I think it can still be modeled as a voting problem.
This isn't actually something we need to figure out now, though. If we somehow had an FAI, we could probably just ask it to come up with the fairest way of representing everyone's values. We probably don't need to hardcode these details.
The bigger issue is why the person or group building the FAI would even bother to do this. They could just take their own CEV and ignore everyone else's, and they have every incentive to do so. It might even be significantly simpler than trying to compute a full CEV of humanity. So even if we do solve FAI, humanity is probably still screwed.
EDIT: After giving it some more thought, I'm not sure voting systems are actually what we want here. The whole point of voting systems is that people can't be trusted to honestly specify their utility functions. The "perfect" voting system would be for each person to give each candidate a number based on how much utility they'd get from that candidate being elected, but that's extremely susceptible to tactical voting.
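A toy illustration with made-up utilities: voters score each candidate 0-10 and the highest total wins. A strongly motivated minority can flip the outcome just by exaggerating its scores, even though nobody's actual preferences changed:

```python
def winner(ballots):
    """Sum each candidate's scores across ballots; highest total wins."""
    totals = {}
    for ballot in ballots:
        for candidate, score in ballot.items():
            totals[candidate] = totals.get(candidate, 0) + score
    return max(totals, key=totals.get), totals

# Honest utilities: a 3-voter majority likes A a bit more than B;
# a 2-voter minority strongly prefers B.
honest = [{"A": 10, "B": 6}] * 3 + [{"A": 7, "B": 10}] * 2
print(winner(honest))    # ('A', {'A': 44, 'B': 38})

# The minority exaggerates, giving A the minimum score. The outcome flips.
tactical = [{"A": 10, "B": 6}] * 3 + [{"A": 0, "B": 10}] * 2
print(winner(tactical))  # ('B', {'A': 30, 'B': 38})
```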
However, with FAI it's possible we could come up with some way of keeping people honest, or of peering into their brains and reading off their true value functions. That adds a great deal of complexity, though, and it requires trusting the AI to do a complex, arbitrary, and subjective task, which means you must have already solved FAI.
If I were God of the World, I would model the problem as more of a River Crossing Puzzle: how do you get things moving along when everyone on the boat wants to kill each other? Segregation! Resettling humanity mapped over a giant Venn diagram is trivial once we are all uploaded, but it also runs into ethical problems; just as voting and enacting the will of the majority (or some version thereof) is problematic, so is setting up the world so that the oppressor and the oppressed will never be allowed to meet. However, in my experience people are much happier with rules like "you can't go there" and much less happy with rules like "you have to do what that guy wants". This is probably due to our longstanding tradition of private property.
This makes some assumptions as to what the next world will look like, but I think that it is a likely outcome -- it is always much easier to send the kids to their rooms than to hold a family court, and I think a cost/benefit analysis would almost surely show that it is not worth trying to sort out all human problems as one big happy group.
Of course, this assumes that we don't do something crazy like include democracy and unity of the human race as terminal values.
This puts me in mind of Eliezer's "Failed Utopia #4-2".