John_Maxwell_IV comments on Stupid Questions Open Thread - Less Wrong Discussion
If the SIAI engineers figure out how to construct friendly super-AI, why would they care about making it respect the values of anyone but themselves? What incentive do they have to program an AI that is friendly to humanity, and not just to themselves? What's stopping LukeProg from appointing himself king of the universe?
Now that I understand your question better, here's my answer:
Let's say the engineers decide to make the AI respect only their values. But if they were the sort of people likely to do that, no one would donate money to them. They could offer to make the AI respect the values of themselves and their donors, but that would alienate everyone else and make life difficult for themselves and their donors. The species boundary between humans and other living beings is a natural place to stop expanding the circle of enfranchised agents.
This seems to depend on the implicit assumption that their donors (and everyone else powerful enough to make their lives difficult) don't mind having the values of third parties respected.
If some do mind, then there's probably some optimally pragmatic balancing point short of all humans.
Probably, but defining that balancing point would mean a lot of bureaucratic overhead to determine whom to include or exclude.
Can you expand on what you mean by "bureaucratic" here?
Are people going to vote on whether someone should be included? Is there an appeals process? Are all decisions final?
OK, thanks.
It seems to me all these questions arise for "include everyone" as well. Somewhere along the line someone is going to suggest "don't include fundamentalist Christians", for example, and if I'm committed to the kind of democratic decision process you imply, then we now need to have a vote, or at least decide whether we have a vote, etc., etc., all of that bureaucratic overhead.
Of course, that might not be necessary; I could just unilaterally override that suggestion, mandate "No, we include everyone!", and if I have enough clout to make that stick, then it sticks, with no bureaucratic overhead. Yay! This seems to be more or less what you have in mind.
It's just that the same goes for "Include everyone except fundamentalist Christians."
In any case, I don't see how any of this cumbersome democratic machinery makes any sense in this scenario. Actually working out CEV implies the existence of something, call it X, that is capable of extrapolating a coherent volition from the state of a group of minds. What's the point of voting, appeals, etc. when that technology is available? X itself is a better solution to the same problem.
Which implies that it's possible to identify a smaller group of minds as the Advisory Board and say to X "Work out the Advisory Board's CEV with respect to whose minds should be included as input to a general-purpose optimizer's target definition, then work out the CEV of those minds with respect to the desired state of the world."
Then anyone with enough political clout to get in my way, I add to the Advisory Board, thereby ensuring that their values get taken into consideration (including their values regarding whose values get included).
That includes folks who think everyone should get an equal say, folks who think that every human should get an equal say, folks who think that everyone with more than a certain threshold level of intelligence and moral capacity get a say, folks who think that everyone who agrees with them get a say, etc., etc. X works all of that out, and spits out a spec on the other side for who actually gets a say and to what degree, which it then takes as input to the actual CEV-extrapolating process.
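(To make the two-stage structure concrete, here's a toy sketch in Python. None of these names come from the CEV literature; `cev` is just a stand-in for the hypothetical capability X, assumed to exist for the sake of argument, and the question strings are illustrative.)

```python
def cev(minds, question):
    """Hypothetical stand-in for X: extrapolate the coherent volition
    of `minds` with respect to `question`. Assumed, not specified."""
    raise NotImplementedError("this is the part nobody knows how to build")

def two_stage_cev(advisory_board, world_question):
    # Stage 1: the Advisory Board's extrapolated volition settles whose
    # minds count as input to the optimizer, and with what weight.
    enfranchised = cev(
        advisory_board,
        "whose volition should be input to the optimizer, and how weighted?")
    # Stage 2: extrapolate the volition of that group about the world itself.
    return cev(enfranchised, world_question)
```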
This seems kind of absurd to me, but no more so than the idea that X can work out humanity's CEV at all. If I'm granting that premise for the sake of argument, everything else seems to follow.
There is no clear bright line determining who is or is not a fundamentalist Christian. Right now, there pretty much is a clear bright line determining who is or is not human. And that clear bright line encompasses everyone we would possibly want to cooperate with.
Your advisory board suggestion ignores the fact that we have to be able to cooperate prior to the invention of CEV deducers.
And you're not describing a process for how the advisory board is decided either. Different advisory boards may produce different groups of enfranchised minds. So your suggestion doesn't resolve the problem.
In fact, I don't see how putting a group of minds on the advisory board is any different than just making them the input to the CEV. If a person's CEV is that someone's mind should contribute to the optimizer's target, that will be their CEV regardless of whether it's measured in an advisory board context or not.
There is no clear bright line determining what is or isn't a clear bright line.
I agree that the line separating "human" from "non-human" is much clearer and brighter than that separating "fundamentalist Christian" from "non-fundamentalist Christian", and I further agree that for minds like mine the difference between those two lines is very important. Something with a mind like mine can work with the first distinction much more easily than with the second.
So what?
A mind like mine doesn't stand a chance of extrapolating a coherent volition from the contents of a group of target minds. Whatever X is, it isn't a mind like mine.
If we don't have such an X available, then it doesn't matter what defining characteristic we use to determine the target group for CEV extrapolation, because we can't extrapolate CEV from them anyway.
If we do have such an X available, then it doesn't matter what lines are clear and bright enough for minds like mine to reliably work with; what matters is what lines are clear and bright enough for systems like X to reliably work with.
I have confidence < .1 that either of us can articulate a specification of who is human without including or excluding some system whose inclusion or exclusion someone covered by that specification would contest.
I also have confidence < .1 that, using any definition of "human" you care to specify, the universe contains no nonhuman systems I would possibly want to cooperate with.
Sure, but so does your "include all humans" suggestion. We're both assuming that there's some way the AI-development team can convincingly commit to a policy P such that other people's decisions to cooperate will plausibly be based on the belief that P will actually be implemented when the time comes; we are neither of us specifying how that is actually supposed to work. Merely saying "I'll include all of humanity" isn't good enough to ensure cooperation if nobody believes me.
I have confidence that, given a mechanism for getting from someone saying "I'll include all of humanity" to everyone cooperating, I can work out a way to use the same mechanism to get from someone saying "I'll include the Advisory Board, which includes anyone with enough power that I care whether they cooperate or not" to everyone I care about cooperating.
I said: "Then anyone with enough political clout to get in my way, I add to the Advisory Board." That seems to me as well-defined a process as "I decide to include every human being."
Certainly.
Can you say again which problem you're referring to here? I've lost track.
Absolutely agreed.
Consider the implications of that, though.
Suppose you have a CEV-extractor and we're the only two people in the world, just for simplicity.
You can either point the CEV-extractor at yourself, or at both of us.
If you genuinely want me included, then it doesn't matter which you choose; the result will be the same.
Conversely, if the result is different, that's evidence that you don't genuinely want me included, even if you think you do.
Knowing that, why would you choose to point the CEV-extractor at both of us?
One reason for doing so might be that you'd precommitted to doing so (or some UDT equivalent), so as to secure my cooperation. Of course, if you can secure my cooperation without such a precommitment (say, by claiming you would point it at both of us), that's even better.
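(The invariance claim can be put as a toy test, using the same hypothetical `cev` stand-in as before; `you` and `me` are just labels for the two minds.)

```python
def inclusion_is_redundant(cev, you, me, question):
    """True iff pointing the extractor at both of us yields the same
    answer as pointing it at you alone. A False result is evidence
    that your extrapolated volition does not genuinely include me,
    whatever you believe about your own motives."""
    return cev({you}, question) == cev({you, me}, question)
```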
Complicated or ambiguous schemes take more time to explain, get more attention, and risk folks spending time trying to gerrymander their way in instead of contributing to FAI.
I think any solution other than "enfranchise humanity" is a potential PR disaster.
Keep in mind that not everyone is that smart, and there are some folks who would make a fuss about the disenfranchisement of others even if they themselves were enfranchised (and since those folks would be enfranchised, the people they were making a fuss about would end up enfranchised anyway, if the fuss-makers thought that was a good idea).
I agree there are potential ambiguity problems with drawing the line at humans, but I think the potential problems are bigger with other schemes.
I agree there are potential problems with credibility, but that seems like a separate argument.
It's not all or nothing. The more inclusive the enfranchisement, the more cooperation there will be in general.
With that scheme, you're incentivizing folks to prove they have enough political clout to get in your way.
Moreover, humans aren't perfect reasoning systems. Your way of determining enfranchisement sounds a lot more adversarial than mine, which would affect the tone of the effort in a big and undesirable way.
Why do you think that the right to vote in democratic countries is as clearly determined as it is? Restricting voting rights to those of a certain IQ or higher would be a politically unfeasible PR nightmare.
Again, this is a different argument about why people cooperate instead of defect. To a large degree, evolution hardwired us to cooperate, especially when others are trying to cooperate with us.
I agree that if the FAI project seems to be staffed with a lot of untrustworthy, selfish backstabbers, we should cast a suspicious eye on it regardless of what they say about their project.
Ultimately it probably doesn't matter much what their broadcasted intention towards the enfranchisement of those outside their group is, since things will largely come down to what their actual intentions are.
Is there? What about unborn babies? What about IVF fetuses? People in comas? Cryo-preserved bodies? Sufficiently detailed brain scans?