There is no clear bright line determining who is or is not a fundamentalist Christian.
There is no clear bright line determining what is or isn't a clear bright line.
I agree that the line separating "human" from "non-human" is much clearer and brighter than that separating "fundamentalist Christian" from "non-fundamentalist Christian", and I further agree that for minds like mine the difference between those two lines is very important. Something with a mind like mine can work with the first distinction much more easily than with the second.
So what?
A mind like mine doesn't stand a chance of extrapolating a coherent volition from the contents of a group of target minds. Whatever X is, it isn't a mind like mine.
If we don't have such an X available, then it doesn't matter what defining characteristic we use to determine the target group for CEV extrapolation, because we can't extrapolate CEV from them anyway.
If we do have such an X available, then it doesn't matter what lines are clear and bright enough for minds like mine to reliably work with; what matters is what lines are clear and bright enough for systems like X to reliably work with.
Right now, there pretty much is a clear bright line determining who is or is not human. And that clear bright line encompasses everyone we would possibly want to cooperate with.
I have confidence < .1 that either one of us can articulate a specification of who is human that doesn't include or exclude some system whose inclusion or exclusion would be contested by someone the specification itself includes.
I also have confidence < .1 that, using any definition of "human" you care to specify, the universe contains no nonhuman systems I would possibly want to cooperate with.
Your advisory board suggestion ignores the fact that we have to be able to cooperate prior to the invention of CEV deducers.
Sure, but so does your "include all humans" suggestion. We're both assuming that there's some way the AI-development team can convincingly commit to a policy P such that other people's decisions to cooperate will plausibly be based on the belief that P will actually be implemented when the time comes; we are neither of us specifying how that is actually supposed to work. Merely saying "I'll include all of humanity" isn't good enough to ensure cooperation if nobody believes me.
I have confidence that, given a mechanism for getting from someone saying "I'll include all of humanity" to everyone cooperating, I can work out a way to use the same mechanism to get from someone saying "I'll include the Advisory Board, which includes anyone with enough power that I care whether they cooperate or not" to everyone I care about cooperating.
And you're not describing a process for how the advisory board is decided either.
I said: "Then anyone with enough political clout to get in my way, I add to the Advisory Board." That seems to me as well-defined a process as "I decide to include every human being."
Different advisory boards may produce different groups of enfranchised minds.
Certainly.
So your suggestion doesn't resolve the problem.
Can you say again which problem you're referring to here? I've lost track.
In fact, I don't see how putting a group of minds on the advisory board is any different than just making them the input to the CEV.
Absolutely agreed.
Consider the implications of that, though.
Suppose you have a CEV-extractor and we're the only two people in the world, just for simplicity.
You can either point the CEV-extractor at yourself, or at both of us.
If you genuinely want me included, then it doesn't matter which you choose; the result will be the same.
Conversely, if the result is different, that's evidence that you don't genuinely want me included, even if you think you do.
Knowing that, why would you choose to point the CEV-extractor at both of us?
One reason for doing so might be that you'd precommitted to doing so (or some UDT equivalent), so as to secure my cooperation. Of course, if you can secure my cooperation without such a precommitment (say, by claiming you would point it at both of us), that's even better.
Complicated or ambiguous schemes take more time to explain, get more attention, and risk folks spending time trying to gerrymander their way in instead of contributing to FAI.
I think any solution other than "enfranchise humanity" is a potential PR disaster.
Keep in mind that not everyone is that smart, and there are some folks who would make a fuss about disenfranchisement of others even if they themselves were enfranchised (and therefore, by definition, those they were making a fuss about would be enfranchised if they thought it was a good idea)....
This is for anyone in the LessWrong community who has made at least some effort to read the sequences and follow along, but is still confused on some point, and is perhaps feeling a bit embarrassed. Here, newbies and not-so-newbies are free to ask very basic but still relevant questions with the understanding that the answers are probably somewhere in the sequences. Similarly, LessWrong tends to presume a rather high threshold for understanding science and technology. Relevant questions in those areas are welcome as well. Anyone who chooses to respond should respectfully guide the questioner to a helpful resource, and questioners should be appropriately grateful. Good faith should be presumed on both sides, unless and until it is shown to be absent. If you are not sure whether a question is relevant, ask it, and also ask whether it's relevant.