John_Maxwell_IV comments on Stupid Questions Open Thread - Less Wrong

Post author: Costanza, 29 December 2011 11:23PM (42 points)


Comments (265)


Comment author: MileyCyrus 30 December 2011 12:35:49AM 14 points

If the SIAI engineers figure out how to construct friendly super-AI, why would they care about making it respect the values of anyone but themselves? What incentive do they have to program an AI that is friendly to humanity, and not just to themselves? What's stopping LukeProg from appointing himself king of the universe?

Comment author: John_Maxwell_IV 26 March 2012 12:25:01AM 0 points

Now that I understand your question better, here's my answer:

Let's say the engineers decide to make the AI respect only their values. But if they were the sort of people who were likely to do that, no one would donate money to them. They could offer to make the AI respect the values of themselves and their donors, but that would alienate everyone else and make the lives of themselves and their donors difficult. The species boundary between humans and other living beings is a natural place to stop expanding the circle of enfranchised agents.

Comment author: TheOtherDave 26 March 2012 12:51:42AM 0 points

This seems to depend on the implicit assumption that their donors (and everyone else powerful enough to make their lives difficult) don't mind having the values of third parties respected.

If some do mind, then there's probably some optimally pragmatic balancing point short of all humans.

Comment author: John_Maxwell_IV 26 March 2012 01:37:06AM 0 points

Probably, but defining that balancing point would mean a lot of bureaucratic overhead to determine whom to include or exclude.

Comment author: TheOtherDave 26 March 2012 03:19:22AM 0 points

Can you expand on what you mean by "bureaucratic" here?

Comment author: John_Maxwell_IV 26 March 2012 03:32:31AM 0 points

Are people going to vote on whether someone should be included? Is there an appeals process? Are all decisions final?

Comment author: TheOtherDave 26 March 2012 01:01:00PM 1 point

OK, thanks.

It seems to me all these questions arise for "include everyone" as well. Somewhere along the line someone is going to suggest "don't include fundamentalist Christians", for example, and if I'm committed to the kind of democratic decision process you imply, then we now need to have a vote, or at least decide whether we have a vote, etc., etc.: all of that bureaucratic overhead.

Of course, that might not be necessary; I could just unilaterally override that suggestion, mandate "No, we include everyone!", and if I have enough clout to make that stick, then it sticks, with no bureaucratic overhead. Yay! This seems to more or less be what you have in mind.

It's just that the same goes for "Include everyone except fundamentalist Christians."

In any case, I don't see how any of this cumbersome democratic machinery makes any sense in this scenario. Actually working out CEV implies the existence of something, call it X, that is capable of extrapolating a coherent volition from the state of a group of minds. What's the point of voting, appeals, etc. when that technology is available? X itself is a better solution to the same problem.

Which implies that it's possible to identify a smaller group of minds as the Advisory Board and say to X "Work out the Advisory Board's CEV with respect to whose minds should be included as input to a general-purpose optimizer's target definition, then work out the CEV of those minds with respect to the desired state of the world."
Then anyone with enough political clout to get in my way, I add to the Advisory Board, thereby ensuring that their values get taken into consideration (including their values regarding whose values get included).

That includes folks who think everyone should get an equal say, folks who think that every human should get an equal say, folks who think that everyone with more than a certain threshold level of intelligence and moral capacity get a say, folks who think that everyone who agrees with them get a say, etc., etc. X works all of that out, and spits out a spec on the other side for who actually gets a say and to what degree, which it then takes as input to the actual CEV-extrapolating process.

This seems kind of absurd to me, but no more so than the idea that X can work out humanity's CEV at all. If I'm granting that premise for the sake of argument, everything else seems to follow.
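The two-stage scheme above can be made concrete with a toy sketch. Everything here is hypothetical: the `extrapolate` function is a deliberately crude stand-in (simple majority over stated preferences) for the system X, whose actual workings nobody knows how to implement, and the dict-based "minds" are invented for illustration.

```python
# Toy sketch of the two-stage "Advisory Board" scheme described above.
# `extrapolate` is a crude stand-in for the hypothetical system X;
# real CEV extraction is an unsolved problem.
from collections import Counter

def extrapolate(minds, key):
    """Stand-in for X: aggregate each mind's stated preferences under
    `key` by simple majority. Real volition-extrapolation would be
    nothing like this; it only makes the two-stage flow concrete."""
    votes = Counter()
    for mind in minds:
        for item in mind[key]:
            votes[item] += 1
    majority = len(minds) / 2
    return {item for item, n in votes.items() if n > majority}

def two_stage_cev(board, population):
    # Stage 1: the board's (stand-in) CEV over who gets enfranchised
    # as input to the optimizer's target definition.
    enfranchised_names = extrapolate(board, "include")
    enfranchised = [m for m in population if m["name"] in enfranchised_names]
    # Stage 2: the enfranchised group's (stand-in) CEV over the
    # desired state of the world.
    return extrapolate(enfranchised, "world")
```

Note the design point the thread later raises: stage 1's output is simply fed back in as stage 2's input, so putting a mind on the board is only distinguishable from enfranchising it directly if the board's extrapolated wishes differ from its members' stated ones.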

Comment author: John_Maxwell_IV 26 March 2012 06:31:56PM 1 point

It's just that the same goes for "Include everyone except fundamentalist Christians."

There is no clear bright line determining who is or is not a fundamentalist Christian. Right now, there pretty much is a clear bright line determining who is or is not human. And that clear bright line encompasses everyone we would possibly want to cooperate with.

Your advisory board suggestion ignores the fact that we have to be able to cooperate prior to the invention of CEV deducers.

And you're not describing a process for how the advisory board is decided either. Different advisory boards may produce different groups of enfranchised minds. So your suggestion doesn't resolve the problem.

In fact, I don't see how putting a group of minds on the advisory board is any different than just making them the input to the CEV. If a person's CEV is that someone's mind should contribute to the optimizer's target, that will be their CEV regardless of whether it's measured in an advisory board context or not.

Comment author: TheOtherDave 26 March 2012 11:02:11PM 3 points

There is no clear bright line determining who is or is not a fundamentalist Christian.

There is no clear bright line determining what is or isn't a clear bright line.

I agree that the line separating "human" from "non-human" is much clearer and brighter than that separating "fundamentalist Christian" from "non-fundamentalist Christian", and I further agree that for minds like mine the difference between those two lines is very important. Something with a mind like mine can work with the first distinction much more easily than with the second.

So what?

A mind like mine doesn't stand a chance of extrapolating a coherent volition from the contents of a group of target minds. Whatever X is, it isn't a mind like mine.

If we don't have such an X available, then it doesn't matter what defining characteristic we use to determine the target group for CEV extrapolation, because we can't extrapolate CEV from them anyway.

If we do have such an X available, then it doesn't matter what lines are clear and bright enough for minds like mine to reliably work with; what matters is what lines are clear and bright enough for systems like X to reliably work with.

Right now, there pretty much is a clear bright line determining who is or is not human. And that clear bright line encompasses everyone we would possibly want to cooperate with.

I have confidence < .1 that either one of us can articulate a specification of who is human that doesn't include or exclude some system whose inclusion or exclusion would be contested by someone the specification covers.

I also have confidence < .1 that, using any definition of "human" you care to specify, the universe contains no nonhuman systems I would possibly want to cooperate with.

Your advisory board suggestion ignores the fact that we have to be able to cooperate prior to the invention of CEV deducers.

Sure, but so does your "include all humans" suggestion. We're both assuming that there's some way the AI-development team can convincingly commit to a policy P such that other people's decisions to cooperate will plausibly be based on the belief that P will actually be implemented when the time comes; we are neither of us specifying how that is actually supposed to work. Merely saying "I'll include all of humanity" isn't good enough to ensure cooperation if nobody believes me.

I have confidence that, given a mechanism for getting from someone saying "I'll include all of humanity" to everyone cooperating, I can work out a way to use the same mechanism to get from someone saying "I'll include the Advisory Board, which includes anyone with enough power that I care whether they cooperate or not" to everyone I care about cooperating.

And you're not describing a process for how the advisory board is decided either.

I said: "Then anyone with enough political clout to get in my way, I add to the Advisory Board." That seems to me as well-defined a process as "I decide to include every human being."

Different advisory boards may produce different groups of enfranchised minds.

Certainly.

So your suggestion doesn't resolve the problem.

Can you say again which problem you're referring to here? I've lost track.

In fact, I don't see how putting a group of minds on the advisory board is any different than just making them the input to the CEV.

Absolutely agreed.

Consider the implications of that, though.

Suppose you have a CEV-extractor and we're the only two people in the world, just for simplicity.
You can either point the CEV-extractor at yourself, or at both of us.
If you genuinely want me included, then it doesn't matter which you choose; the result will be the same.
Conversely, if the result is different, that's evidence that you don't genuinely want me included, even if you think you do.

Knowing that, why would you choose to point the CEV-extractor at both of us?

One reason for doing so might be that you'd precommitted to doing so (or some UDT equivalent), so as to secure my cooperation. Of course, if you can secure my cooperation without such a precommitment (say, by claiming you would point it at both of us), that's even better.

Comment author: John_Maxwell_IV 27 March 2012 12:44:57AM 1 point

Complicated or ambiguous schemes take more time to explain, get more attention, and risk folks spending time trying to gerrymander their way in instead of contributing to FAI.

I think any solution other than "enfranchise humanity" is a potential PR disaster.

Keep in mind that not everyone is that smart, and there are some folks who would make a fuss about the disenfranchisement of others even if they themselves were enfranchised (and therefore, by definition, the people they were fussing about would effectively be enfranchised anyway, since the fuss-makers' own values count).

I agree there are potential ambiguity problems with drawing the line at humans, but I think the potential problems are bigger with other schemes.

Sure, but so does your "include all humans" suggestion. We're both assuming that there's some way the AI-development team can convincingly commit to a policy P such that other people's decisions to cooperate will plausibly be based on the belief that P will actually be implemented when the time comes; we are neither of us specifying how that is actually supposed to work. Merely saying "I'll include all of humanity" isn't good enough to ensure cooperation if nobody believes me.

I agree there are potential problems with credibility, but that seems like a separate argument.

I have confidence that, given a mechanism for getting from someone saying "I'll include all of humanity" to everyone cooperating, I can work out a way to use the same mechanism to get from someone saying "I'll include the Advisory Board, which includes anyone with enough power that I care whether they cooperate or not" to everyone I care about cooperating.

It's not all or nothing. The more inclusive the enfranchisement, the more cooperation there will be in general.

I said: "Then anyone with enough political clout to get in my way, I add to the Advisory Board." That seems to me as well-defined a process as "I decide to include every human being."

With that scheme, you're incentivizing folks to prove they have enough political clout to get in your way.

Moreover, humans aren't perfect reasoning systems. Your way of determining enfranchisement sounds a lot more adversarial than mine, which would affect the tone of the effort in a big and undesirable way.

Why do you think that the right to vote in democratic countries is as clearly determined as it is? Restricting voting rights to those of a certain IQ or higher would be a politically unfeasible PR nightmare.

One reason for doing so might be that you'd precommitted to doing so (or some UDT equivalent), so as to secure my cooperation. Of course, if you can secure my cooperation without such a precommitment (say, by claiming you would point it at both of us), that's even better.

Again, this is a different argument about why people cooperate instead of defect. To a large degree, evolution hardwired us to cooperate, especially when others are trying to cooperate with us.

I agree that if the FAI project seems to be staffed with a lot of untrustworthy, selfish backstabbers, we should cast a suspicious eye on it regardless of what they say about their project.

Ultimately it probably doesn't matter much what their broadcasted intention towards the enfranchisement of those outside their group is, since things will largely come down to what their actual intentions are.

Comment author: kodos96 01 January 2013 12:32:35AM 1 point

There is no clear bright line determining who is or is not a fundamentalist Christian. Right now, there pretty much is a clear bright line determining who is or is not human.

Is there? What about unborn babies? What about IVF embryos? People in comas? Cryo-preserved bodies? Sufficiently detailed brain scans?