TheOtherDave comments on Holden's Objection 1: Friendliness is dangerous - Less Wrong

Post author: PhilGoetz 18 May 2012 12:48AM




Comment author: TheOtherDave 26 May 2012 08:37:52PM 0 points

Perfect knowledge is wonderful, sure, but was not key to my point.

Given two processes for making some decision, if process P1 is more reliable than process P2, then P1 will get me better results. That's true even if P1 is imperfect. That's true even if P2 is "ask my own brain and do what it tells me." All that is required is that P1 is more reliable than P2.

It follows that when choosing between two processes to implement my values, if I can ask one question, I should ask which process is more reliable. I should not ask which process is perfect, nor ask which process resides in my brain.
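The reliability argument above can be made concrete with a toy Monte Carlo sketch. This is not from the original discussion; the noise levels and the utility function are illustrative assumptions — two decision processes are modeled as noisy estimators of a true value, and the less noisy ("more reliable") one gets better results on average despite being imperfect:

```python
import random

def decide(true_value, noise):
    """One run of a decision process: estimate the true value with
    some noise, then act on the estimate. Utility is higher the
    closer the estimate lands to the truth (0 would be perfect)."""
    estimate = true_value + random.gauss(0, noise)
    return -abs(estimate - true_value)

def average_utility(noise, trials=100_000):
    """Average utility of a process with the given noise level."""
    random.seed(0)  # same draws for both processes, scaled by noise
    return sum(decide(10.0, noise) for _ in range(trials)) / trials

# P1 is imperfect (noise > 0) but more reliable than P2 ("ask my own brain").
p1 = average_utility(noise=1.0)
p2 = average_utility(noise=3.0)
assert p1 > p2  # the more reliable process wins on average, despite imperfection
```

Neither process is perfect here; all that matters for the comparison is that P1's noise is smaller than P2's.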

ETA: I endorse providing expert opinion, even though that deprives people of the experience of figuring it all out for themselves... agreed that far. But I also endorse providing reliable infrastructure, even though that deprives people of the experience of building all the infrastructure themselves, and I endorse implementing reliable decision matrices, even though that deprives people of the experience of making all the decisions themselves.

Comment author: DanArmak 27 May 2012 01:21:08PM 0 points

There's no reason you have to choose just once, a single process to answer all kinds of questions. Different processes better fit different domains. Expert opinion best fits well-understood, factual, objective, non-politicized, amoral questions. Noninterference best fits matters where people are likely to want to interfere in others' decisions and there is no pre-CEV consensus on whether such intervention is permissible.

The problem with making decisions for others isn't that it deprives them of the experience of making decisions, but that it can influence or force them into decisions that are wrong in some sense of the word.

Comment author: TheOtherDave 27 May 2012 02:33:00PM 0 points

(shrug) Letting others make decisions for themselves can also influence or force them into decisions that are wrong in some sense of the word. If that's really the problem, then letting people make their own decisions doesn't solve it. The solution to that problem is letting whatever process is best at avoiding wrong answers make the decision.

And, sure, there might be different processes for different questions. But there's no a priori reason to believe that any of those processes reside in my brain.

Comment author: DanArmak 27 May 2012 03:28:20PM 0 points

Letting others make decisions for themselves can also influence or force them into decisions that are wrong in some sense of the word.

True. Nonintervention only works if you value it more than you disvalue anything people might do as a result of it. Which is why a system of constraints that is given to the AI and is not CEV-derived can't be just nonintervention; it has to include other principles as well and be a complete ethical system.

And, sure, there might be different processes for different questions. But there's no a priori reason to believe that any of those processes reside in my brain.

I'm always open to suggestions of new processes. I just don't like the specific process of CEV, which happens not to reside in my brain, but that's not why I dislike it.

Comment author: TheOtherDave 27 May 2012 03:46:50PM 3 points

Ah, OK.

At the beginning of this thread you seemed to be saying that your current preferences (which are, of course, the product of a computation that resides in your brain) were the best determiner of what to optimize the environment for. If you aren't saying that, but merely saying that there's something specific about CEV that makes it an even worse choice, well, OK. I mean, I'm puzzled by that simply because there doesn't seem to be anything specific about CEV that one could object to in that way, but I don't have much to say about that; it was the idea that the output of your current algorithms are somehow more reliable than the output of some other set of algorithms implemented on a different substrate that I was challenging.

Sounds like a good place to end this thread, then.

Comment author: wedrifid 27 May 2012 04:31:38PM 2 points

I'm puzzled by that simply because there doesn't seem to be anything specific about CEV that one could object to in that way

Really? What about the "some people are Jerks" objection? That's kind of a big deal. We even got Eliezer to tentatively acknowledge the theoretical possibility that that could be objectionable at one point.

Comment author: TheOtherDave 27 May 2012 05:11:55PM 0 points

(nods) Yeah, I was sloppy. I was referring to the mechanism for extrapolating a coherent volition from a given target, rather than the specification of the target (e.g., "all of humanity") or other aspects of the CEV proposal, but I wasn't at all clear about that. Point taken, and agreed that there are some aspects of the proposal (e.g. target specification) that are specific enough to object to.

Tangentially, I consider the "some people are jerks" objection very confused. But then, I mostly conclude that if such a mechanism can exist at all, the properties of people are about as relevant to its output as the properties of states or political parties. More thoughts along those lines here.

Comment author: wedrifid 27 May 2012 08:04:04PM 1 point

I was referring to the mechanism for extrapolating a coherent volition from a given target

It really is hard to find a fault with that part!

Tangentially, I consider the "some people are jerks" objection very confused.

I don't understand. If you take the CEV of a group that consists of yourself and ten agents with values that differ irreconcilably from yours, then we can expect that CEV to be fairly abhorrent to you. That is, roughly speaking, a risk you take when you substitute your own preferences for preferences calculated off a group that you don't fully understand or have strong reason to trust.

That CEV would also be strictly inferior to CEV&lt;Just Dave&gt;, which would implicitly incorporate the extrapolated preferences of the other ten agents to precisely the degree that you would want it to do so.

Comment author: TheOtherDave 27 May 2012 09:58:29PM 0 points

I agree that if there exists a group G of agents A1..An with irreconcilably heterogeneous values, a given agent A should strictly prefer CEV(A) to CEV(G). If Dave is an agent in this model, then Dave should prefer CEV(Dave) to CEV(group), for the reasons you suggest. Absolutely agreed.

What I question is the assumption that in this model Dave is better represented as an agent and not a group. In fact, I find that assumption unlikely, as I noted above. (Ditto wedrifid, or any other person.)

If Dave is a group, then CEV(Dave) is potentially problematic for the same reason that CEV(group) is problematic... every agent composing Dave should prefer that most of Dave not be included in the target definition. Indeed, if group contains Dave and Dave contains an agent A1, it isn't even clear that A1 should prefer CEV(Dave) to CEV(group)... while CEV(Dave) cannot be more heterogeneous than CEV(group), it might turn out that a larger fraction (by whatever measure the volition-extrapolator cares about) of group supports A1's values than the fraction of Dave that supports them.

If the above describes the actual situation, then whether Dave is a jerk or not (or wedrifid is, or whoever) is no more relevant to the output of the volition-extrapolation mechanism than whether New Jersey is a jerk, or whether the Green Party is a jerk... all of these entities are just more-or-less-transient aggregates of agents, and the proper level of analysis is the agent.
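The fraction argument two paragraphs up can be made concrete with a toy numeric example. The counts below are purely illustrative assumptions, not from the discussion — they just show how a subagent's values can be better represented in the larger group than within the person containing it:

```python
# Illustrative assumption: Dave decomposes into 3 subagents, only one
# of which (A1) holds value V; the larger group contains 30 subagents,
# 15 of which hold value V.
dave_total, dave_supporting = 3, 1
group_total, group_supporting = 30, 15

frac_in_dave = dave_supporting / dave_total    # ~0.33 of Dave supports V
frac_in_group = group_supporting / group_total  # 0.50 of the group supports V

# A1's values get more support in the group than within Dave, so A1
# might prefer CEV(group) to CEV(Dave).
assert frac_in_group > frac_in_dave
```

With numbers like these, the intuition that a person always prefers their "own" CEV breaks down once the person is itself treated as an aggregate.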

Comment author: wedrifid 27 May 2012 10:18:52PM 0 points

Approximately agree.

If Dave is a group, then CEV(Dave) is potentially problematic for the same reason that CEV(group) is problematic... every agent composing Dave should prefer that most of Dave not be included in the target definition. Indeed, if group contains Dave and Dave contains an agent A1, it isn't even clear that A1 should prefer CEV(Dave) to CEV(group)... while CEV(Dave) cannot be more heterogeneous than CEV(group), it might turn out that a larger fraction (by whatever measure the volition-extrapolator cares about) of group supports A1's values than the fraction of Dave that supports them.

This is related to why I'm a bit uncomfortable accepting the sometimes expressed assertion "CEV only applies to a group, if you are doing it to an individual it's just Extrapolated Volition". The "make it stop being incoherent!" part applies just as much to conflicting and inconsistent values within a messily implemented individual as it does to differences between people.

If the above describes the actual situation, then whether Dave is a jerk or not (or wedrifid is, or whoever) is no more relevant to the output of the volition-extrapolation mechanism than whether New Jersey is a jerk, or whether the Green Party is a jerk... all of these entities are just more-or-less-transient aggregates of agents, and the proper level of analysis is the agent.

Taking this "it's all agents and subagents and meta-agents" outlook, the remaining difference is one of arbitration. That is, speaking as wedrifid I have already implicitly decided which elements of the lump of matter sitting on this chair are endorsed as 'me' and so included in the gold standard (CEV&lt;wedrifid&gt;). While it may be the case that my amygdala can be considered an agent that is more similar to your amygdala than to the values I represent in abstract ideals, adding the amygdala-agent of another constitutes corrupting the CEV with some discrete measure of "Jerkiness".

Comment author: DanArmak 27 May 2012 03:55:16PM 0 points

Yes, I was talking about shortcomings of CEV, and did not mean to imply that my current preferences were better than any third option. They aren't even strictly better than CEV; I just think they are better overall if I can't mix the two.