Comment author: cousin_it 21 May 2012 11:39:16AM *  1 point [-]

What's the general algorithm you can use to determine if something like "sound" is a "word" or a "concept"?

Comment author: gRR 21 May 2012 12:29:00PM 0 points [-]

If it extrapolates coherently, then it's a single concept, otherwise it's a mixture :)

This may actually be doable, even at the present level of technology. You gather a huge text corpus, find the contexts where the word "sound" appears, and cluster them using some word co-occurrence metric. The result is a list of the different meanings of "sound", and a mapping from each mention to a specific meaning. You can also do this simultaneously for many words at once, in which case it becomes a global optimization problem.
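As a toy sketch of this syntactic version (the corpus, stopword list, similarity threshold, and greedy single-pass clustering here are all illustrative assumptions, not a serious word-sense-induction method):

```python
# Toy word-sense induction for "sound": cluster its mentions by the
# overlap of their co-occurring content words.
STOPWORDS = {"the", "a", "of", "in", "to", "is", "through"}

corpus = [
    "the sound of the falling tree echoed through the forest",
    "she heard a loud sound in the forest",
    "air vibrations carry the sound wave to the ear",
    "the sound wave is a vibration travelling through air",
]

def context(sentence, target="sound"):
    """Bag of content words co-occurring with the target word."""
    return {w for w in sentence.split() if w != target and w not in STOPWORDS}

def jaccard(a, b):
    """Set-overlap similarity between two contexts."""
    return len(a & b) / len(a | b) if a | b else 0.0

def cluster_mentions(sentences, threshold=0.1):
    """Greedy clustering: attach each mention to the first cluster whose
    accumulated context overlaps enough, else start a new cluster."""
    clusters = []  # each entry: (accumulated context set, member indices)
    for idx, s in enumerate(sentences):
        ctx = context(s)
        for merged, members in clusters:
            if jaccard(ctx, merged) >= threshold:
                merged |= ctx  # in-place union with the cluster's context
                members.append(idx)
                break
        else:
            clusters.append((ctx, [idx]))
    return [members for _, members in clusters]

senses = cluster_mentions(corpus)  # → [[0, 1], [2, 3]]
```

On this tiny corpus the two "auditory experience" mentions end up in one cluster and the two "vibrations of air" mentions in another; a real system would use much larger contexts and a proper clustering algorithm.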

Of course, AGI would be able to do this at a deeper level than this trivial syntactic one.

Comment author: cousin_it 21 May 2012 09:58:36AM *  8 points [-]

Yeah, it's weird that Eliezer's metaethics and FAI seem to rely on figuring out "true meanings" of certain words, when Eliezer also wrote a whole sequence explaining that words don't have "true meanings".

For example, Eliezer's metaethical approach (if it worked) could be used to actually answer questions like "if a tree falls in the forest and no one's there, does it make a sound?", not just declare them meaningless :-) Namely, it would say that "sound" is not a confused jumble of "vibrations of air" and "auditory experiences", but a coherent concept that you can extrapolate by examining lots of human brains. Funny I didn't notice this tension until now.

Comment author: gRR 21 May 2012 10:27:53AM 0 points [-]

Does it rely on true meanings of words, particularly? Why not on concepts? Individually, "vibrations of air" and "auditory experiences" can each be coherent.

Comment author: TheOtherDave 20 May 2012 07:08:25PM 0 points [-]

There probably exists (hypothetically) some plan such that it wouldn't seem unreasonable to me to declare anyone who doesn't endorse that plan either insufficiently well-informed or insufficiently intelligent.

In fact, there probably exist several such plans, many of which would have results I would subsequently regret, and some of which do not.

Comment author: gRR 20 May 2012 07:43:56PM 0 points [-]

I think seeking and refining such plans would be a worthy goal. For one thing, it would make LW discussions more constructive. Currently, as far as I can tell, CEV is very broadly defined, and its critics usually point at some feature and cast (legitimate) doubt on it. Very soon, CEV appears full of holes, and one may wonder why it has not been thrown away already. But these may not be real holes, just places where we do not know enough yet. If these points are identified and stated in the form of questions of fact, which can be answered by future research, then a global plan, in the form of a decision tree, could be made and reasoned about. That would be definite progress, I think.

Comment author: TheOtherDave 20 May 2012 05:36:55PM 0 points [-]

No, I wouldn't expect running it to be uncontroversial, but I would endorse running it.

I can't imagine any world-changing event that would be uncontroversial, if I assume that the normal mechanisms for generating controversy aren't manipulated (in which case anything might be uncontroversial).

Why is it important that it be uncontroversial?

Comment author: gRR 20 May 2012 05:56:51PM 0 points [-]

Why is it important that it be uncontroversial?

I'm not sure. But it seems a useful property to have for an AI being developed. It might allow centralizing the development. Or something.

Ok, you're right that a complete lack of controversy is impossible, because there are always trolls, cranks, conspiracy theorists, etc. But is it possible to reach a consensus among all sufficiently well-informed, sufficiently intelligent people? Where "sufficiently" is not too high a threshold?

Comment author: TheOtherDave 20 May 2012 04:07:27PM 0 points [-]

I don't know how to answer that question. Again, it seems that you're trying to get an answer given a whole bunch of assumptions, but that you resist the effort to make those assumptions clear as part of the answer.

  • It is not clear to me that there exists such a thing as a "unanimous CEV" at all, even in the hypothetical sense of something we might be able to articulate some day with the right tools.

  • If I nevertheless assume that a unanimous CEV exists in that hypothetical sense, it is not clear to me that only one exists; presumably modifications to the CEV-extraction algorithm would result in different CEVs from the same input minds, and I don't see any principled grounds for choosing among that cohort of algorithms that don't in effect involve selecting a desired output first. (In which case CEV extraction is a complete red herring, since the output was a "bottom line" written in advance of CEV's extraction, and we should be asking how that output was actually arrived at and whether we endorse that process.)

  • If I nevertheless assume that a single CEV-extraction algorithm is superior to all the others, and further assume that we select that algorithm via some process I cannot currently imagine and run it, and that we then run a superhuman environment-optimizer with its output as a target, it is not clear to me that I would endorse that state change as an individual. So, no, I don't agree that running it is uncontroversial. (Although everyone might agree afterwards that it was a good idea.)

  • If the state change nevertheless gets implemented, I agree (given all of those assumptions) that the resulting state-change improves the world by the standards of all humanity. "Safe" is an OK word for that, I guess, though it's not the usual meaning of "safe."

  • I don't agree that the worst that happens, if those assumptions turn out to be wrong, is that it stands there and does nothing. The worst that happens is that the superhuman environment-optimizer runs with a target that makes the world worse by the standards of all humanity.

(Yes, I understand that the CEV-extraction algorithm is supposed to prevent that, and I've agreed that if I assume that's true, then this doesn't happen. But now you're asking me to consider what happens if the "hypothesis" is false, so I am no longer just assuming that's true. You're putting a lot of faith in a mysterious extraction algorithm, and it is not clear to me that a non-mysterious algorithm that satisfies that faith is likely, or that the process of coming up with one won't come up with a different algorithm that antisatisfies that faith instead.)

Comment author: gRR 20 May 2012 05:25:19PM 0 points [-]

What I'm trying to do is find some way to fix the goalposts: find a set of conditions on CEV that would be satisfactory. Whether such a CEV actually exists, and how to build it, are questions for later. Let's just pile up constraints until a sufficient set is reached. So, let's assume that:

  • "Unanimous" CEV exists
  • And is unique
  • And is definable via some easy, obviously correct, and unique process, to be discovered in the future,
  • And it basically does what I want it to do (fulfil universal wishes of people, minimize interference otherwise),

would you say that running it is uncontroversial? If not, what other conditions are required?

Comment author: Jack 20 May 2012 06:05:13AM *  0 points [-]

How is it worse for you directly?

Comment author: gRR 20 May 2012 10:45:43AM 1 point [-]

I value the universe with my friend in it more than one without her.

Comment author: TheOtherDave 20 May 2012 04:15:27AM *  1 point [-]

OK, cool.

To answer your question: sure, if I assume (as you seem to) that the extrapolation process is such that I would in fact endorse the results, and I also assume that the extrapolation process is such that if it takes as input all humans it will produce at least one desire that is endorsed by all humans (even if they themselves don't know it in their current form), then I'd agree that's a good plan, if I further assume that it doesn't have any negative side-effects.

But the assumptions strike me as implausible, and that matters.

I mean, if I assume that everyone being thrown into a sufficiently properly designed blender and turned into stew is a process I would endorse, and I also assume that the blending process has no negative side-effects, then I'd agree that that's a good plan, too. I just don't think any such blender is ever going to exist.

Comment author: gRR 20 May 2012 10:44:33AM 1 point [-]

Ok, but do you grant that running a FAI with "unanimous CEV" is at least (1) safe, and (2) uncontroversial? That the worst problem with it is that it may just stand there doing nothing - if I'm wrong about my hypothesis?

Comment author: DanArmak 19 May 2012 06:09:10PM 0 points [-]

This is completely wrong. People are happy, by definition, if their actual values are fulfilled; not if some conflicting extrapolated values are fulfilled. CEV was supposed to get around this by proposing (without saying how) that people would actually grow to become smarter etc. and thereby modify their actual values to match the extrapolated ones, and then they'd be happy in a universe optimized for the extrapolated (now actual) values. But you say you don't want to change other people's values to match the extrapolation. That makes CEV a very bad idea - most people will be miserable, probably including you!

Comment author: gRR 20 May 2012 01:50:24AM 1 point [-]

People are happy, by definition, if their actual values are fulfilled

Yes, but values depend on knowledge. There was an example by EY, I forget where, in which someone values a blue box because they think it contains a diamond. But if they're wrong, and it's actually the red box that contains the diamond, then what would actually make them happy: giving them the blue box or the red one? And would you say giving them the red box is making them suffer?

Well, perhaps yes. Therefore, a good extrapolated wish would include constraints on the speed of its own fulfillment: allow the person to take the blue box, then convince them that it is the red box they actually want, and only then present it. But in cases where this is impossible (example: the blue box contains horrible violent death), it is wrong to say that following the extrapolated values (withholding the blue box) is making the person suffer. Following their extrapolated values is the only way to allow them to have a happy life.
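The distinction between stated and extrapolated preference in the box example can be put in toy code (a hypothetical sketch; the box contents and function names are mine, not anything from the original example):

```python
# Toy model of "values depend on knowledge": the person wants the diamond,
# but their stated preference is filtered through a false belief.
boxes = {"blue": "empty", "red": "diamond"}

def stated_choice(believed_diamond_location):
    # What the person asks for, given what they currently believe.
    return believed_diamond_location

def extrapolated_choice(true_contents):
    # What they would ask for if they knew the actual contents.
    return next(color for color, contents in true_contents.items()
                if contents == "diamond")

assert stated_choice("blue") == "blue"      # they ask for the blue box
assert extrapolated_choice(boxes) == "red"  # but would want the red one
```

The two functions diverge exactly when the belief is false; extrapolation closes that gap without changing the underlying value (wanting the diamond).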

Comment author: Desrtopa 19 May 2012 10:36:06PM 0 points [-]

I believe there exist (extrapolated) wishes universal for humans (meaning, true for literally everyone). Among these wishes, I think there is the wish for humans to continue existing.

Does the existence of the Voluntary Human Extinction Movement affect your belief in this proposition?

Comment author: gRR 19 May 2012 10:40:15PM 0 points [-]

VHEMT supports human extinction primarily because, in the group's view, it would prevent environmental degradation. The group states that a decrease in the human population would prevent a significant amount of man-made human suffering.

Obviously, human extinction is not their terminal value.

Comment author: TheOtherDave 19 May 2012 09:39:25PM 0 points [-]

You've lost me. Can you restate the question of simple fact to which you refer here, which you believe is true? Can you restate the plan that you consider good if that question is true?

Comment author: gRR 19 May 2012 10:33:10PM 0 points [-]

I believe there exist (extrapolated) wishes universal for humans (meaning, true for literally everyone). Among these wishes, I think there is the wish for humans to continue existing. I would like for AI to fulfil this wish (and other universal wishes if there are any), while letting people decide everything else for themselves.
