Will_Newsome comments on David Chalmers' "The Singularity: A Philosophical Analysis" - Less Wrong

33 points - Post author: lukeprog 29 January 2011 02:52AM


Comment author: Will_Newsome 29 January 2011 03:05:48AM *  3 points [-]

Most of this assumes that values are independent of intelligence, as Hume argued. But if Hume was wrong and Kant was right, then we will be less able to constrain the values of a superintelligent machine, but the more rational the machine is, the better values it will have.

Are there any LW-rationalist-vetted philosophical papers on this theme in modern times? (I'm somewhat skeptical of the idea that there isn't a universal morality (relative to some generalized Occamian prior-like-thing) that even a paperclip maximizer would converge to (if it was given the right decision theoretic (not necessarily moral per se) tools for philosophical reasoning, which is by no means guaranteed, so we should of course still be careful when designing AGIs).)

Comment author: JGWeissman 29 January 2011 03:16:18AM 13 points [-]

How would converging to a "universal morality" help produce paperclips?

Comment author: Perplexed 29 January 2011 03:38:45PM *  3 points [-]

Are there any LW-rationalist-vetted philosophical papers on this theme in modern times?

I'm not sure what is required for a philosophical paper to be deemed "LW-rationalist-vetted", nor am I sure why that is a desirable feature for a paper to have. But I will state that, IMHO, an approach based on "naturalistic ethics", like that of Binmore, is at least as rational as any ethical approach based on some kind of utilitarianism.

I would say that a naturalistic approach to ethics assumes, with Hume, that fundamental values are not universal - they may certainly vary by species, for example, and also by the historical accidents of genetics, birth-culture, etc. However, meta-ethics is rationally based and universal, and can be converged upon by a process of reflective equilibrium.

As to instrumental values - those turn out to be universal in the sense that (in the limit of perfect rationality and low-cost communication) they will be the same for everyone in the ethical community at a given time. However, they will not be universal in the sense that they will be the same for all conceivable communities in the multiverse. Instrumental values will depend on the makeup of the community, because the common community values are derived as a kind of compromise among the idiosyncratic fundamental values of the community members. Instrumental values will also depend upon the community's beliefs - regarding expected consequences of actions, expected utilities of outcomes, and even regarding the expected future composition of the community. And, since the community learns (i.e. changes its beliefs), instrumental values must inevitably change a little with time.

I'm somewhat skeptical of the idea that there isn't a universal morality (relative to some generalized Occamian prior-like-thing) that even a paperclip maximizer would converge to ...

As an intuition pump, I'll claim that Clippy could fit right in to a community of mostly human rationalists, all in agreement on the naturalist meta-ethics. In that community, Clippy would act in accordance with the community's instrumental values (which will include both the manufacture of paperclips and other, more idiosyncratically human values). Clippy will know that more paperclips are produced by the community than Clippy could produce on his own if he were not a community member. And the community welcomes Clippy, because he contributes to the satisfaction of the fundamental values of other community members - through his command of metallurgy and mechanical engineering, for example.

The aspect of naturalistic ethics which many people find distasteful is that the community will contribute to the satisfaction of your fundamental values only to the extent that you contribute to the satisfaction of the fundamental values of other community members. So, the fundamental values of the weak and powerless tend to get less weight in the collective instrumental value system than do the fundamental values of the strong and powerful. Of course, this does not mean that the very young and the elderly get mistreated - it is rational to contribute now to those who have contributed in the past or who will contribute in the future. And many humans will include concern for the weak among their fundamental values - so the community will have to respect those values.

Comment author: Jack 29 January 2011 04:56:51AM 6 points [-]

Since it keeps coming up I think I'll write a top level post on the subject- I'll probably do some research when writing so I'll see what has been written recently. Hopefully I'll publish in the next week or two.

Comment author: wedrifid 29 January 2011 03:35:32PM 4 points [-]

But, but... paperclips. Its morality is 'make more flipping paperclips'! Just that. With the right decision theoretic tools for philosophical reasoning it will make even more paperclips. If that even qualifies as 'morality' then that is what a paperclip maximiser has.

Comment author: ArisKatsaris 29 January 2011 03:59:49PM *  3 points [-]

Look, I personally don't believe that all or even most moralities will converge, however... imagine something like the following:

Dear paperclipper,

There's a limited amount of matter that's reachable by you in the known universe in any given timespan. Moreover, your efforts to paperclip the universe will be opposed both by humans and by other alien civilizations, which will perceive them as hostile and dangerous. Even if you're ultimately victorious, which is far from certain, you're better off cooperating with humans peacefully, postponing your paperclip-making plans slightly (you'd have to postpone them anyway in order to create weaponry to defeat humanity), and instead working with humans to create a feasible way to construct a new universe, which you will then possess and wherein your desire to create an infinite number of paperclips will be satisfied without opposition.

Sincerely, humanity.


So, from the intrinsic "I want to create as many paperclips as possible" the truly intelligent AI can reasonably derive the instrumental "I'd like not to be opposed in my creation of such paperclips", then "I'd like to create my paperclips in a way that won't harm others, so that they won't have a reason to oppose me", then "I'd like to transport myself to an uninhabited universe of my own creation, to make paperclips without any opposition at all".

This is probably wishful thinking, but the situation isn't as simple as what you describe either.
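The letter's instrumental chain boils down to an expected-value comparison between fighting and cooperating. A minimal sketch in Python - every probability and payoff below is a made-up illustration for the structure of the argument, not an estimate from the thread:

```python
# Toy expected-paperclip comparison behind the "Dear paperclipper" letter.
# All probabilities and payoffs are hypothetical illustrations.

def expected_paperclips(p_success, payoff_success, payoff_failure):
    """Expected paperclips of a strategy that succeeds with probability p_success."""
    return p_success * payoff_success + (1 - p_success) * payoff_failure

# Strategy 1: fight humans and aliens for this universe's matter.
# Victory is "far from certain"; defeat yields nothing.
fight = expected_paperclips(0.6, 1e50, 0.0)

# Strategy 2: cooperate, then tile a newly constructed universe
# (assuming, as the letter does, that constructing one is feasible).
cooperate = expected_paperclips(0.99, 1e51, 1e50)

print(cooperate > fight)
```

Under these assumed numbers cooperation dominates; the whole dispute downthread is over whether the paperclipper's actual numbers would ever look like this.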

Comment author: DanArmak 29 January 2011 04:17:08PM 6 points [-]

If the paperclipper happens to be the first AI++, and arrives before humanity goes interstellar, then it can probably wipe out all humanity quite quickly without reasoning with it. And if it can do that it definitely will - no point in compromising when you've got the upper hand.

Comment author: wedrifid 29 January 2011 04:34:17PM 5 points [-]

no point in compromising when you've got the upper hand.

Well, at least not when the lower hand is more use disassembled to build more cosmic commons burning spore ships.

Comment author: wedrifid 29 January 2011 04:31:48PM *  2 points [-]

Wanting to maximise paperclips (obviously?) does not preclude cooperation in order to produce paperclips. We haven't redefined 'morality' to include any game theoretic scenarios in which cooperation is reached, have we? (I suppose we could do something along those lines in the theism thread.)

Comment author: TheOtherDave 29 January 2011 04:13:40PM 2 points [-]

Agreed that this is probably wishful thinking.

But, yes, also agreed that a sufficiently intelligent and well-informed paperclipper will work out that diplomacy, including consistent lying about its motives, is a good tactic to use for as long as it doesn't completely overpower its potential enemies.

Comment author: timtyler 29 January 2011 01:28:40PM *  1 point [-]

I'm somewhat skeptical of the idea that there isn't a universal morality (relative to some generalized Occamian prior-like-thing) that even a paperclip maximizer would converge to (if it was given the right decision theoretic (not necessarily moral per se) tools for philosophical reasoning, which is by no means guaranteed, so we should of course still be careful when designing AGIs).

There's goal system zero / God's utility function / Universal Instrumental Values.

Comment author: shokwave 29 January 2011 02:03:28PM 4 points [-]

I'm somewhat skeptical of the idea that there isn't a universal morality that even a paperclip maximizer would converge to

You mean you're somewhat convinced that there is a universal morality (that even a paperclip maximizer would converge to)? That sounds like a much less tenable position. I mean,

There's goal system zero / God's utility function / Universal Instrumental Values.

A statement like this needs some support.

Comment author: timtyler 29 January 2011 04:20:04PM *  4 points [-]

I've linkified the grandparent a bit - for those not familiar with the ideas.

The main idea is that many agents which are serious about attaining their long-term goals will first take control of large quantities of spacetime and resources - before they do very much else - to avoid low-utility fates like getting eaten by aliens.

Such goals represent something like an attractor in ethics-space. You could avoid the behaviour associated with the attractor by using discounting, or by adding constraints - at the expense of making the long-term goal less likely to be attained.

Comment author: Perplexed 31 January 2011 06:40:26AM 2 points [-]

Thx for this. I found those links and the idea itself fascinating. Does anyone know if Roko or Hollerith developed the idea much further?

One is reminded of the famous quote from 1984, O'Brien to Winston: "Power is not a means. Power is the end." But it certainly makes sense that, as an agent becomes better integrated into a coalition or community, and his day-to-day goals become weighted more toward the terminal values of other people and less toward his own terminal values, he might be led to rewrite his own utility function toward Power - instrumental power to achieve any goal makes sense as a synthetic terminal value.

After all, most of our instinctual terminal values - sexual pleasure, food, good health, social status, the joy of victory and the agony of defeat - were originally instrumental values from the standpoint of their 'author': natural selection.

Comment author: timtyler 31 January 2011 09:30:12PM *  3 points [-]

Does anyone know if Roko or Hollerith developed the idea much further?

Roko combined the concept with the (rather less sensible) idea of promoting those instrumental values into terminal values - and was met with a chorus of "Unfriendly AI".

Hollerith produced several pages on the topic.

Probably the best-known continuation is via Omohundro.

"Universal Instrumental Values" is much the same idea as "Basic AI drives" dressed up a little differently.

Comment author: Perplexed 31 January 2011 09:45:36PM 1 point [-]

"Universal Instrumental Values" is much the same idea as "Basic AI drives" dressed up a little differently

You are right. I hadn't made that connection. Now I have a little more respect for Omohundro's work.

Comment author: timtyler 31 January 2011 10:40:03PM *  0 points [-]

I was a little bit concerned about your initial Omohundro reaction.

Omohundro's material is mostly fine and interesting. It's a bit of a shame that there isn't more maths - but it is a difficult area where it is tricky to prove things. Plus, IMO, he has the occasional zany idea that takes your brain to interesting places it didn't dream of before.

I maintain some Omohundro links here.

Comment author: jacob_cannell 31 January 2011 09:47:46PM 0 points [-]

As a side point, you could also re-read "Basic AI drives" as "Basic Replicator Drives" - it's systemic evolution.

Comment author: jacob_cannell 31 January 2011 09:53:03PM *  0 points [-]

Interesting, hadn't seen Hollerith's posts before. I came to a similar conclusion about AIXI's behavior as exemplifying a final attractor in intelligent systems with long planning horizons.

If the horizon is long enough (infinite), the single behavioral attractor is maximizing computational power and applying it towards extensive universal simulation/prediction.

This relates to simulism and the SA, as any superintelligences/gods can thus be expected to create many simulated universes, regardless of their final goal evaluation criteria.

In fact, perhaps the final goal criterion applies to creating new universes with the desired properties.

Comment author: shokwave 29 January 2011 06:15:50PM 2 points [-]

These sound instrumental; you take control of the universe in order to achieve your terminal goals. That seems slightly different from what Newsome was talking about, which was more a converging of terminal goals on one superterminal goal.

Comment author: timtyler 29 January 2011 06:20:54PM *  1 point [-]

Thus one of the proposed titles: "Universal Instrumental Values".

Newsome didn't distinguish between instrumental and terminal values.

Comment author: Vladimir_Nesov 29 January 2011 02:39:59PM 1 point [-]

You mean you're somewhat convinced that there is a universal morality (that even a paperclip maximizer would converge to)? That sounds like a much less tenable position.

Those were Newsome's words.

Comment author: shokwave 29 January 2011 06:09:28PM 1 point [-]

Ah. I misunderstood the quoting.

Comment author: Vladimir_Nesov 29 January 2011 01:39:32PM *  -1 points [-]

Boo!

(To make a point as well-argued as the one it replies to.)

Edit: Now that the above comment was edited to include citations, my joke stopped being funny and got downvoted.

Comment author: jacob_cannell 29 January 2011 05:09:55AM *  0 points [-]

Any universal morality has to have long-term fitness - i.e. it must somehow win at the end of time.

Otherwise, aliens may have a more universal morality.

EDIT: why the downvote?

Comment author: endoself 29 January 2011 05:33:22AM *  1 point [-]

This does not require as much optimization as it sounds. As Wei Dai points out, computing power is proportional to the square of the amount of mass obtained, as long as that mass can be physically collected together, so a civilization collecting mass probably gets more observers than one spreading out and colonizing it, depending on the specifics of cosmology. This kind of civilization is much easier to control centrally, so a wide range of values have the potential to dominate, depending on which ones happen to come into being.

Comment author: jacob_cannell 29 January 2011 11:28:49PM 2 points [-]

I'm not sure where he got the math that available energy is proportional to the square of the mass. Wouldn't this come from the mass-energy equivalence and thus be mc^2?

Wei Dai's conjecture about black holes being useful as improved entropy dumps is interesting. Black holes or similar dense entities also maximize speed potential and interconnect efficiency, but they are poor for information storage.

It's also possible that by the time a civilization reaches this point of development, it figures out how to do something more interesting such as create new physical universes. John Smart has some interesting speculation on that and how singularity civilizations may eventually compete/cooperate.

I still have issues wrapping my head around the time dilation.

Comment author: endoself 30 January 2011 05:07:30PM 3 points [-]

Energy is proportional to mass. Computing ability is proportional to (max entropy - current entropy), and max entropy is proportional to the square of mass. That was the whole point of his argument.
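The mass-squared scaling comes from the Bekenstein-Hawking entropy of a black hole, S = 4πGk_B·M²/(ħc), which grows as M². A quick sketch of the consequence - collecting mass together quadruples rather than doubles the entropy budget (the solar-scale mass below is just an illustrative figure):

```python
# Sketch of endoself's point: a black hole's maximum entropy scales as M^2
# (Bekenstein-Hawking: S = 4*pi*G*k_B*M^2 / (hbar*c)), so merging two masses
# into one hole yields four times the entropy of either alone.
import math

G = 6.674e-11      # gravitational constant (SI)
hbar = 1.055e-34   # reduced Planck constant
c = 2.998e8        # speed of light
k_B = 1.381e-23    # Boltzmann constant

def bh_entropy(mass_kg):
    """Bekenstein-Hawking entropy (J/K) of a black hole of the given mass."""
    return 4 * math.pi * G * k_B * mass_kg**2 / (hbar * c)

m = 2.0e30  # roughly one solar mass
# One hole of mass 2m has 4x the entropy of a hole of mass m,
# i.e. double the combined entropy of two separate holes of mass m.
print(bh_entropy(2 * m) / bh_entropy(m))
```

Since "computing ability" here is (max entropy - current entropy), mass that is physically collected together buys quadratically more room to compute than mass left scattered - which is the whole point of the argument.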

Comment author: LucasSloan 29 January 2011 09:55:12PM 1 point [-]

Is this an argument based on the idea that there is some way for all of math to look such that everyone gets as much of what they want as possible?