Perplexed comments on Convergence Theories of Meta-Ethics - Less Wrong

Post author: Perplexed 07 February 2011 09:53PM


Comment author: Perplexed 08 February 2011 08:43:30PM 0 points

... I doubt you're going to convince me I'm wrong, or resolve that confusion, by refusing to talk about morality and only talking about what AIs will do.

I have to apologize. Apparently my writing was extremely unclear. I wasn't refusing to talk about morality. The whole posting was an exploration of some of the properties of the relation "A is at least as good as B" when A and B are normative ethical systems.

Admittedly, I did not spend much time actually making ethical judgments; I was operating at the meta level.

I still have to decide what values to give it initially, because those values will partly determine the outcome of the universe. ...

But the whole point of my posting was that, if there is convergence (in the second sense), then those initial values may make very little difference in the outcome of the universe - that is, they may be important initially, but in the longer term the ethical system that is converged upon depends less on the seed ethics than on issues of how AIs depend upon each other, how they reproduce, etc.

I'm very sorry that you missed this - the main thrust of the posting. If I had written more clearly, your response might have been a more productive disagreement about substance, rather than a complaint about the title.

Comment author: Wei_Dai 08 February 2011 09:18:55PM 1 point

those initial values may make very little difference in the outcome of the universe

But even if "make very little difference" is true, it's little in a relative sense, in that my initial utility function might end up having just a billionth of a percent weight in the final merged AI. But in the absolute sense, the difference could still be huge. For an analogy, suppose it's certain that our civilization will never expand beyond the solar system, which will blow up in a few billion years no matter what. Then our values similarly make very little difference in the outcome of the universe in a relative sense but may still make a huge difference in an absolute sense (e.g. if we create a FOOMing singleton that just takes over the solar system).

Also, if I can figure out what I want, and the answer applies to and convinces many others, that could also make a big difference even in a relative sense.

Comment author: Perplexed 08 February 2011 10:19:17PM 0 points

But even if "make very little difference" is true, it's little in a relative sense ...

The conjecture is that it is true in an absolute sense. It would have made no sense at all for me to even mention it if I had meant it in the relative sense that you set up here as a straw man and then knock down.

There is something odd going on here. Three very intelligent people are interpreting what I write quite differently than the way I intend it. Probably it is because I generated confusion by misusing words that have a fixed meaning here. And, in this case, it may be because you were thinking of our "fragility" conversation rather than the main posting. But, whatever the reason, I'm finding this very frustrating.

Comment author: Wei_Dai 08 February 2011 11:00:42PM 0 points

I guess I took your conjecture to be the "relative" one because whether or not it is true perhaps doesn't depend on details of one's utility function, and we, or at least I, was talking about whether the question "what do I want?" is an important one. I'm not sure how you hope to show the "absolute" version in the same way.

Comment author: Perplexed 08 February 2011 11:57:02PM 1 point

I'm not sure how you hope to show the "absolute" version in the same way.

Well, Omohundro showed that a certain collection of instrumental values tend to arise independently of the 'seeded' intrinsic values. In fact, decision making tends to be dominated by consideration of these 'convergent' instrumental values, rather than the human-inserted seed values.

Next, consider that those human values themselves originated as heuristic approximations of instrumental values contributing to the intrinsic value of interest to our optimization process - natural selection. The fact that we ended up with the particular heuristics that we did is not due to the fact that the intrinsic value for that process was reproductive success - every species in the biosphere evolved under the guidance of that value. The reason why humans ended up with values like curiosity, reciprocity, and toleration has to do with the environment in which we evolved.

So, my hope is that we can show that AIs will converge to human-like instrumental/heuristic values if they do their self-updating in a human-like evolutionary environment. Regardless of the details of their seeds.

That is the vision, anyways.

Comment author: Wei_Dai 09 February 2011 07:08:09PM 1 point

I notice that Robin Hanson takes a position similar to yours, in that he thinks things will turn out ok from our perspective if uploads/AIs evolve in an environment defined by certain rules (in his case property laws and such, rather than sexual reproduction).

But I think he also thinks that we do not actually have a choice between such evolution and a FOOMing singleton (i.e. FOOMing singleton is nearly impossible to achieve), whereas you think we might have a choice or at least you're not taking a position on that. Correct me if I'm wrong here.

Anyway, suppose you and Robin are right and we do have some leverage over the environment that future AIs will evolve in, and can use that leverage to predictably influence the eventual outcome. I contend we still have to figure out what we want, so that we know how to apply that leverage. Presumably we can't possibly make the AI evolutionary environment exactly like the human one, but we might have a choice over a range of environments, some more human-like than others. But it's not necessarily true that the most human-like environment leads to the best outcome. (Nor is it even clear what it means for one environment to be more human-like than another.) So, among the possible outcomes we can aim for, we'll still have to decide which ones are better than others, and to do that, we need to know what we want, which involves, at least in part, either figuring out what morality is, or showing that it's meaningless or otherwise unrelated to what we want.

Do you disagree on this point?

Comment author: Perplexed 09 February 2011 07:34:12PM 1 point

But I think [Hanson] also thinks that we do not actually have a choice between such evolution and a FOOMing singleton (i.e. FOOMing singleton is nearly impossible to achieve), whereas you think we might have a choice or at least you're not taking a position on that. Correct me if I'm wrong here.

I tend toward FOOM skepticism, but I don't think it is "nearly impossible". Define a FOOM as a scenario leading in at most 10 years from the first human-level AI to a singleton which has taken effective control over the world's economy. I rate the probability of a FOOM at 40% assuming that almost all AI researchers want a FOOM and at 5% assuming that almost all AI researchers want to prevent a FOOM. I'm under the impression that currently a majority of singularitarians want a FOOM, but I hope that that ratio will fall as the dangers of a FOOMing singleton become more widely known.

I contend we still have to figure out what we want, so that we know how to apply that leverage. ... Do you disagree on this point?

No, I agree. Agree enthusiastically. Though I might change the wording just a bit. Instead of "we still have to figure out what we want", I might have written "we still have to negotiate what we want".

My turn now. Do you disagree with this shift of emphasis from the intellectual to the political?

Comment author: Wei_Dai 09 February 2011 07:52:29PM 1 point

My turn now. Do you disagree with this shift of emphasis from the intellectual to the political?

I suppose if you already know what you personally want, then your next problem is negotiation. I'm still stuck on the first problem, unfortunately.

ETA: What is your answer to The Lifespan Dilemma, for example?

Comment author: Perplexed 09 February 2011 09:19:39PM 0 points

What is your answer to The Lifespan Dilemma, for example?

I only skimmed that posting, and I failed to find any single question there which you apparently meant for me to answer. But let me invent my own question and answer it.

Suppose I expect to live for 10,000 years. Omega appears and offers me a deal. Omega will extend my lifetime to infinity if I simply agree to submit to torture for 15 minutes immediately - the torture being that I have to actually read that posting of Eliezer's with care.

I would turn down Omega's offer without regret, because I believe in (exponentially) discounting future utilities. Roughly speaking, I count the pleasures and pains that I will encounter next year to be something like 1% less significant than this year. I'm doing the math in my head, but I estimate that this makes my first omega-granted bonus year 10,000 years from now worth about 1/10^42 as much as this year. Or, saying it another way, my first 'natural' 10,000 years is worth about 10^42 times as much as the infinite period of time thereafter. The next fifteen minutes is more valuable than that infinite period of time. And I don't want to waste that 15 minutes re-reading that posting.
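A minimal sketch of that arithmetic, assuming the 1%-per-year rate stated above (the variable names are illustrative):

```python
# Back-of-the-envelope check of exponential discounting at 1% per year.
rate = 0.01
horizon = 10_000  # years of 'natural' lifespan

one_year_at_horizon = (1 - rate) ** horizon            # value of a single year 10,000 years out
infinite_tail = one_year_at_horizon / rate             # geometric series: all years after the horizon
first_10k_years = (1 - (1 - rate) ** horizon) / rate   # all years before the horizon
fifteen_minutes_now = 15 / (365 * 24 * 60)             # the next 15 minutes, in present-year units

print(one_year_at_horizon)   # ~2.2e-44 of a present year
print(infinite_tail)         # ~2.2e-42 present-year equivalents
print(first_10k_years)       # ~100 present-year equivalents
print(fifteen_minutes_now)   # ~2.9e-05, vastly more than the infinite tail
```

The exact exponent depends on rounding, but it lands in the same astronomically small ballpark as the in-the-head estimate above: under a 1% yearly discount, everything after year 10,000 is worth far less than the next fifteen minutes.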

And I am quite sure that 99% of mankind would agree with me that 1% discounting per year is not an excessive discount rate. That is, in large part, why I think negotiation is important. It is because typical SIAI thinking about morality is completely unacceptable to most of mankind and SIAI seem to be in denial about it.

Comment author: Wei_Dai 09 February 2011 10:31:32PM 4 points

Have you thought through all of the implications of a 1% discount rate? For example, have you considered that if you negotiate with someone who discounts the future less, say at 0.1% per year, you'll end up trading the use of all of your resources after X number of years in exchange for use of his resources before X number of years, and so almost the entire future of the universe will be determined by the values of those whose discount rates are lower than yours?
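A minimal sketch of the effect described here, with illustrative rates of 1% and 0.1% per year (the function name and numbers are assumptions for the example, not from the thread):

```python
# How two exponential discounters value "1 resource-unit per year from year x onward".
def tail_value(rate, x):
    """Present value of 1 unit per year from year x on, discounted at `rate` per year."""
    return (1 - rate) ** x / rate   # geometric series: sum over t >= x of (1 - rate)**t

for x in (100, 500, 1000):
    impatient = tail_value(0.01, x)   # the 1%-per-year discounter
    patient = tail_value(0.001, x)    # the 0.1%-per-year discounter
    print(f"year {x}: impatient values the tail at {impatient:.3g}, "
          f"patient at {patient:.3g}, ratio {patient / impatient:.3g}")
```

By year 1,000 the ratio is already around 10^5, so in any trade the lower-discount party ends up owning essentially everything past some crossover year, which is the point being made here.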

If that doesn't bother you, and you're really pretty sure you want a 1% discount rate, do you not have other areas where you don't know what you want?

For example, what exactly is the nature of pleasure and pain? I don't want people to torture simulated humans, but what if they claim that the simulated humans have been subtly modified so that they only look like they're feeling pain, but aren't really? How can I tell whether some computation is experiencing pain or pleasure?

And here's a related example: Presumably having one kilogram of orgasmium in the universe is better than having none (all else equal) but you probably don't want to tile the universe with it. Exactly how much worse is a second kilogram of the stuff compared to the first? (If you don't care about orgasmium in the abstract, suppose that it's a copy of your brain experiencing some ridiculously high amount of pleasure.)

Have you already worked out all such problems, or at least know the principles by which you'll figure them out?

Comment author: TheOtherDave 09 February 2011 10:05:57PM 0 points

And I am quite sure that 99% of mankind would agree with me that 1% discounting per year is not an excessive discount rate.

I suspect that 99% of mankind would give different answers to that question, depending on whether it's framed as giving up X now in exchange for receiving Y N years from now, or X N years ago for Y now.

Comment author: Perplexed 09 February 2011 08:23:49PM 0 points

It might make an interesting rationality exercise to have 6-10 people conduct some kind of discussion/negotiation/joint-decision-making exercise to flesh out their intuitions as to the type of post-singularity society they would like to live in.

My intuition is that, even if you are not sure what you want, the interactive process will probably help you to clarify exactly what you do not want, and thus assist in both personal and collective understanding of values.

It might be even more interesting to have two or more such 'negotiations' proceeding simultaneously, and then compare results.

Comment author: wedrifid 10 February 2011 11:07:12AM 0 points

It might make an interesting rationality exercise to have 6-10 people conduct some kind of discussion/negotiation/joint-decision-making exercise to flesh out their intuitions as to the type of post-singularity society they would like to live in.

Sign me up for 100 years with the catgirls in my volcano lair.

More generally, I (strongly) prefer a situation in which the available negentropy is distributed, for the owners to do with as they please (with limits). That moves negotiations to be of the 'trade' kind rather than the 'politics' kind. Almost always preferable.

Comment author: TheOtherDave 09 February 2011 08:52:40PM 0 points

I'd be willing to participate in such an exercise.

Comment author: timtyler 10 February 2011 10:45:39PM 0 points

I tend toward FOOM skepticism, but I don't think it is "nearly impossible". Define a FOOM as a scenario leading in at most 10 years from the first human-level AI to a singleton which has taken effective control over the world's economy.

Automating investing has been going fairly well. It wouldn't be very surprising to me if we get a dominant, largely machine-operated hedge fund that "has taken effective control over the world's economy" before we get human-level machine intelligence.

Comment author: Vladimir_Nesov 08 February 2011 09:30:27PM -1 points

So to summarize, your conclusion seems to be that we should build an arbitrary-goals AI as soon as possible.

Edit: Wrong, corrected here.

Comment author: Perplexed 08 February 2011 09:44:47PM 1 point

So to summarize, your conclusion seems to be that we should build an arbitrary-goals AI as soon as possible.

Huh? What exactly do you think you are summarizing? If you want to produce a cartoon version of my opinions on this thread, try "We should do all we can to avoid the FOOMing singleton scenario, instead trying to create a society of reproducing AIs, interlocked with each other and with humanity by a network of dependencies. If we do, the details of the initial goal systems may matter less than they would with a singleton."

Comment author: Vladimir_Nesov 08 February 2011 10:04:00PM 1 point

But the whole point of my posting was that, if there is convergence (in the second sense), then those initial values may make very little difference in the outcome of the universe

I see, so "if there is convergence" is not a point of theoretical uncertainty, but something that depends on the way the AIs are built. Makes sense (as a position, not something I agree with).

Comment author: Perplexed 08 February 2011 10:36:16PM 0 points

I see, so "if there is convergence" is not a point of theoretical uncertainty, but something that depends on the way the AIs are built.

Well, it is both. Convergence in the sense of "outcome is independent of the starting point" has not been proved for any AI/updating architecture. Also, I strongly suspect that the detailed outcome will depend quite a bit on the way AIs interact and produce successors/self-updates, even if the fact of convergence does not.

Comment author: timtyler 09 February 2011 01:40:24AM 0 points

<cartoon>We should do all we can to avoid the FOOMing singleton scenario, instead trying to create a society of reproducing AIs, interlocked with each other and with humanity by a network of dependencies.</cartoon>

That reminds me of:

"An AGI raised in a box could become dangerously solipsistic, probably better to raise AGIs embedded in the social network..."

Comment author: Perplexed 09 February 2011 05:15:36AM 0 points

Goertzel's comment doesn't even make sense to me. Why is he placing 'in a box' in contraposition to 'embedded in the social network'? The two issues are orthogonal. AIs can be social or singleton - either in a box or in the real world. ETA: Well, if you mean the human social network, then I suppose a boxed AI cannot participate. Though I suppose we could let some simulated humans into the box to keep the AI company.

Besides, I've never really considered solipsists to be any more dangerous than anyone else.

Comment author: timtyler 09 February 2011 01:55:26PM 0 points

Besides, I've never really considered solipsists to be any more dangerous than anyone else.

"Now I will destroy the whole world - What a Bokononist says before committing suicide."

Comment author: timtyler 09 February 2011 08:40:40AM 0 points

Though I suppose we could let some simulated humans into the box [...]

We don't have any half-decent simulated humans, though.