wedrifid comments on What if AI doesn't quite go FOOM? - Less Wrong

11 Post author: Mass_Driver 20 June 2010 12:03AM

Comments (186)

Comment author: wedrifid 21 June 2010 08:42:35AM 2 points [-]

It's as if people compartmentalize them and think about only one or the other at a time.

Or just disagree with a specific transhumanist moral principle (or an interpretation thereof). If you are growing "too powerful too quickly", the right thing for an FAI (or, for that matter, anyone else) to do is to stop you by any means necessary. A recursively self improving PhilGoetz with that sort of power and growth rate will be an unfriendly singularity. Cease your expansion or we will kill you before it is too late.

Comment author: PhilGoetz 23 June 2010 07:06:34PM 2 points [-]

A recursively self improving PhilGoetz with that sort of power and growth rate will be an unfriendly singularity.

How do you infer that? Also, is CEV any better? I will be justly insulted if you prefer the average of all human utility functions to the PhilGoetz utility function.

Comment author: wedrifid 23 June 2010 08:30:22PM *  5 points [-]

You're familiar with CEV so I'll try to reply with the concepts from Eliezer's CEV document.

Defining Friendliness is not the life-or-death problem on which the survival of humanity depends. It is a life-or-death problem, but not the life-or-death problem. Friendly AI requires:

  1. Solving the technical problems required to maintain a well-specified abstract invariant in a self-modifying goal system. (Interestingly, this problem is relatively straightforward from a theoretical standpoint.)
  2. Choosing something nice to do with the AI. This is about midway in theoretical hairiness between problems 1 and 3.
  3. Designing a framework for an abstract invariant that doesn't automatically wipe out the human species. This is the hard part.

PhilGoetz does not have a framework for a well-specified abstract invariant self-modifying goal system. If Phil were "seeming to be growing too powerful too quickly", then quite likely the same old human problems would be occurring, and a whole lot more besides.

The problem isn't with your values, CEV&lt;Phil&gt;; the problem is that you aren't a safe system for producing a recursively self-improving singularity. Humans don't even keep the same values when you give them power, let alone when they are hacking their brains into unknown territory.

Comment author: dlthomas 16 November 2011 05:09:47PM 0 points [-]

When talking about one individual, there is no C in CEV.

Comment author: wedrifid 16 November 2011 05:24:36PM *  0 points [-]

I use 'extrapolated volition' when talking about the outcome of the process upon an individual. "Coherent Extrapolated Volition" would be correct but redundant. When speaking of instantiations of CEV with various parameters (of individuals, species, or groups) it is practical, technically correct, and preferred to write CEV&lt;X&gt; regardless of the count of individuals in the parameter. Partly because it should be clear that CEV&lt;wedrifid&gt; and CEV&lt;all nerds&gt; are talking about things very similar in kind. Partly because if people see "CEV" and google it they'll find out what it means. Mostly because the 'EV' acronym is overloaded within the nearby namespace.

AVERAGE(3.1415) works in Google Docs. It returns 3.1415. If you are comparing a whole heap of aggregations of a feature, some of which have only one value, it is simpler to just use the same formula.
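The spreadsheet point, that a single aggregation formula handles both singleton and many-valued parameter sets, can be sketched in Python with the standard library (the values are illustrative):

```python
from statistics import mean

# Averaging a single-element collection just returns the element, so the
# same aggregation formula covers parameter sets of one or of many.
print(mean([3.1415]))            # → 3.1415
print(mean([3.1415, 2.0, 1.0]))  # average over several values
```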

Comment author: dlthomas 17 November 2011 01:55:19AM 1 point [-]

Seems reasonable.

Comment author: Blueberry 23 June 2010 07:57:54PM 3 points [-]

I will be justly insulted if you prefer the average of all human utility functions to the PhilGoetz utility function.

I think I'd prefer the average of all human utility functions to any one individual's utility function; don't take it personally.

Comment author: ata 23 June 2010 08:30:03PM *  7 points [-]

Is that Phil Goetz's CEV vs. all humans' CEV, or Phil Goetz's current preferences or behaviour-function vs. the average of all humans' current preferences or behaviour-functions? In the former scenario, I'd prefer the global CEV (if I were confident that it would work as stated). In the latter, even without me remembering much about Phil and his views, other than that he appears to be an intelligent, educated Westerner who can be expected to be fairly reliably careful about potentially world-changing actions, I'd probably feel safer with him as world dictator than with a worldwide direct democracy automatically polling everyone in the world on what to do, considering the kinds of humans who currently make up a large majority of the population.

Comment author: Nick_Tarleton 29 June 2010 03:46:05PM 3 points [-]

Is that Phil Goetz's CEV vs. all humans' CEV, or Phil Goetz's current preferences or behaviour-function vs. the average of all humans' current preferences or behaviour-functions?

Voted up for distinguishing these things.

Comment author: Blueberry 23 June 2010 10:50:36PM *  2 points [-]

"I am obliged to confess I should sooner live in a society governed by the first two thousand names in the Boston telephone directory than in a society governed by the two thousand faculty members of Harvard University. " -- William F. Buckley, Jr.

Comment author: ata 23 June 2010 11:10:21PM 1 point [-]

Yes, I agree that William F. Buckley, Jr. probably disagrees with me.

Comment author: Blueberry 23 June 2010 11:44:35PM *  1 point [-]

Heh. I figured you'd heard the quote: I just thought of it when I read your comment.

I agree with Buckley, mainly because averaging would smooth out our evolved unconscious desire to take power for ourselves when we become leaders.

Comment author: wedrifid 25 June 2010 05:35:22AM 0 points [-]

I can't agree with that. I've got a personal bias against people with surnames starting 'A'!

Comment author: xxd 30 November 2011 04:50:01PM 0 points [-]

Phil: an AI who is seeking resources to further its own goals at the expense of everyone else is by definition an unfriendly AI.

Transhuman AI PhilGoetz is such a being.

Now consider this: I'd prefer the average of all human utility function over my maximized utility function even if it means I have less utility.

I don't want humanity to die, and I am prepared to die myself to prevent it from happening.

Which of the two utility functions would most of humanity prefer hmmmmm?

Comment author: TheOtherDave 30 November 2011 05:06:51PM 2 points [-]

If you would prefer A over B, it's very unclear to me what it means to say that giving you A instead of B reduces your utility.

Comment author: xxd 30 November 2011 05:28:36PM *  -1 points [-]

It's not unclear at all. Utility is the satisfaction of needs.

Comment author: TheOtherDave 30 November 2011 05:49:34PM 3 points [-]

Ah. That's not at all the understanding of "utility" I've seen used elsewhere on this site, so I appreciate the clarification, if not its tone.

So, OK. Given that understanding, "I'd prefer the average of all human utility function over my maximized utility function even if it means I have less utility." means that xxd would prefer (on average, everyone's needs are met) over (xxd's needs are maximally met). And you're asking whether I'd prefer that xxd's preferences be implemented, or those of "an AI who is seeking resources to further its own goals at the expense of everyone else", which you're calling "Transhuman AI PhilGoetz"... yes? (I will abbreviate that T-AI-PG hereafter.)

The honest answer is I can't make that determination until I have some idea what having everyone's needs met actually looks like, and some idea of what T-AI-PG's goals look like. If T-AI-PG's goals happen to include making life awesomely wonderful for me and everyone I care about, and xxd's understanding of "everyone's needs" leaves me and everyone I care about worse off than that, then I'd prefer that T-AI-PG's preferences be implemented.

That said, I suspect that you're taking it for granted that T-AI-PG's goals don't include that, and also that xxd's understanding of "everyone's needs" really and truly makes everything best for everyone, and probably consider it churlish and sophist of me to imply otherwise.

So, OK: sure, if I make those further assumptions, I'd much rather have xxd's preferences implemented than T-AI-PG's preferences. Of course.

Comment author: xxd 30 November 2011 06:29:34PM 1 point [-]

Your version is exactly the same as Phil's, except that you've enlarged it to have your own utility and that of everyone you care about maximized, rather than humanity as a whole having its utility maximized.

When we actually do get an FAI (if we do), it is going to be very interesting to see how it resolves this, given that even those who are thinking about it ahead of time can't agree on the goals defining what an FAI should actually shoot for.

Comment author: TheOtherDave 30 November 2011 06:42:17PM 4 points [-]

I do not understand what your first sentence means.

As for your second sentence: stating what it is we value, even as individuals (let alone collectively), in a sufficiently clear and operationalizable form that it could actually be implemented, in a sufficiently consistent form that we would want it implemented, is an extremely difficult problem. I have yet to see anyone come close to solving it; in my experience the world divides neatly into people who don't think about it at all, people who think they've solved it and are wrong, and people who know they haven't solved it.

If some entity (an FAI or whatever) somehow successfully implemented a collective solution it would be far more than interesting, it would fundamentally and irrevocably change the world.

I infer from my reading of your tone that you disagree with me here; the impression I get is that you consider the fact that we haven't agreed on a solution to demonstrate our inadequacies as problem solvers, even by human standards, but that you're too polite to say so explicitly. Am I wrong?

Comment author: xxd 30 November 2011 07:22:49PM 5 points [-]

We actually agree on the difficulty of the problem. I think it's very difficult to state what it is that we want AND that if we did so we'd find that individual utility functions contradict each other.

Moreover, I'm saying that an AI maximizing Phil Goetz's utility function, or yours and those of everybody you love (or even my own selfish desires and wants plus those of everyone I love), COULD in effect be an unfriendly AI, because MANY others would have theirs minimized.

So I'm saying that I think a friendly AI has to have its goals defined as Choice A: the maximum number of people have their utility functions improved (rather than maximized), even if some minimal number of people have their utility functions worsened; as opposed to Choice B: a small number have their utility functions maximized while a large number of people have their utility functions decreased (or zeroed out).
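The Choice A / Choice B contrast can be made concrete with a toy calculation (all utilities invented for illustration; this is a sketch, not a claim about how real preference aggregation would work):

```python
from statistics import mean

# Invented utilities for a five-person population under two policies.
baseline = [1.0, 1.0, 1.0, 1.0, 1.0]
choice_a = [2.0, 2.0, 2.0, 2.0, 0.5]   # most people improved, one worse off
choice_b = [5.0, 0.5, 0.5, 0.5, 0.5]   # one person maximized, the rest reduced

def improved(outcome):
    """Count how many people end up better off than at baseline."""
    return sum(o > b for o, b in zip(outcome, baseline))

print(improved(choice_a), mean(choice_a))  # → 4 1.7
print(improved(choice_b), mean(choice_b))  # → 1 1.4
```

On these made-up numbers, Choice A leaves more people better off and has the higher average, though of course different invented numbers could reverse the averages, which is part of why the aggregation question is hard.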

As a side note: I find it amusing that it's so difficult to even understand each other's basic axioms, never mind agree on the details of what maximizing the utility function for all of us as a whole means.

To be clear: I don't know what the details are of maximizing the utility function for all of humanity. I just think that a fair maximization of the utility function for everyone has an interesting corollary: in order to maximize the function for everyone, some will have their individual utility functions decreased, unless we accept a much narrower definition of friendly, meaning "friendly to me", in which case as far as I'm concerned that no longer means friendly.

The logical tautology here is of course that those who consider "friendly to me" as being the only possible definition of friendly would consider an AI that maximized the average utility function of humanity and they themselves lost out, to be an UNfriendly AI.

Comment author: TheOtherDave 30 November 2011 07:59:36PM *  6 points [-]

Couple of things:

  • If you want to facilitate communication, I recommend that you stop using the word "friendly" in this context on this site. There's a lot of talk on this site of "Friendly AI", by which is meant something relatively specific. You are using "friendly" in the more general sense implied by the English word. This is likely to cause rather a lot of confusion.

  • You're right that if strategy 1 optimizes for good stuff happening to everyone I care about, and strategy 2 optimizes for good stuff happening to everyone whether I care about them or not, then strategy 1 will (if done sufficiently powerfully) result in people I don't care about having good stuff taken away from them, and strategy 2 will result in everyone I care about getting less good stuff than strategy 1 will.

  • You seem to be saying that I therefore ought to prefer that strategy 2 be implemented, rather than strategy 1. Is that right?

  • You seem to be saying that you yourself prefer that strategy 2 be implemented, rather than strategy 1. Is that right?

Comment author: thomblake 30 November 2011 06:20:26PM 0 points [-]

It's not as clear as you think it is. I'm not familiar with any common definition of "utility" that unambiguously means "the satisfaction of needs", nor was I able to locate one in a dictionary.

"Utility" is used hereabouts as a numerical value assigned to outcomes such that outcomes with higher utilities are always preferred to outcomes with lower utilities. See Wiki:Utility function.

Nor am I familiar with "sophist" used as an adjective.

Comment author: xxd 30 November 2011 07:09:11PM 1 point [-]

Utility is generally meant to be "economic utility" in most discussions I take part in notwithstanding the definition you're espousing for hereabouts.

I believe that the definition of utility you're giving is far too open and could all too easily lead to smiley world.

It is very common to use nouns as adjectives where no distinct adjective already exists and thus saying someone is "sophist" is perfectly acceptable English usage.

Comment author: thomblake 30 November 2011 07:36:16PM 1 point [-]

Utility is generally meant to be "economic utility" in most discussions I take part in notwithstanding the definition you're espousing for hereabouts.

Yeah, that doesn't quite nail it down either. Note Wiktionary:utility (3):

(economics) The ability of a commodity to satisfy needs or wants; the satisfaction experienced by the consumer of that commodity.

It ambiguously allows both 'needs' and 'wants', as well as ambiguous 'satisfaction experienced'.

The only consistent, formal definition of utility I've seen used in economics (or game theory) is the one I gave above. If it was clear someone was not using that definition, I might assume they were using it as more generic "preference satisfaction", or John Stuart Mill's difficult-to-formalize-coherently "pleasure minus pain", or the colloquial vague "usefulness" (whence "utilitarian" is colloquially a synonym for "pragmatic").

Do you have a source defining utility clearly and unambiguously as "the satisfaction of needs"?

Comment author: xxd 30 November 2011 11:32:43PM 2 points [-]

No you're right it doesn't nail it down precisely (the satisfaction of needs or wants).

I do believe, however, that it more precisely nails it down than the wiki on here.

Or on second thoughts maybe not, because we again come back to conflicting utilities: a suicidal person might assign a higher utility to being killed than someone who is sitting on death row and doesn't want to die would.

And I was using the term utility from economics, since that's the only place I've heard "utility function" used, so I naturally assumed that's what you were talking about; even if we disagree around the edges, the meanings still fit the context for the purposes of this discussion.

Comment author: PhilGoetz 06 December 2011 10:16:03PM *  0 points [-]

Phil: an AI who is seeking resources to further its own goals at the expense of everyone else is by definition an unfriendly AI.

The question is whether the PhilGoetz utility function, or the average human utility function, are better. Assume both are implemented in AIs of equal power. What makes the average human utility function "friendlier"? It would have you outlaw homosexuality and sex before marriage, remove all environmental protection laws, make child abuse and wife abuse legal, take away legal rights from women, give wedgies to smart people, etc.

Now consider this: I'd prefer the average of all human utility function over my maximized utility function even if it means I have less utility.

I don't think you understand utility functions.

Comment author: xxd 16 December 2011 12:27:54AM 0 points [-]

"The question is whether the PhilGoetz utility function, or the average human utility function, are better. "

That is indeed the question. But I think you've framed and stacked the deck here with your description of what you believe the average human utility function is, in order to attempt to take the moral high ground rather than arguing against my point, which is this:

How do you maximize the preferred utility function for everyone instead of just a small group?

Comment author: xxd 16 November 2011 04:14:18PM 0 points [-]

Sorry Phil, but now we've got a theoretical fight on our hands between my transhuman value set and yours. Not good for the rest of humanity. I'd rather our rulers had values that benefited everybody on average, not values skewed towards your value set (or mine) at the expense of everybody else.

Comment author: xxd 16 December 2011 12:29:09AM 0 points [-]

Although I disagree with your heartbreak position I agree with this.

Comment author: dlthomas 16 December 2011 12:46:42AM *  3 points [-]

1) There's no real reason to pull in unrelated threads just because you're talking to the same person.

2) Most of us are pretty sure, based on that other thread, that you misunderstand wedrifid's "heartbreak position".

3) When referencing other posts, it's usually good form to link to them (as above), to make it easier for others to follow.