xxd comments on What if AI doesn't quite go FOOM? - Less Wrong

Post author: Mass_Driver 20 June 2010 12:03AM


Comment author: xxd 30 November 2011 06:29:34PM 1 point

Your version is exactly the same as Phil's, except that you've enlarged it to maximize the utility of yourself and everyone you care about, rather than maximizing the utility of humanity as a whole.

When (if) we actually do get an FAI, it is going to be very interesting to see how this resolves, given that even those of us who are thinking about the problem ahead of time can't agree on the goals defining what an FAI should actually shoot for.

Comment author: TheOtherDave 30 November 2011 06:42:17PM 4 points

I do not understand what your first sentence means.

As for your second sentence: stating what it is we value, even as individuals (let alone collectively), in a sufficiently clear and operationalizable form that it could actually be implemented, in a sufficiently consistent form that we would want it implemented, is an extremely difficult problem. I have yet to see anyone come close to solving it; in my experience the world divides neatly into people who don't think about it at all, people who think they've solved it and are wrong, and people who know they haven't solved it.

If some entity (an FAI or whatever) somehow successfully implemented a collective solution, it would be far more than interesting; it would fundamentally and irrevocably change the world.

I infer from my reading of your tone that you disagree with me here; the impression I get is that you consider the fact that we haven't agreed on a solution to demonstrate our inadequacies as problem solvers, even by human standards, but that you're too polite to say so explicitly. Am I wrong?

Comment author: xxd 30 November 2011 07:22:49PM 5 points

We actually agree on the difficulty of the problem. I think it's very difficult to state what it is that we want, AND that if we did so, we'd find that individual utility functions contradict each other.

Moreover, I'm saying that an AI that maximized Phil Goetz's utility function, or yours and those of everyone you love (or even my own selfish desires and wants plus those of everyone I love), COULD in effect be an unfriendly AI, because MANY others would have theirs minimized.

So I'm saying that I think a friendly AI has to have its goals defined as Choice A rather than Choice B:

  • Choice A: the maximum number of people have their utility functions improved (rather than maximized), even if a minimized number of people have their utility functions worsened.

  • Choice B: a small number of people have their utility functions maximized while a large number of people have their utility functions decreased (or zeroed out).
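A toy numerical sketch may make the contrast concrete. The population, the budget, and the helper names (choice_a, choice_b) below are hypothetical illustrations, not anything proposed in the thread:

    # Toy comparison of the two aggregation rules. All numbers are made up.
    baseline = [5, 5, 5, 5, 5, 5]   # six people, everyone starts at utility 5
    budget = 12                     # total extra utility the AI can hand out

    def choice_a(utils, budget):
        """Spread the budget evenly: everyone improves a little."""
        share = budget / len(utils)
        return [u + share for u in utils]

    def choice_b(utils, budget, favored=(0, 1)):
        """Pour the whole budget into a favored few; zero out the rest."""
        return [u + budget / len(favored) if i in favored else 0
                for i, u in enumerate(utils)]

    print(choice_a(baseline, budget))  # [7.0, 7.0, 7.0, 7.0, 7.0, 7.0]
    print(choice_b(baseline, budget))  # [11.0, 11.0, 0, 0, 0, 0]

Under Choice A everyone's utility improves (though nobody's is maximized); under Choice B two people do very well and the other four are zeroed out, which is the outcome I'm calling unfriendly.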

As a side note: I find it amusing that it's so difficult to even understand each other's basic axioms, never mind agree on the details of what maximizing the utility function for all of us as a whole means.

To be clear: I don't know what the details are of maximizing the utility function for all of humanity. I just think that a fair maximization of the utility function for everyone has an interesting corollary: in order to maximize the function for everyone, some will have their individual utility functions decreased. The only way around that is to accept a much narrower definition of friendly, meaning "friendly to me", in which case, as far as I'm concerned, it no longer means friendly.

The logical tautology here is of course that those who consider "friendly to me" to be the only possible definition of friendly would consider an AI that maximized the average utility function of humanity, while they themselves lost out, to be an UNfriendly AI.

Comment author: TheOtherDave 30 November 2011 07:59:36PM 6 points

Couple of things:

  • If you want to facilitate communication, I recommend that you stop using the word "friendly" in this context on this site. There's a lot of talk on this site of "Friendly AI", by which is meant something relatively specific. You are using "friendly" in the more general sense implied by the English word. This is likely to cause rather a lot of confusion.

  • You're right that if strategy 1 optimizes for good stuff happening to everyone I care about, and strategy 2 optimizes for good stuff happening to everyone whether I care about them or not, then strategy 1 will (if done sufficiently powerfully) result in people I don't care about having good stuff taken away from them, and strategy 2 will result in everyone I care about getting less good stuff than strategy 1 would.

  • You seem to be saying that I therefore ought to prefer that strategy 2 be implemented, rather than strategy 1. Is that right?

  • You seem to be saying that you yourself prefer that strategy 2 be implemented, rather than strategy 1. Is that right?