PhilGoetz comments on Only humans can have human values - Less Wrong

34 Post author: PhilGoetz 26 April 2010 06:57PM


Comment author: PhilGoetz 28 April 2010 01:40:10PM * 3 points

Upvoted, but -

We can program an FAI with ambitions and curiosity of its own, but they will be rooted in our own values and anthropomorphism.

Eliezer needs to say whether he wants to do this, or to save humans. I don't think you can have it both ways. The OS FAI does not have ambitions or curiosity of its own.

But no matter how noble and farsighted the programmers are, to those who don't share the programmers' values, the FAI will be a paperclipper.

I dispute this. The SIAI FAI is specifically designed to have control of the universe as one of its goals. This is not logically necessary for an AI. Nor is the plan to build a singleton, rather than an ecology of AI, the only possible plan.

I notice that some of my comment wars with other people arise because they automatically assume that whenever we're talking about a superintelligence, there's only one of them. This is in danger of becoming a LW communal assumption. It's not even likely. (More generally, there's a strong tendency for people on LW to attribute very high likelihoods to scenarios that EY spends a lot of time talking about - even if he doesn't insist that they are likely.)

Comment author: Nick_Tarleton 29 April 2010 01:24:39AM * 7 points

I dispute this. The SIAI FAI is specifically designed to have control of the universe as one of its goals.

It is widely expected that this will arise as an important instrumental goal; nothing more than that. I can't tell if this is what you mean. (When you point out that "trying to take over the universe isn't utility-maximizing under many circumstances", it sounds like you're thinking of taking over the universe as a separate terminal goal, which would indeed be terrible design; an AI without that terminal goal, that can reason the same way you can, can decide not to try to take over the universe if that looks best.)

I notice that some of my comment wars with other people arise because they automatically assume that whenever we're talking about a superintelligence, there's only one of them. This is in danger of becoming a LW communal assumption. It's not even likely.

I probably missed it in some other comment, but which of these do you not buy: (a) huge first-mover advantages from self-improvement (b) preventing other superintelligences as a convergent subgoal (c) that the conjunction of these implies that a singleton superintelligence is likely?

(More generally, there's a strong tendency for people on LW to attribute very high likelihoods to scenarios that EY spends a lot of time talking about - even if he doesn't insist that they are likely.)

This sounds plausible and bad. Can you think of some other examples?

Comment author: Matt_Simpson 28 April 2010 10:49:55PM 7 points

(More generally, there's a strong tendency for people on LW to attribute very high likelihoods to scenarios that EY spends a lot of time talking about - even if he doesn't insist that they are likely.)

This is probably just availability bias. These scenarios are easy to recall because we've read about them, and we're psychologically primed for them just by coming to this website.

Comment author: thomblake 28 April 2010 01:53:07PM * 4 points

Eliezer needs to say whether he wants to do this

He did. FAI should not be a person - it's just an optimization process.

ETA: link

Comment author: PhilGoetz 29 April 2010 01:23:48AM -1 points

Thanks! I'll take that as definitive.

Comment author: Gavin 28 April 2010 10:32:42PM * 2 points

The assumption of a single AI comes from an assumption that an AI will have zero risk tolerance. It follows from that assumption that the most powerful AI will destroy or limit all other sentient beings within reach.

There's no reason that an AI couldn't be programmed to have tolerance for risk. Pursuing a lot of the more noble human values may require it.

I make no claim that Eliezer and/or the SIAI have anything like this in mind. It seems that they would like to build an absolutist AI. I find that very troubling.

Comment author: mattnewport 28 April 2010 11:04:36PM * -1 points

I make no claim that Eliezer and/or the SIAI have anything like this in mind. It seems that they would like to build an absolutist AI. I find that very troubling.

If I thought they had settled on this and that they were likely to succeed I would probably feel it was very important to work to destroy them. I'm currently not sure about the first and think the second is highly unlikely so it is not a pressing concern.

Comment author: thomblake 28 April 2010 02:19:48PM 1 point

I dispute this. The SIAI FAI is specifically designed to have control of the universe as one of its goals. This is not logically necessary for an AI. Nor is the plan to build a singleton, rather than an ecology of AI, the only possible plan.

It is, however, necessary for an AI to do something of the sort if it's trying to maximize any sort of utility. Otherwise, risk / waste / competition will cause the universe to be less than optimal.

Comment author: PhilGoetz 29 April 2010 01:19:47AM * 0 points

Trying to take over the universe isn't utility-maximizing under many circumstances: if you have a small chance of succeeding, or if the battle to do so will destroy most of the resources, or if you discount the future at all (remember, computation speed increases as speed of light stays constant), or if your values require other independent agents.
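The discounting point can be made concrete with a toy expected-utility calculation. This is only a sketch; every number in it (success probability, payoffs, discount rate) is an invented assumption, not a claim about any actual agent:

```python
def discounted_utility(payoffs, discount=0.9):
    """Sum of per-period payoffs weighted by an exponential time discount."""
    return sum(p * discount ** t for t, p in enumerate(payoffs))

# Option A: fight for control of everything. Success is uncertain, the
# conflict consumes resources, and the payoff arrives only at the end.
p_success = 0.2
fight = p_success * discounted_utility([0, 0, 0, 0, 50])

# Option B: cooperate. A modest payoff every period, starting immediately.
cooperate = discounted_utility([5, 5, 5, 5, 5])

print(f"fight:     {fight:.2f}")      # ~6.56
print(f"cooperate: {cooperate:.2f}")  # ~20.48
```

With a high enough success probability, or with no discounting at all, the comparison flips, which is exactly the point: whether conquest maximizes utility depends on the circumstances.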

By your logic, it is necessary for SIAI to try to take over the world. Is that true? The US probably has enough military strength to take over the world - is it purely stupidity that it doesn't?

The modern world is more peaceful, more enjoyable, and richer because we've learned that utility is better maximized by cooperation than by everyone trying to rule the world. Why does this lesson not apply to AIs?

Comment author: Vladimir_Nesov 29 April 2010 05:38:37PM 4 points

Just what do you think "controlling the universe" means? My cat controls the universe. It probably doesn't exert this control in a way anywhere near optimal to most sensible preferences, but it does have an impact on everything. How do we decide that a superintelligence "controls the universe", while my cat "doesn't"? The only difference is in what kind of universe we have, which preference it is optimized for. Whatever you truly want roughly means preferring some states of the universe to other states, and making the universe better for you means controlling it towards your preference. The better the universe, the more specifically its state is specified, the stronger the control. These concepts are just different aspects of the same phenomenon.

Comment author: MugaSofer 30 December 2012 09:15:35PM 1 point

Trying to take over the universe isn't utility-maximizing under many circumstances: if you have a small chance of succeeding, or if the battle to do so will destroy most of the resources

Obviously, if you can't take over the world, then trying is stupid. If you can (for example, if you're the first SAI to go foom) then it's a different story.

or if you discount the future at all (remember, computation speed increases as speed of light stays constant), or if your values require other independent agents.

Taking over the world does not require you to destroy all other life if that is contrary to your utility function. I'm not sure what you mean regarding future-discounting; if reorganizing the whole damn universe isn't worth it, then I doubt anything else will be in any case.

Comment author: JoshuaZ 29 April 2010 01:29:21AM * 1 point

It should apply to AIs if you think that there will be multiple AIs at roughly the same capability level. A common assumption here is that as soon as there is a single general AI it will quickly improve to the point where it is so far beyond everything else in capability that their capabilities won't matter. Frankly, I find this assumption to be highly questionable and very optimistic about potential fooming rates, among other problems, but if one accepts the idea it makes some sense. The analogy might be to a hypothetical situation where the US not only had the strongest military but also had monopolies on cheap fusion power, an immortality pill, and a bunch of superheroes on its side. The distinction between the US controlling everything and the US having direct military control might quickly become irrelevant.

Edit: Thinking about the rate of fooming issue. I'd be really interested if a fast-foom proponent would be willing to put together a top-level post outlining why fooming will happen so quickly.

Comment author: PhilGoetz 29 April 2010 04:17:12AM 1 point

Eliezer and Robin had a lengthy debate on this perhaps a year ago. I don't remember if it's on OB or LW. Robin believes in no foom, using economic arguments.

The people who design the first AI could build a large number of AIs in different locations and turn them on at the same time. This plan would have a high probability of leading to disaster; but so do all the other plans that I've heard.

Comment author: Vladimir_Nesov 29 April 2010 05:41:09PM 3 points

Comment author: JoshuaZ 02 May 2010 06:43:18AM 0 points

Reading now. Looks very interesting.

Comment author: CronoDAS 29 April 2010 01:40:04AM 1 point

By your logic, it is necessary for SIAI to try to take over the world. Is that true? The US probably has enough military strength to take over the world - is it purely stupidity that it doesn't?

For one, the U.S. doesn't have the military strength. Russia still has enough nuclear warheads and ICBMs to prevent that. (And we suck at being occupying forces.)

Comment author: PhilGoetz 29 April 2010 04:13:03AM -2 points

I think the situation of the US is similar to a hypothesized AI. Sure, Russia could kill a lot of Americans. But we would probably "win" in the end. By all the logic I've heard in this thread, and in others lately about paperclippers, the US should rationally do whatever it takes to be the last man standing.

Comment author: JoshuaZ 29 April 2010 04:16:05AM * 2 points

Well, also the US isn't a single entity that agrees on all its goals. Some of us for example place a high value on human life. And we vote. Even if the leadership of the United States wanted to wipe out the rest of the planet, there would be limits to how much they could do before others would step in.

Also, most forms of modern human morality strongly disfavor large scale wars simply to impose one's views. If our AI doesn't have that sort of belief then that's not an issue. And if we restrict ourselves to just the issue of other AIs, I'm not sure that, if I gave a smart AI my morals and preferences, it would necessarily see anything wrong with making sure that no other general smart AIs were created.

Comment author: mattnewport 29 April 2010 05:10:28AM * 5 points

Well, also the US isn't a single entity that agrees on all its goals.

I think it is quite plausible that an AI structured with a central unitary authority would be at a competitive disadvantage with an AI that granted some autonomy to sub systems. This at least raises the possibility of goal conflicts between different sub-modules of an efficient AI. There are many examples in nature and in human societies of a tension between efficiency and centralization. It is not clear that an AI could maintain a fully centralized and unified goal structure and out-compete less centralized designs.

An AI that wanted to control even a relatively small region of space like the Earth would still run into issues with the speed of light when it comes to projecting force through geographically dispersed physical presences. The turnaround time is such that decision making autonomy would have to be dispersed to local processing clusters in order to be effective. Hell, even today's high end processors run into issues with the time it takes a signal to get from one side of the die to the other. It is not obvious that the optimum efficiency balance between local decision making autonomy and a centralized unitary goal system will always favour a singleton type AI.

There is some evidence of evolutionary competition between different cell lines within a single organism. Human history is full of examples of the tension between centralized planning and less centrally coordinated but more efficient systems of delegated authority. We do not see a clear unidirectional trend towards more centralized control or towards larger conglomerations of purely co-operating units (whether they be cells, organisms, humans or genes) in nature or in human societies. It seems to me that the burden of proof is on those who would propose that a system with a unitary goal structure has an unbounded upper physical extent of influence where it can outcompete less unitary arrangements (or even that it can do so over volumes exceeding a few meters to a side).

There is a natural tendency for humans to think of themselves as having a unitary centralized consciousness with a unified goal system. It is pretty clear that this is not the case. It is also natural for programmers trained on single threaded von Neumann architectures or those with a mathematical bent to ignore the physical constraints of the speed of light when imagining what an AI might look like. If a human can't even catch a ball without delegating authority to a semi-autonomous sub-unit I don't see why we should be confident that non human intelligences subject to the same laws of physics should be immune to such problems.
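The latency point above is easy to quantify with back-of-the-envelope arithmetic. A rough sketch (the 3 GHz clock rate is an illustrative assumption; the distance is roughly half Earth's circumference):

```python
C = 299_792_458    # speed of light in vacuum, m/s
distance = 2.0e7   # ~20,000 km: roughly antipodal surface distance on Earth

round_trip_s = 2 * distance / C     # best-case query/response latency
cycles_idle = round_trip_s * 3e9    # clock ticks at an assumed 3 GHz

print(f"round trip: {round_trip_s * 1000:.0f} ms")  # ~133 ms
print(f"cycles idle at 3 GHz: {cycles_idle:.2e}")   # ~4e8
```

Hundreds of millions of clock cycles per round trip is the gap that would force any Earth-spanning intelligence to delegate decisions to local processing, whatever its goal structure.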

Comment author: JGWeissman 29 April 2010 05:26:20AM 0 points

This at least raises the possibility of goal conflicts between different sub-modules of an efficient AI.

A well designed AI should have an alignment of goals between sub modules that is not achieved in modern decentralized societies. A distributed AI would be like multiple TDT/UDT agents with mutual knowledge that they are maximizing the same utility function, not a bunch of middle managers engaging in empire building at the expense of the corporation they work for.

This is not even something that human AI designers have to figure out how to implement: the seed can be a single agent, and it will figure out the multiple sub agent architecture when it needs it over the course of self improvement.
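The alignment-of-sub-modules point can be sketched as a toy best-response game. This is not TDT/UDT and not any real architecture; the utility function and all numbers are invented for illustration:

```python
# Toy contrast: sub-agents scoring moves against one SHARED utility
# function, vs. "empire-building" sub-agents each grabbing for itself.

def shared_utility(a, b):
    # Actions complement each other, but total resource use past a
    # budget of 10 is wasted (a stand-in for internal competition).
    return a * b - max(0, a + b - 10) ** 2

actions = range(11)

# Aligned sub-agents: each best-responds to the other's last move,
# but both evaluate moves with the same shared utility.
a, b = 5, 5
for _ in range(10):
    a = max(actions, key=lambda x: shared_utility(x, b))
    b = max(actions, key=lambda y: shared_utility(a, y))

# Empire-building sub-agents: each simply maximizes its own grab.
a_greedy = b_greedy = max(actions)

print(shared_utility(a, b))                # 33
print(shared_utility(a_greedy, b_greedy))  # 0
```

The aligned pair settles near the budget and scores well; the pair of local maximizers overshoots the budget and destroys the surplus, which is the corporate-middle-manager failure mode the comment contrasts against.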

Comment author: mattnewport 29 April 2010 05:30:15AM 0 points

Even if this is possible (which I believe is still an open problem; if you think otherwise, I'm sure Eliezer would love to hear from you) you are assuming no competition. The question is not whether this AI can outcompete humans but whether it can outcompete other AIs that are less rigid.

Comment author: LucasSloan 29 April 2010 05:19:17AM 0 points

It is not obvious that the optimum efficiency balance between local decision making autonomy and a centralized unitary goal system will always favor a singleton type AI.

I agree that it would probably make a lot of sense for an AI that wished to control any large area of territory to create other AIs to manage local issues. However, AIs, unlike humans or evolution, can create other AIs which perfectly share their values and interests. There is no reason to assume that an AI would create another AI that it intends to delegate substantial power to, yet could get into values disagreements with.

Comment author: mattnewport 29 April 2010 05:27:29AM 0 points

However, AIs, unlike humans or evolution, can create other AIs which perfectly share their values and interests.

This is mere supposition. You are assuming the FAI problem is solvable. I think both evolutionary and economic arguments weigh against this belief. Even if this is possible in theory it may take far longer for a singleton AI to craft its faultlessly loyal minions than for a more... entrepreneurial... AI to churn out 'good enough' foot soldiers to wipe out the careful AI.

Comment author: LucasSloan 29 April 2010 05:29:36AM 1 point

This is mere supposition.

No. All an AI needs to do to create another AI which shares its values is to copy itself.