Having all known life on Earth concentrated on a single planet is an existential risk.  So we should try to spread out, right?  As soon as possible?

Yet, if we had advanced civilizations on two planets, that would be two places for unfriendly AI to originate.  If, as many people here believe, a single failed trial ruins the universe, you want to have as few places trying it as possible.  So you don't want any space colonization until after AI is developed.

If you apply that logic to countries, you would want as few industrialized nations as possible until AAI (After AI).  So instead of trying to help Africa, India, China, and the Middle East develop, you should be trying to suppress them.  In fact, if you really believed the calculations I commonly see used in these circles about the probability of unfriendly AI and its consequences, you should be trying to exterminate human life outside of your developed country of choice.  Failing to do so would be immoral.

And if you apply it within the USA, you need to pick one of MIT, Stanford, and Carnegie Mellon, and burn the other two to the ground.

Of course, doing this will slow the development of AI.  But that's a good thing, if UFAI is most likely and has zero utility.

In fact, if slowing development is good, probably the best thing of all is just to destroy civilization and stop development completely.

Do you agree with any of this?  Is there a point where you think it goes too far?  If so, say where it goes too far and explain why.

I see two main flaws in the reasoning.

  • Categorization of outcomes as "FAI vs UFAI", with no other possible outcomes recognized, no gradations within either category, and zero utility assigned to UFAI.
  • Failing to consider scenarios in which multiple AIs can provide a balance of power.  The purpose of this balance of power may not be to keep humans in charge; it may be to put the AIs in an AI society in which human values will be worthwhile.
  • ADDED, after being reminded of this by Vladimir Nesov:  Re. the final point, stopping completely guarantees Earth life will eventually be eliminated; see his comment below for elaboration.

ADDED:  A number of the comments so far imply that the first AI built will necessarily FOOM immediately.  FOOM is an appealing argument.  I've argued in favor of it myself.  But it is not a theorem.  I don't care who you are; you do not know enough about AI and its future development to bet the future of the universe on your intuition that non-FOOMing AI is impossible.  You may even think FOOM is the default case; that does not make it the only case to consider.  In this case, even a 1% chance of a non-FOOM AI, multiplied by astronomical differences in utility, could justify terrible present disutility.
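To make the arithmetic behind that last sentence explicit, here is a minimal sketch; the numbers are illustrative assumptions, not estimates:

```python
# Illustrative expected-value arithmetic for the claim above.
# All numbers are assumptions chosen only to show the shape of the argument.

p_non_foom = 0.01            # a mere 1% chance that the first AI does not FOOM
utility_gap = 1e20           # assumed "astronomical" difference in utility at stake
present_disutility = 1e9     # assumed cost of the "terrible" present-day action

expected_gain = p_non_foom * utility_gap   # 1e18
print(expected_gain > present_disutility)  # True: the small probability still dominates
```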

50 comments

The number of distinct locations where humans are active isn't what impacts the chance of uFAI arising, but rather the number of people who are programming things which could potentially do so. How the people are spread out isn't very relevant if fooming (regardless of the exact definition of foom) is a serious worry.

Different municipalities have different regulatory regimes and different attitudes. If the US develops a cautious approach to AI, and China has a "build it before the damn Yankees do" approach, that's significant.

Saying that some locations are better than others is not an argument for reducing the number of locations unless you have reason to believe the current location is better than the average would be, and I see no reason to believe that.

Suppose I were the political head of an at least somewhat industrialized nation with a credible military force. If my intelligence agencies suggested to me that there was a nascent movement in some stronger foreign country to forcibly de-industrialize my proud native land, I would do anything possible to prevent it. I'd draft my country's top scientists to work on anything that would have military value. I'd make sure they had an enormous budget to build the most advanced supercomputers the world has yet seen, if they asked for it. And I wouldn't take any back-talk from them about delaying the development of our country's defensive systems to conform to some need for "friendliness." I didn't get to be the head of this great nation by pussyfooting around.

If my intelligence agencies suggested to me that there was a nascent movement in some stronger foreign country to forcibly de-industrialize my proud native land, I would do anything possible to prevent it.

So what would you do if your intelligence told you that there is a group of people who want to launch a fooming AI and take over the universe? That is similar to what the SIAI is planning to do (as interpreted by politicians). There seems to be no way around large-scale politico-military struggle.

[Speaking in character] I'm held responsible for building this nation's trade, for protecting it from those who would be our enemies from abroad, all while deftly managing potential threats from my own political rivals at home. I am very busy. My staff knows better than to bother me with fevered dreams of a few science fiction devotees.

It's different when some crazy group or ideology starts exercising political power. If some foreign cult starts lobbying its government to reduce my people to serfs, or trying to take power itself, I will pay attention. That is a threat I can recognize, having seen it repeated in history again and again.

Obviously, I don't want open war with a great power if it can be avoided. So I would instruct my diplomats and the public relations staff (who used to be called the propaganda ministry a long time ago) to initiate a subtle campaign to portray this strange science fiction group as a dangerous cult, mad for power. Of course they want to have a monopoly over computing power. I would embarrass their host government, suggesting that we smaller countries cannot help but see the growing prominence of this group as a mounting threat of imperialist conquest. I would want assurances that, even if these people are allowed to remain free and spout their venom, none of them will be regarded as respectable or ever hold a position of authority.

If I were persuaded there really is "no way around large-scale politico-military struggle," I would consider this thing they call "cyber-warfare" as a form of asymmetric resistance, in addition to training my conventional forces to repel a larger invader. I hope we would win. But if our defeat looks imminent, I would make preparations to see that my nation's technical infrastructure could survive underground in the event of a foreign occupation. We do not relish the idea of being reduced to nothing more than agricultural peasants. If it's our computers they want, then we will guard our computers all the more.


I agree that there is likely no risk if nobody takes the SIAI seriously. But if at some point a powerful entity does take it serious, especially the country where most of the SIAI members reside, then the best case scenario might be that it will be put under government control. I just don't see that a government that does take fooming AI serious would trust the SIAI to implement friendliness, or would allow them to do it even if they believed so.

I agree that there is likely no risk if nobody takes the SIAI serious. But if at some point a powerful entity does take it serious

Which is why it is smart for the figurehead to spend his time writing anime and Harry Potter fanfics and making them public. That stops most people taking him seriously (taking uncool people seriously feels like a political mistake) while for rationalist nerd types it is approximately neutral.

It's 'seriously', not 'serious'.

Thanks, I know you told me before.

So what would you do if your intelligence told you that there is a group of people who want to launch a fooming AI and take over the universe?

If it was me, I would probably ask whether it was a bunch of young folks with few resources, not much programming experience and a fondness for anime.

In fact, if slowing development is good, probably the best thing of all is just to destroy civilization and stop development completely.

Possibly a good idea (when you put this as a Trolley problem, with the whole of future potential on the other side), but too difficult to implement in a way that gives advantage to future development of FAI (otherwise you just increase existential risk if civilization never recovers, or replay the same race as we face now).

Also, depending on temporal discounting, even a perfect plan that trades current humanity for future FAI with certainty could be incorrect, so we'd prefer to keep present humanity and reject the future FAI. If there's no discounting, then FAI is the better choice, but we don't really know.

Upvoted and I mostly agree, but there's one point I don't get. I thought temporal discounting was considered a bias. Is it not necessarily one?

The single fact that I value a candy today slightly more than I value a candy tomorrow doesn't make my utility function inconsistent (AFAIK), so it's not a bias.

In practice, temporal discounting usually arises "naturally" in any case, because we tend to be less sure of events further in the future and so their expected utility is lower.
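A minimal sketch of that point, with an assumed per-year survival probability standing in for all the ways a promised future reward can fail to arrive:

```python
# How uncertainty alone produces effective exponential discounting.
# survival_per_year is an assumed illustrative figure.

survival_per_year = 0.99   # chance, each year, that the promised candy still arrives
candy_utility = 1.0

def expected_value(years_ahead):
    """Expected utility of a candy promised `years_ahead` years from now."""
    return (survival_per_year ** years_ahead) * candy_utility

for t in (0, 1, 10, 50):
    print(t, round(expected_value(t), 3))   # 1.0, 0.99, ~0.904, ~0.605
```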

Possibly a good idea (when you put this as a Trolley problem, with the whole of future potential on the other side), but too difficult to implement in a way that gives advantage to future development of FAI (otherwise you just increase existential risk if civilization never recovers, or replay the same race as we face now).

Very good answer.

Also, depending on temporal discounting, even a perfect plan that trades current humanity for future FAI with certainty could be incorrect, so we'd prefer to keep present humanity and reject the future FAI.

Also a good point.

ADDED: A number of the comments so far imply that the first AI built will necessarily FOOM immediately. FOOM is an appealing argument. I've argued in favor of it myself. But it is not a probability one theorem. I don't care who you are; you do not know enough about AI and its future development to bet the future of the universe on your intuition that non-FOOMing AI is impossible. You may even think FOOM is the default case; that does not make it the only case to consider.

Supposing that the first AI built doesn't FOOM, I still see no reason to suppose that adding colonies increases the overall danger. At most, it increases the population, so you have less time, but the same number of person-hours, before some strong AI is created.

What we want to do is increase the percentage of AI efforts that are aware and cautious of the dangers of uFAI relative to all AI efforts that have a chance of succeeding (probably weighted by their chance of succeeding).

Burning everything to the ground doesn't help; we get 0/0 on Earth, and haven't changed the numbers in the rest of the universe.

Suppressing technological developments in other countries might help, but it is probably less efficient than just targeting AI efforts and policy makers to increase that percentage. If the SIAI has a good chance of being the first to succeed at AI (I don't think it does), then stomping out other efforts might be worthwhile, but since it's an underfunded underdog, focusing on education and awareness seems to be a better use of resources. I believe this is what they're doing, though.

Ultimately, this seems to me to be the only way of significantly changing the percentage in our Hubble volume, since as a civilization we're competing against all the others.

In addition to the argument already raised, that it doesn't matter how many colonies are working on it, only how many people, I think it's worth pointing out that if a friendly AI was created on any colony, it should also prevent an unfriendly one from developing on other colonies. It's not clear that there's any reason to suppose it tips the likelihood of future strong AI being friendly in either direction.

If we suppose that the time to FOOM is a function of how many people are working on it, and that colonizing other planets would increase the total population, then we should probably anticipate the same number of person-years to FOOM either way.

Do you agree with any of this?

I disagree with all of it, up to, but not including, the part where you destroy civilization.

Analogy: you are watching the annual Easter egg hunt when terrorist mastermind Ibn al Omega tells you that if the first child to find an egg finds a speckled egg then he will blow up the city. It makes no sense for you to kill only some of the children - you need to kill all of them.

Personally, I think that the weakness in my analogy is the assumption that kids cannot be trusted to ignore the speckled eggs, even if you explain the danger to them. And, of course, the assumption that there is no way to just overpower Ibn al Omega.

The SIAI solution, as I understand it, is to place a beautiful solid blue egg in plain sight right near the starting line. That is a good idea too.

I don't see why you disagree. The OP shows what happens if you take ideas too seriously. You can justify any atrocities with the right variables in your probability and utility calculations. If you had to kill most humans now to have 1000 years to come up with the solution to FAI and by that act gain 10^100 years of paradise, then I think some people here would follow through on it.

I disagreed with the idea of disrupting randomly chosen research efforts, because I don't see how this improves the chances that the first AI will be friendly, and only pushes back the date of the singularity by a few years. I somewhat facetiously exempted the idea of causing a collapse of civilization from my disapproval, because I see the Friendliness problem as mathematical/philosophical whereas the AGI problem probably also has a technological component. So I imagine that a collapse would set back AGI research farther than Friendliness research.

I agree with your horror at the potential atrocities that can be justified once people start worrying about optimizing undiscounted distant future utilities. A failure to discount is, so far as I can tell, the source of almost all of the insanity in this community's take on the Singularity. Nonetheless, as I say in my last paragraph, working to make sure that the first superhuman AGI is friendly is a pretty damned good idea.

A failure to discount is, so far as I can tell, the source of almost all of the insanity in this community's take on the Singularity.

Probably the main idea in this area is the proposal that, within a few decades - not long after we get powerful machine intelligence - something really, really bad might happen, and that we can influence whether it will or not.

I might differ considerably on the p()s and details - but I think that proposition is a reasonable one.

Since the event in question will probably be within the lifetimes of many here, I think it is close enough for many people's temporal discounting to leave it partially intact.

Agree / disagree?

Agree / disagree?

Don't really understand the question. Our expectations about what happens are not affected by whether we discount or not. The probability I assign to the event "FOOMing AI within 40 years" is the same regardless of what discount rate I use. Same goes for the probability I assign to the proposition that "The first superhuman AI will tile the universe with paperclips." Or the proposition that "The first superhuman AI will tile the universe with happy humans."

What discounting or not discounting changes is how I feel about those possibilities. If I use a low discount rate, the future means a lot to me, and I should most rationally sell all I own and contribute it to the SIAI, pretty much however little I think of SIAI effectiveness. If I use a higher discount rate, then it is possible that I care more about what happens in the next 40 years than I do about anything that happens after 2050. I don't see uFAI all that far out as such a horrible disaster. And I don't see FAI as incredibly awesome either, if it doesn't appear quickly. I would be much more impressed to see a cure for malaria next year.

Our expectations about what happens are not affected by whether we discount or not.

Of course. What I was trying to get at was whether a few decades is too far away for you, or whether those ideas are not what you mean - and you are talking about some other "insanity" to do with events further out in the future.

A few decades is not that far out - for many people.

You are still missing my point. The insanity has to do with utilities farther out in the future, not events farther out in the future. 'Insane' people and AGIs care about those utilities, care a lot. Me, and most other people, not so much.

Most people will worry about the happiness of their grandchildren, few really care about their great^n grandchildren when n rises to double-digits. And even if they do care about future generations on a par with the next one, they probably normalize for population size so that they don't think that future generations collectively are more important than current ones.

You are still missing my point. The insanity has to do with utilities farther out in the future, not events farther out in the future. 'Insane' people and AGIs care about those utilities, care a lot. Me, and most other people, not so much.

The utilities you would calculate are not big enough for "normal" people to worry about? This is the end of the human race in a few decades we are talking about - right?

If you ignore the possibility of "insane" astronomical waste, that would still be a matter of some concern to many - no?

Really, those seem like the main flaws?

I suspect I assign much more utility to FAI than you do. A slightly increased chance of success there, from having more intelligent people thinking about related questions, can make a large contribution to the expected value from my perspective. Now of course uFAI has a strongly negative utility, but we have to set that against the chance of humanity dying without any form of AI. The latter seems close to 1 in the long term (while post-FAI humanity has a greater chance of finding a way around entropy). Even on more human timescales, it seems disturbingly large. This, combined with the downsides of all your suggestions, goes a long way towards removing the paradox.

I also assume the SIAI and people who agree with their take have a real chance of persuading the people who'd otherwise have the greatest chance of making an uFAI. You may disagree with this, of course, but if we can't at least persuade a large fraction of the public then most of your suggestions seem pointless. Now if we persuaded the majority, but some group kept working on a new and dangerous-looking project, we'd have to consider what action to take. But violence still wouldn't be my first or second choice.

Upvoted for making me think.

I agree with JoshuaZ's post in that the probability of UFAI creation will increase with the number of people trying to create AGI without concern for Friendliness, and that this is a much better measure of the risk than the number of locations at which such research takes place.

The world would probably stand a better chance without certain AGI projects, but I don't think that effort put into dismantling them is nearly as efficient as putting effort towards FAI (considering that a FOOM-ing FAI will probably be able to stop future UFAI), especially considering current laws etc. By the way, I don't see why you're talking about eliminating countries and such. People that are not working on AGI have a very low likelihood of creating UFAI, so I think you would just want to target the projects.

You seem to be using zero utility like I would use 'infinite negative utility.' To me, zero utility means that I don't care in the slightest whether something happens or not. With that said, I don't assign infinite negative utility to anything (primarily because it causes my brain to bug out), so the probability of something happening still has a significant effect on the expected utility.

By the way, I don't see why you're talking about eliminating countries and such. People that are not working on AGI have a very low likelihood of creating UFAI, so I think you would just want to target the projects.

Would you say China has a less than 10^-20 probability of developing UFAI? Or would you assign the utility of the entire future of the roughly 10^23 stars in the universe for the next 10^10 years to be less than 10^20 times the utility of life in China today? You must pick one (modulo time discounting), if you're working within the generic LW existential-risk long-future big-universe scenario.
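Spelling out the comparison being posed here (a sketch of the post's reductio, using the quoted numbers as assumptions):

```python
# The naive long-future calculus behind the question above.
# Units: utility of life in China today = 1. Figures are the ones quoted above.

u_china_today = 1.0
u_cosmic_future = 1e20 * u_china_today   # assumed value of the long cosmic future
p_china_ufai = 1e-20                     # the probability threshold being asked about

expected_loss_from_risk = p_china_ufai * u_cosmic_future   # = 1.0, the break-even point
# Any probability above 1e-20 and this calculus says the risk outweighs
# the utility of life in China today. That is the dilemma being posed.
print(expected_loss_from_risk >= u_china_today)   # True
```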

My point was that there would be no need to kill, say, the guy working in a textile factory. I know that probabilities of zero and one are not allowed, but I feel that I can safely round the chance that he will be directly involved in creating a UFAI to zero. I assume you agree that (negative utility produced by killing all people not working on FAI)>(negative utility produced by killing all people pursuing AGI that are not paying attention to Friendliness), so I think that you would want to take the latter option.

I did not claim that if I had the ability to eliminate all non-Friendly AGI projects I would not do so. (To remove the negatives, I believe that I would do so, subject to a large amount of further deliberation.)

I feel that I can safely round the chance that he will be directly involved in creating a UFAI to zero.

I would explain why I disagree with this, but my ultimate goal is not to motivate people to nuke China. My goal is more nearly opposite - to get people to realize that the usual LW approach has cast the problem in terms that logically justify killing most people. Once people realize that, they'll be more open to alternative ways of looking at the problem.

I don't know whether what I am saying concurs with the 'usual LW approach,' but I would very quickly move past the option of killing most people.

If we currently present ourselves with two options (letting dangerous UFAI projects progress and killing lots of people), then we should not grimace and take whichever choice we deem slightly more palatable -- we should instead seek a third alternative.

In my eyes, this is what I have done -- shutting down AGI projects would not necessitate the killing of large numbers of people, and perhaps a third alternative could be found to killing even one. To maintain that the premise "rapidly self-improving UFAI will almost certainly kill us all, if created" leads to killing most people, you must explain why, indeed, killing most people would reduce the existential risk presented by UFAI significantly more than would completely shutting down UFAI projects.

Edit: For clarification purposes, I do not believe that shutting down UFAI projects is the best use of my time. The above discussion refers to a situation in which people are much closer to creating UFAI than FAI and will continue to be, given the expected rate of progress.


Do you agree with any of this? Is there a point where you think it goes too far? If so, say where it goes too far and explain why.

  • Ethics. You can't take for granted that the actions in question will have the effects you claim, and that neither they nor the disposition to take them have comparable costs.

Less importantly:

  • Simple opportunity cost. Most of those don't sound like very efficient ways to buy x-risk reduction. (I realize the ways in which this objection is contingent on limited resources etc.)
  • Destroying civilization would entirely defeat the point, for most plausible value systems.

Ethics is a virtue that is kept in service of utility optimization, and should bow before an explicit argument of sufficient quality.

(I posted the parent, and deleted it as not satisfactorily saying what I wanted to, not realizing that there was a reply. Mea culpa.)

Interesting thing is that replies to deleted comments are always by Vladimir Nesov.

How on earth would colonising other planets increase the chance of UFAI? It wouldn't increase the number of people, and therefore wouldn't increase the number of people capable of creating AI, or the number of AI projects.

In fact, I don't see why having more than one AI team makes things worse. If we have a foom then the first team to finish makes all the others irrelevant, and if we don't have a foom then AIs aren't nearly as much of a danger so having more teams is probably better.

As for destroying human civilisation, you might as well argue that anyone who doesn't want to be run over by a car should commit suicide right now, since that's the best way to do so. Assigning zero utility to UFAI does not mean we focus on nothing but avoiding it; in fact, assigning zero utility to UFAI doesn't mean anything at all, since utility functions are unchanged by affine transformations.

As for balance of power, even if it is possible (or a remotely good thing) it would require at least two AIs to foom at pretty much exactly the same time, which seems frankly unlikely.

How on earth would colonising other planets increase the chance of UFAI? It wouldn't increase the number of people, and therefore wouldn't increase the number of people capable of creating AI, or the number of AI projects.

At the present time, colonising other planets probably would not increase the chance of UFAI, because we will probably develop AI before colonized planets would develop to the point of competing with Earth in R&D.

Assigning zero utility to UFAI does not mean we focus on nothing but avoiding it; in fact, assigning zero utility to UFAI doesn't mean anything at all, since utility functions are unchanged by affine transformations.

It means that all "UFAI" outcomes are considered equivalent. Instead of asking what the real utility function is, you just make these two categories, FAI and UFAI, and say "UFAI bad", and thus don't have to answer detailed questions about what specifically is bad, and how bad it is.

Also, your statement is not correct. When someone says "a utility function is unchanged by affine transformations", what they mean is that the outcome of a decision process using that utility function will be the same. And that is not true if we define zero utility as "that utility level at which I am indifferent to life." An outcome leading to utility epsilon for eternity has infinite utility. An outcome leading to utility minus epsilon means it is better to destroy ourselves, or the universe.
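A minimal sketch of why the zero point matters under that definition; the epsilon rate and the horizons are arbitrary illustrative values:

```python
# If utility accrues per unit time with no discounting, the cumulative total
# grows without bound as the horizon grows, so the sign of the per-instant
# rate (relative to the "indifferent to life" zero point) decides everything.

epsilon = 0.001   # an assumed tiny per-year utility rate

for horizon_years in (10**3, 10**6, 10**9):
    print(horizon_years,
          epsilon * horizon_years,    # +epsilon forever -> arbitrarily large total
          -epsilon * horizon_years)   # -epsilon forever -> arbitrarily bad total
```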

As for balance of power, even if it is possible (or a remotely good thing) it would require at least two AIs to foom at pretty much exactly the same time, which seems frankly unlikely.

If you assume FOOM is the only possible outcome. See the long debate between Eliezer & Robin.

At the present time, colonising other planets probably would not increase the chance of UFAI, because we will probably develop AI before colonized planets would develop to the point of competing with Earth in R&D.

Another good reason why the number of colonized planets is irrelevant. If you agree with me about that, then why did you mention it?

It means that all "UFAI" outcomes are considered equivalent. Instead of asking what the real utility function is, you just make these two categories, FAI and UFAI, and say "UFAI bad", and thus don't have to answer detailed questions about what specifically is bad, and how bad it is.

Assigning one utility value to all UFAI outcomes is obviously stupid, which is why I don't think anyone does it (please stop strawmanning). What some people (including myself) do assume is that at their state of knowledge they have no way of telling which UFAI projects will work out better than others, so they give them all the same *expected* utility.

You claim that this is a mistake, and that it has led to the disturbing conclusions you reach. I cannot see how assuming more than one UFAI possibility has any effect on your argument, since any of the policies you suggest could still be 'justified' on the grounds of avoiding the worst kind of UFAI. There are plenty of mistakes in your whole argument, no need to assume the existence of another one.

Also, your statement is not correct. When someone says "a utility function is unchanged by affine transformations", what they mean is that the outcome of a decision process using that utility function will be the same.

I am aware that that is what it means. Since the only purpose of utility functions is to determine the outcomes of decision processes to say an outcome is assigned "zero utility" without giving any other points on the utility function is to make a meaningless statement.

And that is not true if we define zero utility as "that utility level at which I am indifferent to life." An outcome leading to utility epsilon for eternity has infinite utility. An outcome leading to utility minus epsilon means it is better to destroy ourselves, or the universe.

You and I seem to be using different domains for our utility functions. Whereas yours is computed over instants of time, mine is computed over outcomes. I may be biased here, but I think mine is better on the grounds of not leading to infinities (which tend to screw up expected utility calculations).

If you assume FOOM is the only possible outcome. See the long debate between Eliezer & Robin.

I have seen it, and I agree that it is an interesting question with no obvious answer. However, since UFAI is not really much of a danger unless FOOM is possible, your whole post is only really relevant to FOOM scenarios.

In fact, if slowing development is good, probably the best thing of all is just to destroy civilization and stop development completely.

This is a true statement. The rest is just Zeno's paradox. Did you know that within MIT alone, there are an infinite number of places? This means that the average AI-development chance per place is 0, so AI will never develop.

EDIT: apparently you weren't quite claiming that "places" are what matter. Still fairly silly though.

In fact, if slowing development is good, probably the best thing of all is just to destroy civilization and stop development completely.

No, UFAI destroying civilization is the thing that is being prevented. Also, the number of attempts at once doesn't change the odds of the first one being Friendly, if all the attempts are the same quality. If any given project is more likely than 50% (or perhaps just more likely than average) to produce FAI, it should be supported. Otherwise, it should be suppressed.

Also, the number of attempts at once doesn't change the odds of the first one being Friendly, if all the attempts are the same quality.

First, the odds of the first one being Friendly are not especially important unless you assume FOOM is the only possible case.

Second, the number of attempts does change the odds of the first one being Friendly, unless you believe that hurried projects are as likely to be Friendly as slow, cautious projects.

My intuition is that the really high expected utility of a positive FOOM and the really low expected utility of a bad one make Friendliness important even if it gets only a fairly low probability. But it's true that if all the AIs developed within, say, 5 years of the first one get a substantial influence, then the situation changes.

50%

Where did that come from?

If FAI is as much better than what we have now as UFAI is worse, then only projects that are more likely to produce FAI than UFAI should be encouraged. So it's 50%, conditional on the project succeeding. A project more likely to produce UFAI than FAI has negative expected payoff; a project more likely to produce FAI has positive expected payoff. If the damage from a failure is not as bad as the gain from success, then the cutoff value is lower than 50%. If the damage from failure is worse, then it's higher. Is that any more clear?

Your argument compares each project to the possibility 'no AI built'. I think it would be better to compare each of them to the possibility 'one of the other projects builds an AI', which means you should make your cut-off point the average project (or a weighted average with weights based on probability of being first).

That's a good point and makes my last two comments kind of moot. Is that why the grandparent was voted down?

The notion of "cutoff value" for a decision doesn't make sense. You maximize expected utility, no matter what the absolute value. Also, "what we have now" is not an option on the table, which is exactly the problem.

By "cutoff value," I mean the likelihood of a project's resulting in FAI that makes it utility-maximizing to support the project. If UFAI has -1000 utility, and FAI has 1000 utility, you should only support a project more likely to produce FAI than UFAI. If UFAI has -4000 utility and FAI only has 1000, then a project with a 51% chance of being friendly is a bad bet, and you should only support one with a > 80% chance of success.

No, UFAI destroying civilization is the thing that is being prevented.

No, UFAI destroying all life is the thing that is being prevented.

The post suggests that guaranteeing continued life (humans and other animals) with low tech may be better than keeping our high tech but risking total extinction.