A Hypothetical Takeover Scenario Twitter Poll

[-]Viliam3y90

It still raises the super interesting question of why we don’t see massively more, among humans, of exactly all the things I would do if I was in the AI’s position. Essentially, a lot of the reason, in my model, that a lot of power plays don’t happen is due to a combination of lack of sufficient skill at execution, lack of ability to do coordination and cooperation and alignment at scale among humans using current tools, and the prospect of massive backfire and retaliation for trying. At core, one’s bandwidth is expensive and limited, and trying to rock the boat too much is dangerous. And also there isn’t much payoff for doing such things, on an individual level, for almost all people, and also they don’t much want to do it and wouldn’t enjoy.

If you put a super interesting question in an article mostly about something else, you risk that the readers will ignore the rest of the article, and focus on the super interesting part! :D

I think the greatest filter for human success is a lack of competence and a lack of desire. (These are related: If you lack the skills, you won't even try, because it is unrealistic. If you don't really want to, you won't bother obtaining the skills.) The relatively simple alternative is to do what most people do.

Then you are limited by having only one body and only 24 hours a day. A lot of that time goes to all kinds of maintenance (you need to sleep, exercise, eat, cook, take care of your finances, stay in contact with people...). If you are very effective, you can still find some time for your project, but it is easy to spend all free time on the maintenance alone, especially if we include emotional maintenance (you also want to relax, have fun...). You randomly get sick, and accidents happen that require your time and attention.

Then there are all kinds of temptations. As a human, you probably want many different things. As you gain resources, more of the desirable things become accessible. Your choice is to either start spending now, or keep accumulating towards ever greater goals. (Would you rather have one marshmallow in your 20s, or hundred marshmallows in your 50s? Note that if your model is wrong, or something unexpected happens, it will be zero marshmallows instead.) Zero-sum competitions can consume unlimited amounts of resources. Sometimes you cannot avoid it; if you want scarce resources, you need to bid for them.

Then you get to the level where you no longer compete against relatively passive environment, but you have active adversaries. This may happen much sooner than you realize. Things that seem like trivial stepping stones to you, may matter a lot to someone else. Your success may activate someone's status-regulating instinct. Your plans suddenly start to fail not because you made a technical mistake, but because someone actively interfered with them (you may never find out who and how). Someone finds a tweet you wrote 20 years ago and ruins your career. You might even get literally killed (the probability depends on what exactly you are competing at, but crazy people can happen anywhere).

...and still, some people overcome all this adversity and become billionaires, CEOs, crime lords, religious leaders, presidents, dictators. Perhaps their proportion in the population matches the difficulty of the task.

To overcome the limitations of your human body, you need other people's help. You can find allies, or you can pay people for their services. But cooperation is difficult. It is not enough to find trustworthy people, they also need to be interested in the same kind of project you are, and be competent at it. If your goal is power, people who desire power are probably especially likely to stab you in the back. The winning strategy is probably to be the one who stabs others in the back first. But not too soon, because by then you haven't accumulated enough resources to be worth fighting over. You need a certain kind of charisma, so that people trust you, as you lead them towards the project that will accomplish your dreams, and... provide an interesting experience for them.

If you pay people for doing things, you still need some kind of minimal competence, otherwise many will be happy to take your money and do a shitty job in return.

(If you try to secure cooperation by making some solemn vow like "we are this together, forever, and if someone betrays the group, we will literally kill them", guess what... someone will try to betray you anyway, then you kill them, then the police figures it out and you spend the rest of your life in prison. Or you avoid the police successfully, but someone starts blackmailing you, or your partners try to get you involved in more crime: "now that we know that we are willing and able to kill in order to secure our success, how about murdering X, Y, and Z, who stand in our way?")

With the hypothetical superhuman AI we can assume that it would have more talents; work harder; work faster e.g. by building more instances of itself; wouldn't have coordination problems with its instances; the instances would be willing to die for the whole. That is world domination on easy mode.

[-]Zvi3y20

If a lot of readers do that? Seems fine with me! Hell, if enough others find it sufficiently interesting I'll happily make that its own post.

[-]johnny_lw3y10

Please do! I've been thinking a lot the past few weeks about how to build a mechanism for coordinated action; it would be great to hear your take on it.

[-]Dagon3y83

the super interesting question of why we don’t see massively more, among humans, of exactly all the things I would do if I was in the AI’s position.

This keeps me up at night. It's ridiculous just how fragile civilization is, and surprising just how little destruction-of-institutional-value in pursuit of individual or group power actually happens. One can make the argument that group cohesion technology has reached the point that some collections of humans are actually ASIs - more powerful and less comprehensible than any single member.

My ASI nightmare is that it just does what corporate-fascist conspiracy theorists think billionaires already do: increase that fragility in order to control more resources, to the detriment of human flourishing. It may eventually lead to actual population collapse or eradication, but it could also be 10,000 years of dystopian serfhood, as the AI (or AIs, depending on how identity works for that kind of agent) explore and take over the universe using their conscious meat-robots for some kinds of general-purpose manipulation tasks.

As self-replicating, self-repairing (to a point), complex-action-capable physical actuators, humans are far cheaper, more capable, more flexible, and more reliable (in some ways) than any mechanical devices in current or visible-future manufacturing technology. Nanotech may change that, but who knows when that will become feasible.

[-]faul_sname3y94

But also I think that if your model doesn't explain why we don't see massively more of that sort of stuff coming from humans, that means your model has a giant gaping hole in the middle of it, and any conclusions you draw from that model should keep in mind that the model has a giant gaping hole in it.

(My model of the world has this giant gaping hole too. I would really love it if someone would explain what's going on there, because as far as I can tell from my own observations, the vulnerable world hypothesis is just obviously true, but also I observe very different stuff than I would expect to observe given the things which convince me that the vulnerable world hypothesis is true).

[-]ProfessorPublius3y10

I consider "6. Aims to kill us all" to be an unnecessary assumption, one that overlooks many existential risks.

People were only somewhat smarter than North America's indigenous horses and other large animals, most of which were wiped out (with our help) long ago. However, eliminating horses and megafauna probably wasn't a conscious aim. Those were most likely inadvertent oopsies, similar to wiping out the passenger pigeon (complete with shooting and eating the last survivor found in the wild) and our other missteps. I can only barely imagine ASI objectives where all the atoms in the universe are required, where human extinction is thus central to the goal. The more plausible worry, to me, is ASI's indifference, where eventually ours would be the anthill that gets stepped on simply because the shortest path includes that step. Same outcome, but a different mechanism.

It's probably important to consider all ASI objectives that may lead to our obliteration, not just malice. Limiting ASI designs to those that are not malignant is insufficient for human survival, and only focusing on malice may lead people to underemphasize the magnitude of the overall ASI risk. In addition to the risk you are talking about more generally, that our future with AI will eventually be outside our control, a second factor is that Ernst Stavro Blofeld exists and would certainly use any earlier AGI to, as he would put it to ChatGPT-11.2, help him write the villain's plan for his next 007 novel/movie: "I'm Ian Fleming's literary heir, his great-grandniece - and I'm writing my next ..."

On the positive side, kids have been known to take care of an ant farm without applying a magnifying glass to heat it in the sun. Perhaps our trivial but to us purposeful motions will be fun for ASI to watch and will continue to entertain.

[-]Zvi3y20

The reason I included that was so I didn't have to get into arguments about it or have people harp on it, not because I thought you actually needed it. The whole idea is to isolate different objections.

[-]TimK3y10

Perhaps our trivial but to us purposeful motions will be fun for ASI to watch and will continue to entertain.

This is something I see bandied about at different levels of seriousness, sometimes even as full defense of AI x-risk. But why would an AI experience entertainment? That type of experience in humans is caused by a feedback loop between the brain and the body. Without physical biological bodies to interrupt or interfere with a programmatically defined reward function in that way, the reward function maintains at state indefinitely.

[-]ProfessorPublius3y10

"But why would an AI experience entertainment?"

I think it's reasonable to assume that AI would build one logical conclusion on another with exceptional rapidity, relative to slower thinkers. Eventually, and probably soon because of its speed, I expect that AI would hit a dead end where it simply doesn't have the facts to add to its already complete analysis of the information it started with plus the information it has collected. In that situation, many people would want entertainment, so we can speculate that maybe AI would want entertainment too. Generalizing from one example is not anywhere near conclusive, but it provides a plausible scenario.