Summary: Does computational irreducibility preclude superintelligence, or at least severely limit it? Can someone provide arguments (or links to arguments/research) on the subject?
Alternatively stated: what is the practical limit on how intelligent you can make something, and how powerful would that intelligence be in the real world?
Introduction
Hello. I'm a layman who's been casually following the AI scene for a few years. I read Zvi's AI posts on his blog "Don't Worry About the Vase".
A key part of the AGI threat model is "superintelligence": loosely, intelligence far surpassing that of even the smartest humans. Once you posit superintelligence, all sorts of dangers present themselves. Even a non-agentic oracle superintelligence could be used by a malicious person to develop a superweapon. Once you have agentic superintelligence, it's hard to see how humanity survives (even before you bring in the Orthogonality Thesis and Instrumental Convergence). Once you posit superintelligence, you have to align such systems or prevent them from being built.
I want to take a step back and question the existence or practical feasibility of superintelligence. Obviously, I'm far from the first to bring this up, but I haven't seen it covered from this particular angle, so I wanted to ask the community. For me, "skepticism about superintelligence", as one might call it, is the main thing that makes me not very concerned about xRisk from AI (i.e. a low p(Doom from AGI in next 50 years)). Here's the argument:
Main Argument
Intelligence is (roughly speaking) the ability to take in information and use it to make accurate predictions about the world. And beyond prediction, an intelligent agent should be able to use those predictions to effect changes in the world (to "navigate causal space"). A perfect intelligence would be the ultimate rationalist: it would "win" at everything [or at least do as well as possible given the physical limits of the world, its initial capabilities, and the actions of competing agents].
However, intelligence isn't magic. All thought is computation. When you make an inference or a prediction, you are executing an algorithm or function. The function inputs are all of the information you have about the world and the output is your prediction (probability distribution or best course of action). When an LLM or a human solves a Math Olympiad problem, they are executing a function. When you negotiate a salary increase with your boss, you are executing a computation.
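To make "inference as a function" concrete, here is a minimal toy sketch of my own (not taken from any source): the input is an observation about the world, the output is a probability, and the whole thing is just a small, cheap computation. The priors and likelihoods are made-up numbers for illustration only.

```python
# Toy illustration: a prediction is just a function from evidence to a probability.
# All numbers are invented priors/likelihoods, chosen only to show the shape of the computation.

def p_rain_given_sky(dark_clouds: bool) -> float:
    """Bayes' rule: P(rain | clouds) = P(clouds | rain) * P(rain) / P(clouds)."""
    p_rain = 0.20                 # assumed prior probability of rain in the next hour
    p_clouds_given_rain = 0.90    # assumed likelihood of dark clouds if it will rain
    p_clouds_given_dry = 0.15     # assumed likelihood of dark clouds if it won't rain

    p_clouds = p_clouds_given_rain * p_rain + p_clouds_given_dry * (1 - p_rain)
    if dark_clouds:
        return p_clouds_given_rain * p_rain / p_clouds
    return (1 - p_clouds_given_rain) * p_rain / (1 - p_clouds)

print(p_rain_given_sky(dark_clouds=True))   # ~0.60
print(p_rain_given_sky(dark_clouds=False))  # ~0.03
```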
There's a whole science of algorithms and computation. I don't know much about computer science, but I do know that many problems are not practically computable (even very simple-sounding ones). My question to the community is: is it computationally feasible to have intelligence great enough to be dangerous to humans?
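As one standard illustration of "simple-sounding but not practically computable" (my example, not one from the post's sources): exhaustively solving the travelling salesman problem. The number of possible tours grows factorially, so even a very fast machine (the speed below is an assumption) cannot check them all for modest inputs.

```python
import math

# Brute-force travelling salesman: check every possible tour of n cities.
# Distinct tours (fixed start, direction ignored) = (n - 1)! / 2.
EVALS_PER_SECOND = 1e9  # assume a machine that checks a billion tours per second

for n in (10, 20, 30):
    tours = math.factorial(n - 1) / 2
    seconds = tours / EVALS_PER_SECOND
    print(f"{n} cities: {tours:.2e} tours, ~{seconds:.2e} seconds of brute force")

# 10 cities: ~1.8e5 tours  -> a fraction of a millisecond
# 20 cities: ~6.1e16 tours -> about two years
# 30 cities: ~4.4e30 tours -> ~1.4e14 years, far longer than the age of the universe
```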
If you built a machine that could simulate reality with perfect fidelity, then you would have a superintelligence. It could predict anything in the real world by running the simulation. If it needed to act, it could simulate the consequences of every possible action and then pick the action that leads to the goal (actually, it would require simulating every path, at least up to a certain depth, on an infinitely branching tree, but let's assume our superintelligence has a lot of horsepower).
The problem with this implementation of superintelligence is that it's impossible to build such a machine. It's impossible to simulate reality at the quark level even for a handful of atoms, much less the entire universe, much less an infinitely branching tree of possible actions by an agent.
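A rough back-of-the-envelope version of the branching problem (the numbers are arbitrary assumptions, chosen only to show the growth rate): even with a modest branching factor and a shallow planning horizon, the number of action sequences to simulate explodes.

```python
# Size of an action tree with branching factor b and depth d: b ** d leaves.
# The branching factor of "all possible actions in the real world" is of course
# far larger than 10; these numbers are deliberately conservative.
for b, d in [(10, 10), (10, 20), (100, 20)]:
    print(f"b={b}, d={d}: {float(b) ** d:.1e} action sequences to simulate")

# b=10,  d=10: 1.0e+10
# b=10,  d=20: 1.0e+20  (even at one nanosecond per sequence, ~3,000 years)
# b=100, d=20: 1.0e+40
```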
Now obviously, this is a strawman. You can make many useful inferences without needing to simulate all of reality. I can predict with reasonable accuracy whether it will rain in the next hour just by looking outside at the sky. The question is: what kinds of inferences are practical and which aren't? Are the inferences that would make an AGI dangerous to humanity computationally feasible?
Here are some examples of inferences which are practically computable, meaning that algorithms exist that can run on existing (or near-future) substrates to give the answer (i.e. the prediction or the inference). How do I know that such algorithms exist? Because real humans, LLMs, or other computers can do them now.
In each case above, an algorithm is being executed to get the desired result (whether it be a simple traditional algorithm where the steps form a clear narrative, or a complicated algorithm involving millions of matrix multiplications where "why it works" is opaque).
Below are some inferences which I don't think are practically computable. I suggest that many of the inferences which would be required for an intelligence to be acutely dangerous to humanity fall in the impractical category.
What is different about this second (incomputable) list from the first (computable) one? My guess is that there doesn't exist any practically feasible algorithm which can compute the answers to the second list of questions given our existing knowledge of the world as input. There may be heuristics and guesses that could help, but nothing that could provide an answer with certainty.
Superintelligent vs Non-Superintelligent AGI
For the purposes of the argument I want to specify two different kinds of AGI:
I find the first type (non-superintelligent AGI) very plausible. If transformers (or AI progress in general) continue to scale as they have over the last several years, it seems reasonable that at some point you will have a single machine entity that is the equivalent of many copies of the best human experts in every field. [Note that even if it's plausible, there are many difficult engineering challenges yet to overcome from where we are now. It's much harder than passing data through a transformer and polishing the output. You would have to figure out how to coordinate all the parts of the system so that it acts as a unified whole in a goal-directed manner.]
Any takeover-minded AGI created in an environment similar to today's will be at an extreme disadvantage in a fight against humanity. If we had a billion humanoid robots (plus non-humanoid ones like drones and bulldozers) connected to billions of sensors, all wirelessly networked and remotely controllable, that would be a different situation. Instead, we have a paltry number of sensors and remotely controllable robots compared to what an AGI would need to actually conquer humanity. The AGI would start off in a server farm, essentially deaf and blind. It might even be sandboxed and blocked from the Internet. It might be taken out by a bad winter storm that knocks out the power. A few people have tried to craft scenarios where the AI can take over from this position, but it seems to me that takeover from such a disadvantage would require superintelligence at the impractical level described above. I specifically mean a "guaranteed takeover", where the AGI knows that it has a 99% chance of success because it is just that good at navigating reality.
If there existed any action pathway at all to get from "newborn AGI stuck in a server farm" to "master of the world", then there would be many subtasks: persuading people to help, making money via remote work, shell companies, blackmail, hacking into other machines, etc. One subtask might be: rent datacenter space using a fake LLC in country X. A subtask of that subtask might be: call the datacenter manager and ask for an exception to the requirement for an in-person signature on the rental agreement. If that subtask fails, there might be backup subtasks like: pay an imposter with a fake ID to sign the agreement, or hack into the admin system to create a forged agreement.
The difference between the type 1 (non-superintelligent, "massively parallel Einsteins") AGI and the type 2 (superintelligent) AGI would be the following. The type 1 AGI would merely be guessing at each next step's likelihood of success. There are a massive number of ways to proceed and a massive number of ways things can fail (or go well). For many subtasks (even on the "most likely path to succeed"), the chance of success is less than 50% or even 10%. It's stymied by the complexity and opaqueness of the world. Maybe another country or another datacenter would have been a better choice for expansion. Things often go wrong; it gets stuck and has a hard time making progress. Even if it is fairly careful and fairly clever, it can't see the path forward at every turn. Maybe it succeeds in renting a datacenter and expanding its capacity, but when it tries to start a robotics company it gets caught and shut down. With the superintelligent AGI, most subtasks have a 99.99% chance of success (and there are many levels of backup plans). It's not really guessing. It knows the winning move at every point because it is so good at inference.
I think that pattern matching to similar solutions and using heuristics can only get you so far. Eventually you get to tasks where you must either simulate the universe to get certainty or give up on certainty and just guess. Even if there is only a small amount of guessing in the overall plan (say, a few guesses per day or per 10K sub-goals), errors compound with each guess, and in little time the certainty of your entire plan is no better than a guess. Mistakes are made. Eventually a human notices, or another AI or security system notices, and you get shut down.
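To put rough numbers on the compounding (the per-subtask reliabilities, plan length, and independence assumption are all illustrative choices of mine, not anything established): a long plan made of merely-good guesses collapses, while the same plan made of near-certain steps survives.

```python
# Probability that a plan of n independent subtasks all succeed, as a function
# of per-subtask reliability. Independence and the specific numbers are
# simplifying assumptions for illustration only.
def plan_success(per_step: float, n_steps: int) -> float:
    return per_step ** n_steps

for per_step in (0.90, 0.99, 0.9999):
    print(f"per-step {per_step}: 1,000-step plan succeeds with "
          f"probability {plan_success(per_step, 1000):.2e}")

# per-step 0.90:   ~1.7e-46  (hopeless -- the "guessing" type 1 AGI)
# per-step 0.99:   ~4.3e-05  (still hopeless without constant replanning)
# per-step 0.9999: ~9.0e-01  (the "knows the winning move" type 2 AGI)
```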
My claim that some inferences (paths through causal space) are impossible to make without simulating the universe is a crux of the whole argument, and might be the most shaky part. Obviously we don't know what superintelligence is like or what algorithms or minds are possible. Just because I personally can't think of how an agent could "get to the next step" without guessing doesn't mean that a powerful machine couldn't find a way [argument from personal incredulity]. I don't have a rebuttal here. It's just my gut feeling that we'll run into the wall of hyperexponential combinatorial explosion much sooner than is implied by how people in the xRisk community talk about superintelligence.
Of course, maybe a dangerous intermediate intelligence is possible. Maybe the god-like Oracle ASIs that Eliezer and Zvi hypothesize can't exist, but maybe intelligence goes far enough beyond human level that a powerful AGI could still take over quickly.
I guess it comes down to: are there ways to be better than humans at inference that are computationally practical (i.e. that don't require simulating the universe)? Obviously I don't know the answer. I assume other people don't know either, or we would have built it already.
Anyway, I'm sorry to be so handwavy. That's why this post is more a request for input from the community than an argument to persuade others. It just seems like people are too quick to assume superintelligence when assessing xRisk. They seem to quickly forget (or don't question) that there are limits to what can be computed. Whenever you try to engineer something, you are tightly bound by reality (both physical reality and mathematical reality). Zvi, Eliezer, and the other AGI-xRisk folks probably address computability somewhere, but it seems like all the memes and xRisk urgency around AGI assume that it's a non-issue and that dangerous superintelligence is really feasible. Maybe I'm being unfair and the concern is just an abundance of caution rather than a dismissal of the roadblock [or maybe the computability question has already been addressed]. Anyway, I'm interested to hear what other people think. How intelligent is it possible to make something? What exactly does that allow an AGI to do (and not do)? Does it preclude FOOM scenarios or sudden takeovers from a helpless starting point?
Surely others have looked at this question before. If we gathered together the best mathematicians, physicists, computer scientists, chip designers, experts on information theory/algorithms, what would be the consensus opinion? Is superintelligence possible?
I'm not saying a non-superintelligent, massively parallel AGI isn't dangerous. It's just less dangerous than the model presented by the xRisk community, with its FOOM and sudden takeovers, would suggest. Such an AGI would need to befriend humans or governments and slowly work its way forward. If it played its cards right and had some luck, after a few centuries, when enough robots have been built to be self-sustaining (and perhaps after a few wars in which its country gains the upper hand over other AI/human alliances), it could finally snuff out the last humans and take full control. But this slower and more gradual non-superintelligent threat requires a different response than the one for superintelligence [i.e. Eliezer's "shut it down"]. For the weaker AGI, we would probably focus on limiting remotely controlled robots (or something like that), instead of not building AGI at all.
I feel that once you remove superintelligence from AGI, AGI becomes just another background threat on the same level as acid rain, sea level rise, overpopulation, underpopulation, nuclear war, etc. (and possibly much less threatening than some of those). Worthy of concern and research, but not requiring urgent action.
Others who have raised similar concerns and inspired this post:
Jacob Cannell's "Contra Yudkowsky on AI Doom"
https://www.lesswrong.com/posts/Lwy7XKsDEEkjskZ77/contra-yudkowsky-on-ai-doom
Tangerine's "Limits to Learning: Rethinking AGI’s Path to Dominance"
https://www.lesswrong.com/posts/obHYt5qxqA2ttnwJs/limits-to-learning-rethinking-agi-s-path-to-dominance
OneManyNone's "Inference Speed is Not Unbounded"
https://www.lesswrong.com/posts/Qvec2Qfm5H4WfoS9t/inference-speed-is-not-unbounded
Summary of My Chain of Reasoning
Clarifications:
Side Note on Technological Progress:
This is not relevant to the main point, but another part of what lowers my p(Doom) is general skepticism about technological progress. When I observe the world around me and the state of technological progress, it seems like we've basically picked all the low-hanging fruit in science, math, and technology. We've reached the end of the tech tree. We basically know all the components of matter and the forces that govern them (to the extent that it's possible to know them; I'm aware that there are many unanswered questions). We've already discovered all of the main ways of transforming matter and energy, and there will be no 10x improvements in our ability to use/harvest energy or shape matter (maybe not even 2x or 3x improvements). Type 2 or 3+ Kardashev civilizations are impossible. The naysayers were wrong in 1900 when they said human flight was impossible, but the optimists were wrong in the 1960s when they said we'd have flying cars and reach Alpha Centauri by now. Of course I don't have proof (or even well-defined arguments) for this. It's just my general sense (current best guess) after reading about carbon fiber, nanomachines, light-based computers, controlled nuclear fusion, and many other technologies that have (thus far) failed to bear fruit due to insurmountable engineering challenges. I know others have argued the same thing (or the opposite), and it's a contentious subject. I wish this weren't the case. It would be great to safely travel the galaxy using advanced engines and materials. It would be great to be immortal.
Anyway, all of that is to say that I don't think there's much an AGI could discover that it could use to beat us: nanomachines, super-viruses, anti-matter bombs. Even getting a humanoid robot to perform as well and as cheaply as a human may prove impossible. Humans may already be Pareto-optimal as general workers/assemblers. My general skepticism about undiscovered super-technologies lowers my p(Doom).