I've put a preprint up on arXiv that this community might find relevant. It's an argument from over a year ago, so it may be dated. I haven't been keeping up with the field much since I wrote it, so I welcome any feedback, especially on where the crux of the AI risk debate has moved since the publication of Bostrom's Superintelligence book.

Don't Fear the Reaper: Refuting Bostrom's Superintelligence Argument

In recent years prominent intellectuals have raised ethical concerns about the consequences of artificial intelligence. One concern is that an autonomous agent might modify itself to become "superintelligent" and, in supremely effective pursuit of poorly specified goals, destroy all of humanity. This paper considers and rejects the possibility of this outcome. We argue that this scenario depends on an agent's ability to rapidly improve its ability to predict its environment through self-modification. Using a Bayesian model of a reasoning agent, we show that there are important limitations to how an agent may improve its predictive ability through self-modification alone. We conclude that concern about this artificial intelligence outcome is misplaced and better directed at policy questions around data access and storage.

As I hope is clear from the argument, the point of the article is to suggest that to the extent AI risk is a problem, we should shift our focus away from AI theory and more towards addressing questions of how we socially organize data collection and retention.


Contrary to the conditions of Bostrom’s intelligence explosion scenario, we have identified ways in which the recalcitrance of prediction, an important instrumental reasoning task, is prohibitively high.

To demonstrate how such an analysis could work, we analyzed the recalcitrance of prediction, using a Bayesian model of a predictive agent. We found that the barriers to recursive self-improvement through algorithmic changes is prohibitively high for an intelligence explosion.

No, you didn't. You showed that there exists an upper bound on the amount of improvement that can be had from algorithmic changes in the limit. This is a very different claim. What we care about is what happens within the range close to human intelligence; it doesn't matter that there's a limit on how far recursive self-improvement can go, if that limit is far into the superhuman range. You equivocate between "recursive self-improvement must eventually stop somewhere", which I believe is already widely accepted, and "recursive self-improvement will not happen", which is a subject of significant controversy.

Agreed, the quoted "we found" claim overreaches. The paper does make a good point, though: the recalcitrance of further improvement can't be modeled as a constant; it necessarily scales with current system capability. Real-world exponentials become sigmoids: mold growing in your fridge and a nuclear explosion are both sigmoids that look exponential at first; the difference is a matter of scale.
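To make the scale point concrete, here's a minimal sketch (my own illustration, not anything from the paper; the growth rate, carrying capacity, and starting value are arbitrary) comparing an exponential curve to a logistic one with the same initial rate:

```python
import math

# Illustrative only: an exponential and a logistic (sigmoid) curve with the
# same initial growth rate are nearly indistinguishable early on; the
# difference only shows up near the carrying capacity K.

r = 1.0        # growth rate (arbitrary units)
K = 1000.0     # carrying capacity / resource limit (assumed)
x0 = 1.0       # initial size

def exponential(t):
    return x0 * math.exp(r * t)

def logistic(t):
    # Standard logistic solution with the same r and initial value x0.
    return K / (1 + ((K - x0) / x0) * math.exp(-r * t))

for t in range(0, 11, 2):
    print(f"t={t:2d}  exp={exponential(t):10.1f}  logistic={logistic(t):8.1f}")
# Early on (t <= 4) the two track each other closely; by t = 10 the
# exponential has blown far past K while the logistic has leveled off.
```

Whether the leveling-off matters in practice depends entirely on where K sits relative to the range we care about, which is the scale question.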

Really understanding the dynamics of a potential intelligence explosion requires digging into the specific details of an AGI design versus the brain: inference/learning capabilities, compute/energy efficiency, future hardware parameters, etc. You can't show much with vague, broad-stroke abstractions.

A human brain uses as much power as a lightbulb and its size is limited by the birth canal, yet an evolutionary accident gave us John von Neumann, who was far beyond most people. An AI as smart as 1000 von Neumanns using the power of 1000 lightbulbs could probably figure out how to get more power. Arguments like yours ain't gonna stop it.

This, and find better ways to optimize power efficiency.

Even an AI that was just AS smart as von Neumann would still be a very dangerous entity if it wasn't aligned with humanity's interests.

Do you think John von Neumann was "aligned with humanity's interests"?

Yup. He certainly wasn't aligned against them in any of the blatant ways I would expect unsafe AI to be.

Why would you expect a von Neumann-level AI to be "not aligned" in blatant ways? That's not at all obvious to me.

There is a whole spectrum of possibilities between paradise and paperclips. von Neumann was somewhere on that spectrum (assuming the notion of "aligned with humanity's interests" even makes sense for individuals), and a human-level AI will also be somewhere on it. How sure are you of what's located where?

For the same reason videogame AI often makes mistakes a human player never would.

A videogame AI is constructed with the explicit purpose of losing gracefully to a human. People don't buy games to be crushed by a perfectly playing opponent.

Have you ever played videogames? Sure, AI is often programmed to make certain errors, but there are TONS of things that smash your sense of verisimilitude and would be programmed out if it were easy. You can notice this by comparing the behavior of NPCs in newer vs. older games. You can also deduce it from other in-game features, e.g. tutorials that talk about intended ways to defeat the AI but fail to mention the unintended ways, like putting a bucket over an enemy's head so they can't "see" you sneaking.

Also: plenty of game AI plays PERFECTLY in ways no human is capable of. I gotta ask again, do you play videogames? A lot of game AI is in fact designed to beat you. There are difficulty settings for a reason! The reason is that a lot of people actually like being crushed by a more powerful opponent until they figure out how to beat it.

Have you ever played videogames?

On occasion :-)

AI is often programmed to make certain errors

Not only that. AI is generally not a high-value effort in games: it pisses off the customers if it's too good, and it eats valuable CPU time.

A lot of game AI is in fact designed to beat you.

I disagree. And note that games like Dark Souls are difficult not because the AI is good, but because the player is very, very fragile. I would say that game AI is designed to give you a bit of trouble (not too much) and then gracefully lose.

There are difficulty settings for a reason!

Difficulty settings generally don't make the AI smarter; they just give it more advantages (more resources, usually) and simultaneously hit the player with penalties.

My main thoughts were already expressed by jimrandomh and dogiv, but I will add a minor point:

Even if we don't consider the objections raised by jimrandomh and dogiv, naming this article "Don't Fear the Reaper: Refuting Bostrom's Superintelligence Argument" is hyperbole, since the article states (in section 5) that the argument made in the article is "not a decisive argument against the possibility of an intelligence explosion".

Furthermore, the fact that the article recommends "regulators controlling the use of generic computing hardware and data storage" (again in section 5) suggests that the author recognizes that he has not refuted Bostrom's argument.

The attempt to analytically model the recalcitrance of Bayesian inference is an interesting idea, but I'm afraid it leaves out some of the key points. Reasoning is not just repeated applications of Bayes' theorem. If it were, everyone would be equally smart except for processing speed and data availability. Rather, the key element is coming up with good approximations for P(D|H) when data and memory are severely limited. This skill relies on much more than a fast processor, including things like simple but accurate models of the rest of the world, or knowing the correct algorithms to combine various truths into logical conclusions.
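To illustrate (a rough sketch of my own, not the paper's model; the hypotheses and probabilities are made up): the update step itself is mechanically trivial, so two agents running identical updates on the same data diverge only through the quality of their likelihood models.

```python
# The Bayesian update is a one-liner; what separates a weak reasoner from a
# strong one is the quality of the likelihood model P(D|H) that feeds it.

def bayes_update(prior, likelihood):
    """prior: dict hypothesis -> P(H); likelihood: dict hypothesis -> P(D|H)."""
    unnormalized = {h: prior[h] * likelihood[h] for h in prior}
    total = sum(unnormalized.values())
    return {h: p / total for h, p in unnormalized.items()}

prior = {"rain": 0.3, "no_rain": 0.7}

# Two agents observe the same data ("wet sidewalk") but bring different
# likelihood models to it. These numbers are invented for illustration.
crude_likelihood = {"rain": 0.5, "no_rain": 0.5}    # uninformative model
better_likelihood = {"rain": 0.9, "no_rain": 0.2}   # captures sprinklers, etc.

print(bayes_update(prior, crude_likelihood))   # posterior == prior: no learning
print(bayes_update(prior, better_likelihood))  # P(rain) rises to ~0.66
```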

Some of it does fall into the category of having the correct prior beliefs, but those are hardly "accidentally gifted": learning the correct priors, either from experience with data or through optimization "in a box", is a critical aspect of becoming intellectually capable. So the recalcitrance of prediction, though it clearly does eventually go to infinity in the absence of new data, is not obviously high. I would also add that for your argument against the intelligence explosion to hold, the recalcitrance of prediction would have to be not just "predictably high" but would need to increase at least linearly with intelligence in the range of interest: a very different claim, and one for which you have given little support.
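To see why the scaling of recalcitrance is the crux, here is a small numerical sketch of my own (not from the paper; the constants and the assumption that optimization power scales linearly with current intelligence are simplifications), using Bostrom's schematic rate of change = optimization power / recalcitrance:

```python
# dI/dt = O(I) / R(I), with optimization power O(I) = I (the system applies
# its own intelligence to improving itself). The growth regime depends
# entirely on how recalcitrance R scales with I.

def simulate(recalcitrance, steps=200, dt=0.05, I0=1.0):
    I = I0
    for _ in range(steps):
        I += dt * I / recalcitrance(I)   # Euler step of dI/dt = I / R(I)
    return I

print(simulate(lambda I: 1.0))      # constant R: exponential blow-up
print(simulate(lambda I: I))        # R ~ I: merely linear growth over time
print(simulate(lambda I: I * I))    # R ~ I^2: growth slows to a crawl
```

Constant or sublinear recalcitrance still gives an explosion; only recalcitrance that grows at least linearly with intelligence in the relevant range tames it.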

I do think it's likely that strictly limiting access to data would slow down an intelligence explosion. Bostrom argues that a "hardware overhang" could be exploited for a fast takeoff, but historically, advanced AI projects like AlphaGo or Watson have used state-of-the-art hardware during development, and this seems probable in the future as well. Data overhang, on the other hand, would be nearly impossible to avoid if the budding intelligence is given access to the internet, of which it can process only a small fraction in any reasonable amount of time.

There was discussion about this post on /r/ControlProblem; I agree with these two comments:

If I understood the article correctly, it seems to me that the author is missing the point a bit.

He argues that the explosion has to slow down, but the point is not about superintelligence becoming limitless in a mathematical sense, it's about how far it can actually get before it starts hitting its limits.

Of course, it makes sense that, as the author writes, a rapid increase in intelligence would at some point eventually have to slow down due to approaching some hardware and data acquisition limits which would keep making its improvement process harder and harder. But that seems almost irrelevant if the actual limits turn out to be high enough for the system to evolve far enough.

Bostrom's argument is not that the intelligence explosion, once started, would have to continue indefinitely for it to be dangerous.

Who cares if the intelligence explosion of an AI entity will have to grind to a halt before quite reaching the predictive power of an absolute omniscient god.

If it has just enough hardware and data available during its initial phase of the explosion to figure out how to break out of its sandbox and connect to some more hardware and data over the net, then it might just have enough resources to keep the momentum and sustain its increasingly rapid improvement long enough to become dangerous, and the effects of its recalcitrance increasing sometime further down the road would not matter much to us.

and

I had the same impression.

He presents an argument about improving the various expressions in Bayes' theorem, and arrives at the conclusion that the agent would need to improve its hardware or interact with the outside world in order to lead to a potentially dangerous intelligence explosion. My impression was that everyone had already taken that conclusion for granted.

Also, I wrote a paper some time back that essentially presented the opposite argument; here's the abstract, in case you're interested in checking it out:

Two crucial questions in discussions about the risks of artificial superintelligence are: 1) How much more capable could an AI become relative to humans, and 2) how easily could superhuman capability be acquired? To answer these questions, I will consider the literature on human expertise and intelligence, discuss its relevance for AI, and consider how an AI could improve on humans in two major aspects of thought and expertise, namely mental simulation and pattern recognition. I find that although there are very real limits to prediction, it seems like an AI could still substantially improve on human intelligence, possibly even mastering domains which are currently too hard for humans. In practice, the limits of prediction do not seem to pose much of a meaningful upper bound on an AI’s capabilities, nor do we have any nontrivial lower bounds on how much time it might take to achieve a superhuman level of capability. Takeover scenarios with timescales on the order of mere days or weeks seem to remain within the range of plausibility.

The article states (in section 5) that "regulators controlling the use of generic computing hardware and data storage may be more important to determining the future of artificial intelligence than those that design algorithms".

How will this work, exactly? Any regulatory attempt to prevent the acquisition of enough computing power to create an AI would likely also make it difficult to acquire the computing and storage equipment one might need for a more modest purpose (e.g. a server farm for an online game or hosted application, the needs of a corporation's IT department, etc.). Government regulation of computing equipment sounds almost as dystopian as the scenario it is intended to prevent.

And, while a government could (perhaps) prevent its citizens from creating AGI superintelligence this way, how would it prevent other governments and entities not under its jurisdiction from doing so? Gone are the days when any one nation has a monopoly on high-end computing resources.

I very strongly agree with the point you're making. I also think that grabbing more compute and data represents a possible path to doing the thing Yudkowsky thinks you can do in a box, so it's not all bad.

Nice.

I agree that intelligence relies heavily on data. I think data can also explain why humans are an order of magnitude more intelligent than chimps, despite minimal changes in brain structure. We managed to tap into another source of refined data: the explorations in idea space (not just the behaviour space that mimicry allows) of our contemporaries and ancestors.

I've not seen much debate recently, to be honest. This was a recent book which touches on these considerations.


I think your conclusion might be roughly correct, but I'm confused by the way your argument seems to switch between claiming that an intelligence explosion will eventually reach limits, and claiming that recalcitrance will be high when AGI is at human levels of intelligence. Bostrom presumably believes there's more low-hanging fruit than you do.

I wrote an article about the structure of self-improvement and all the known ways in which self-improvement (SI) may be self-limiting. I will add your idea to it. I am still polishing the article and will send it to an academic journal, so I will not publish it openly now, but I recently gave a presentation on the topic, and here are the slides: https://www.slideshare.net/avturchin/levels-of-the-selfimprovement-of-the-ai

Basically there are six main levels of SI: hardware, learning, changes of architecture, changes of code, changes of the goal system, and creation of copies of the AI. Each has around 5 sublevels, so an AI could improve on around 30 levels. However, on almost every level it will eventually level off.

Anyway, even if the SI on each level is modest, say a factor of 5, the total SI compounds to around 5^30 ≈ 10^21, which is enough to create superintelligence.
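A quick check of that compounding claim (illustrative only; the "factor of 5 per level" and "30 levels" are the rough assumptions above):

```python
# Compounded improvement across independent levels of self-improvement.
per_level_gain = 5
levels = 30
total = per_level_gain ** levels
print(f"{total:.2e}")  # ~9.31e+20, i.e. on the order of 10^21
```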

Different ways in which SI may slow down are also discussed. So the process will be slower and will probably level off, but even a 1000-fold improvement is enough to create a rather dangerous system.

If anybody is interested in the article, I could send a draft privately.