Steelmanning Inefficiency
When considering writing a hypothetical apostasy or steelmanning an opinion I disagreed with, I looked around for something worthwhile, both for me to write and others to read. Yvain/Scott has already steelmanned Time Cube, which cannot be beaten as an intellectual challenge, but probably didn't teach us much of general use (except at interesting dinner parties). I wanted something hard, but potentially instructive.
So I decided to steelman one of the anti-sacred cows (sacred anti-cows?) of this community, namely inefficiency. It was interesting to find that it was a little easier than I thought: there are a lot of arguments already out there (though they generally don't come out explicitly in favour of "inefficiency"), so it was a question of collecting them, stretching them beyond their domains of validity, and adding a few rhetorical tricks.
The strongest argument
Let's start strong: efficiency is the single most dangerous thing in the entire universe. Then we can work down from that:
A superintelligent AI could go out of control and optimise the universe in ways that are contrary to human survival. Some people are very worried about this; you may have encountered them at some point. One big problem seems to be that there is no such thing as a "reduced impact AI": if we give a superintelligent AI a seemingly innocuous goal such as "create more paperclips", then it would turn the entire universe into paperclips. Even if it had a more limited goal such as "create X paperclips", then it would turn the entire universe into redundant paperclips, methods for counting the paperclips it has, or methods for defending the paperclips it has - all because these massive transformations allow it to squeeze just a little bit more expected utility from the universe.
The problem is one of efficiency: of always choosing the maximal outcome. The problem would go away if the AI could be content with almost accomplishing its goal, or with being almost certain that its goal was accomplished. Under those circumstances, "create more paperclips" could be a viable goal. It's only because a self-modifying AI drives towards efficiency that we have the problem in the first place. If the AI accepted being inefficient in its actions, even a little bit, the world would be much safer.
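To make the contrast concrete, here is a minimal toy sketch of a maximiser next to an agent that is content with being "almost certain". Everything in it is an assumption made for illustration: the three actions, their probabilities, the `impact` numbers, the 0.99 threshold, and the rule that the content agent picks the lowest-impact action among those that are good enough. It is not a proposal for how such an AI would actually be built.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Action:
    name: str
    p_goal: float  # assumed probability that "X paperclips exist" ends up true
    impact: float  # hypothetical measure of how much of the world is transformed


# Invented numbers: each step up buys a sliver of extra certainty
# at the cost of a vastly larger transformation of the world.
ACTIONS = [
    Action("build X paperclips, then stop", p_goal=0.99, impact=1.0),
    Action("build X paperclips and recount them forever", p_goal=0.999, impact=50.0),
    Action("convert everything into paperclips and defences", p_goal=0.99999, impact=1e6),
]


def maximiser(actions):
    """Always take the action with the highest chance of achieving the goal."""
    return max(actions, key=lambda a: a.p_goal)


def satisficer(actions, threshold=0.99):
    """Accept any action that is 'almost certain' to work, then prefer low impact.

    Falls back to maximising if nothing clears the threshold.
    """
    good_enough = [a for a in actions if a.p_goal >= threshold]
    if not good_enough:
        return maximiser(actions)
    return min(good_enough, key=lambda a: a.impact)


if __name__ == "__main__":
    print("maximiser: ", maximiser(ACTIONS).name)
    print("satisficer:", satisficer(ACTIONS).name)
```

With these made-up numbers, the maximiser picks the universe-converting action for a gain of less than one percentage point of certainty, while the content agent stops at the first option. The argument in the text lives entirely in that gap.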
So the first strike against efficiency is that it's the most likely thing to destroy the world, humanity, and everything of worth and value in the universe. This could possibly give us some pause.
General purpose intelligence: arguing the Orthogonality thesis
Note: informally, the point of this paper is to argue against the instinctive "if the AI were so smart, it would figure out the right morality and everything will be fine." It is targeted mainly at philosophers, not at AI programmers. The paper succeeds if it forces proponents of that position to put forward positive arguments, rather than just assuming it as the default position. This post is presented as an academic paper, and will hopefully be published, so any comments and advice are welcome, including stylistic ones! Also let me know if I've forgotten you in the acknowledgements.
Abstract: In his paper “The Superintelligent Will”, Nick Bostrom formalised the Orthogonality thesis: the idea that the final goals and intelligence levels of agents are independent of each other. This paper presents arguments for a (slightly narrower) version of the thesis, proceeding through three steps. First it shows that superintelligent agents with essentially arbitrary goals can exist. Then it argues that if humans are capable of building human-level artificial intelligences, we can build them with any goal. Finally it shows that the same result holds for any superintelligent agent we could directly or indirectly build. This result is relevant for arguments about the potential motivations of future agents.
1 The Orthogonality thesis
The Orthogonality thesis, due to Nick Bostrom (Bostrom, 2011), states that:
- Intelligence and final goals are orthogonal axes along which possible agents can freely vary: more or less any level of intelligence could in principle be combined with more or less any final goal.
It is analogous to Hume’s thesis about the independence of reason and morality (Hume, 1739), but applied more narrowly, using the normatively thinner concepts ‘intelligence’ and ‘final goals’ rather than ‘reason’ and ‘morality’.
But even ‘intelligence’, as generally used, has too many connotations. A better term would be efficiency, or instrumental rationality, or the ability to effectively solve problems given limited knowledge and resources (Wang, 2011). Nevertheless, we will be sticking with terminology such as ‘intelligent agent’, ‘artificial intelligence’ or ‘superintelligence’, as they are well established, but using them synonymously with ‘efficient agent’, ‘artificial efficiency’ and ‘superefficient algorithm’. The relevant criterion is whether the agent can effectively achieve its goals in general situations, not whether its inner process matches up with a particular definition of what intelligence is.
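As a rough illustration of what "orthogonal axes" means under this reading of intelligence as efficiency, the toy sketch below treats the final goal and the problem-solving resources as two independent parameters of one chooser. The function `choose`, the integer option space, the `budget` parameter and the two example goals are all hypothetical and introduced only for this example; they are not taken from Bostrom's paper or from this one.

```python
from typing import Callable, Iterable


def choose(goal: Callable[[int], float], options: Iterable[int], budget: int) -> int:
    """Return the option that scores best under `goal` among the first
    `budget` options examined (budget must be at least 1).

    `goal` plays the role of the final goal; `budget` is a crude stand-in
    for efficiency, i.e. how much of the problem the agent can search.
    """
    examined = list(options)[:budget]
    return max(examined, key=goal)


def prefer_large(x: int) -> float:
    """One arbitrary final goal: bigger numbers are better."""
    return float(x)


def prefer_near_42(x: int) -> float:
    """A second, unrelated final goal: numbers close to 42 are better."""
    return -abs(x - 42)


if __name__ == "__main__":
    # Every goal can be paired with every budget: the two axes vary freely.
    for goal in (prefer_large, prefer_near_42):
        for budget in (5, 100):
            result = choose(goal, range(100), budget)
            print(f"{goal.__name__:>15} with budget {budget:>3}: {result}")
```

Raising `budget` makes the chooser better at whatever goal it was handed, and nothing in the code ties a larger budget to any particular goal. Real agents are of course far less cleanly factored than this, which is why the thesis needs arguing at all.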