People who think that risk from AI is the category of dangers most likely to cause the loss of all human value in the universe often argue that an artificial general intelligence will tend to undergo recursive self-improvement. The reason given is that intelligence is maximally instrumentally useful for realizing almost any terminal goal an AI might be equipped with; they believe that intelligence is a universal instrumental value. This sounds convincing, so let's accept it as given.
What kind of instrumental value is general intelligence, and what is it good for? Personally, I try to see general intelligence purely as a potential: it allows an agent to achieve its goals.
The question that is not asked is why an artificial agent would tap the full potential of its general intelligence rather than use only the amount it is "told" to use. Where would the incentive to do more come from?
If you deprived a human infant of all its evolutionary drives (e.g. to avoid pain, seek nutrition, status and - later on - sex), would it just grow into an adult that might try to become rich or rule a country? No, it would have no incentive to do so. Even though such a "blank slate" would have the same potential for general intelligence, it wouldn't use it.
Say you came up with the most basic template for a general intelligence that works given limited resources. If you applied this potential to improving your template, would that be a sufficient condition for it to take over the world? I don't think so. If you didn't explicitly tell it to do so, why would it?
The crux of the matter is that a goal alone isn't enough to enable the full potential of general intelligence; you also need to explicitly define how to achieve that goal. General intelligence does not imply recursive self-improvement, only the potential for it, not the incentive. The incentive has to be given; it is not implied by general intelligence.
For the same reasons that I don't think an AGI will be automatically friendly, I don't think it will automatically undergo recursive self-improvement. Maximizing expected utility is, just like friendliness, something that needs to be explicitly defined; otherwise there will be no incentive to do so.
For example, in what sense would it be wrong for a general intelligence to maximize paperclips in the universe by waiting for them to arise from random fluctuations out of a state of chaos? It is not inherently stupid to desire that; there is no law of nature that prohibits certain goals.
Why would a generally intelligent artificial agent care about how it reaches its goals if the preferred way is undefined? It is not intelligent to do something as quickly or effectively as possible if doing so is not desired. And an artificial agent doesn't desire anything that it isn't made to desire.
There is an interesting idiom stating that the journey is the reward. Humans know that it takes a journey to reach a goal and that the journey can be a goal in and of itself. For an artificial agent there is no difference between a goal and how to reach it. If you told it to reach Africa but not how, it might as well wait until it reaches Africa by means of continental drift. Would that be stupid? Only for humans; the AI has infinite patience, and it just doesn't care about any implicit connotations.
I think a certain sort of urgency comes naturally from the finite lifespan of all things. So let's say we have an AI with a 0.00001% chance per year of malfunctioning or being destroyed, which gets 100 utility if it is in Africa. Should it go now, or should it wait a year? Well, all things being equal, the expected utility of the "go now" strategy is very slightly larger than the expected utility of waiting a year. So an expected utility maximizer facing any risk of death would choose to act sooner rather than later.
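To make the comparison concrete, here is a minimal sketch of that calculation, assuming only the numbers given above (a 0.00001% annual chance of destruction and 100 utility for being in Africa) and ignoring everything else. The function name and structure are just illustrative, not any standard formulation.

```python
# Expected utility of arriving in Africa after waiting some number of years,
# discounted only by the chance of being destroyed in the meantime.
# Assumed numbers from the paragraph above: 0.00001% risk per year, 100 utility.

P_DEATH_PER_YEAR = 0.00001 / 100   # 0.00001% expressed as a probability (1e-7)
UTILITY_IN_AFRICA = 100.0

def expected_utility(years_waited: int) -> float:
    """Probability of surviving the wait, times the payoff for being in Africa."""
    p_survive = (1 - P_DEATH_PER_YEAR) ** years_waited
    return p_survive * UTILITY_IN_AFRICA

print(expected_utility(0))  # go now:      100.0
print(expected_utility(1))  # wait a year:  99.99999  (very slightly less)
```

The difference is tiny, but it is always in favor of acting sooner, which is the only point the argument needs.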
Yes, a full-blown expected utility maximizer, with a utility function whose goals have detailed enough optimization parameters to support useful utility calculations about real-world causal relationships relative to its own self-perception. I think that before something like that becomes possible, some other, less sophisticated intelligence will already have been employed as a tool to do, or solve, something that destroys pretty much everything. An AI that can solve bio or nanotech problems should be much easier to design than one that can destroy the world...