Given we survive long enough, we'll find a way to write a self-modifying program that has, or can develop, human-level intelligence.
How can I arrive at the belief that it is possible for an algorithm to improve itself in a way that achieves something sufficiently similar to human-level intelligence? That it is possible in principle is not in question here. But is it possible given limited resources? And if it is possible given limited resources, is it efficient enough to pose an existential risk?
The capacity for self-modification follows from 'artificial human intelligence,' but since we've just seen links to writers ignoring that fact I thought I'd state it explicitly.
Humans can learn, but that is far from what is necessary to reach a level above your own, on your own. Also, how do you know that any given level of intelligence is capable of handling its own complexity effectively? Many humans are not capable of handling the complexity of the brain of a worm.
This necessarily gives the AI the potential for greater-than-human intelligence due to our known flaws.
That humans have a hard time changing their flaws might be an actual feature, a trade-off between plasticity, efficiency and the necessity of goal-stability.
Given A, the intelligence would improve itself to the point where we could no longer predict its actions in any detail.
I don't think that is a reasonable assumption; see my post here. The short version: I don't think that intelligence can be applied to itself efficiently.
...the AI could escape from any box we put it in. (IIRC this excludes certain forms of encryption, but I see no remotely credible scenario in which we sufficiently encrypt every self-modifying AI forever.)
Well, even humans can persuade their guards to let them out. I agree.
...the AI could wipe out humanity if it 'wanted' to do so.
I think it is unlikely that most AI designs will not hold. I agree with the argument that any AGI that isn't made to care about humans won't care about humans. But I also think that the same argument applies to spatio-temporal scope boundaries and resource limits. Even if the AGI is not told to hold, e.g. it is told to compute as many digits of Pi as possible, I consider it a far-fetched assumption that any AGI intrinsically cares to take over the universe as fast as possible in order to compute as many digits of Pi as possible. Sure, if all of that is presupposed then it will happen, but I don't see that most AGI designs are like that. Most that have the potential for superhuman intelligence, but that are given simple goals, will in my opinion just bob up and down as slowly as possible. This is an antiprediction, not a claim to the contrary. What makes you sure that it will be different?
Humans can learn, but that is far from what is necessary to reach a level above your own, on your own.
Yes, you also need the ability to self-modify and the ability to take 20 or fail and keep going. But I just argued that the phrase "on your own" obscures the issue, because if one AGI has a chance to rewrite itself (and does not take over the world) then I see no realistic way to stop another from trying at some point.
Also, how do you know that any given level of intelligence is capable of handling its own complexity effectively?
I don't think ...
Link: overcomingbias.com/2011/07/debating-yudkowsky.html