Since randomly combining human genes was enough to create John von Neumann, and human neurons fire less than 1000 times per second, it seems like there should be a straight (if long) path to building an intelligence as strong as N Von Neumanns working together at M times speedup, for N and M at least 1000. That's super enough.
1) Attempting to formalize OP's arguments
Take the assumption that B (the smarter machine designed by A) is able to do it quickly.
Let's say there's an intelligence ordering i. i(1) = A, i(2) = B, i(3) = C, etc.
Let t(a, b) denote the time it takes for b to make a.
Let's say the time it takes for i(n) to do anything is less than the time it takes for i(m) to do anything, for n>m. This implies that for any integer p (particularly p>n>m), t(i(p), i(n)) < t(i(p), i(m)).
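To make these definitions concrete, here is a minimal Python sketch; the particular speed and difficulty functions are assumptions of mine for illustration, not anything claimed above.

```python
# A toy instantiation of the definitions above (the specific formulas are
# assumed, purely for illustration): index intelligences by integers, let a
# builder's speed grow with its intelligence, and let
# t(target, builder) = difficulty(target) / speed(builder).
# Under these assumptions, for p > n > m we get t(i(p), i(n)) < t(i(p), i(m)),
# matching the implication stated above.

def speed(n):
    return float(n)            # assumed: smarter agents work faster

def difficulty(n):
    return float(n ** 2)       # assumed: smarter machines are harder to build

def t(target, builder):
    return difficulty(target) / speed(builder)

p, n, m = 5, 3, 2
print(t(p, n), t(p, m))        # 8.33... < 12.5
assert t(p, n) < t(p, m)
```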
Now take the third assumption: this process, repeated enough times, will reach the physical limits of intelligence.
Aside from questions about whether there is a one-dimensional ordering i, as opposed to multiple things an intelligence can be good at [1], and about factors like cost in addition to time, what does disagreeing with this statement look like?
It could be interpreted as postulating that there exists an N such that i(N) is the greatest intelligence that is "compatible with physics".
A sufficient condition for the process never reaching i(N) is t(i(N), i(N-1)) > T, where T is the time left before the universe "dies". But this condition is not necessary.
If we allow t(a, b) to denote not just b constructing a "directly", but also via chains of intermediaries when that takes less time, then a necessary condition might be that there exists an M <= N such that t(i(M), i(1)) > T, where i(1) = humans [1].
[1] It is worth noting that a group of humans might be considered more intelligent than a single human. Does this mean that if i(1) = one human, then a good group of people can be considered i(2)? More generally, can the time to make i(n) be reduced by a group consisting of i(a) and i(b) working in concert, where a and b are less than n?
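For reference, the two conditions above can be put side by side; this is just a restatement, under the reading that "disagreeing" means the process stops short of i(N).

```latex
% Sufficient for the process to stop short of i(N), with T the time left
% before the universe "dies":
\[
  t\bigl(i(N),\, i(N-1)\bigr) > T
\]
% Necessary, once t is allowed to range over chains of intermediaries,
% with i(1) = \text{humans}:
\[
  \exists\, M \le N \ \text{such that}\ t\bigl(i(M),\, i(1)\bigr) > T
\]
```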
2) Possible Reasons for the "optimism" the OP described:
It is worth noting that the framing in terms of discrete integer steps might be incorrect. One could argue that "continuous" improvement is possible, in a sense that makes distinctions like version 1.0 and 2.0 meaningless and arbitrary. (If a new version were released every second, the version number would quickly grow inconveniently large.) While a change to a program on a computer is technically "discrete", one might ask a) whether a small change, such as the change of a neural net weight or of a variable in a self-modifying program [2], perhaps from 41 to 42, should count as an entirely different program, and b) whether computers will always be in the form they are today. We focus on software here, but the growth of hardware capability (Moore's Law is usually referenced) has been somewhat important globally, as well as for AI.
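As a toy illustration of point a) (the model and numbers here are mine, purely hypothetical), a single-weight change technically produces a different program, but one whose behaviour is almost indistinguishable:

```python
# A "program" here is just a linear model with one weight. Changing that
# weight from 41.0 to 41.001 is, strictly speaking, a different program, but
# its behaviour is almost identical, which is what makes version-style
# distinctions like 1.0 vs 2.0 feel arbitrary.

def make_model(weight):
    """Return a 'program': a function parameterised by a single weight."""
    def model(x):
        return weight * x
    return model

v1 = make_model(41.0)
v2 = make_model(41.001)   # a tiny, 'continuous-looking' modification

for x in [1.0, 2.0, 3.0]:
    print(x, v1(x), v2(x), abs(v1(x) - v2(x)))
```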
Suppose that, instead of a process producing a succession of smarter agents, the process were internal to "an agent" [3]. As it improves itself, it might be able to learn faster and be smarter; as it becomes smarter and learns more, it might be able to improve itself faster. Some argue for this on the grounds that the impressively intelligent things/groups/etc. we have seen so far "don't optimize their process of optimizing" (all the way up), and that doing so would enable this kind of growth.
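A toy numerical sketch of this feedback loop (the update rule and constants are my own assumptions, not an argument that real AI would behave this way):

```python
# The agent's "capability" determines how fast it can improve that same
# capability. With this particular update rule the growth is geometric
# (exponential); different assumptions give very different curves.

capability = 1.0
improvement_rate = 0.1   # assumed fraction of capability converted into growth per step

for step in range(10):
    gain = improvement_rate * capability  # smarter agent -> bigger improvement per step
    capability += gain
    print(f"step {step}: capability = {capability:.3f}")
```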
Even if this were true - that recursively self-improving AI would be powerful and take off fast - the claim about fundamental limits of intelligence might still be an unreasonable conclusion. I would say it is probably a pointless one: we care about how powerful AI may be, how fast it may take off, and whether it will be dangerous - and there is no reason to believe AI must reach the physical limits in order to be powerful or dangerous, and it may improve quickly without getting there soon, or ever.
[2] Most programming probably doesn't look like this, or is called something else.
[3] AI might be dangerous or useful in forms other than one big superintelligent agent. There aren't a lot of examples of this, though analogies are sometimes made to the effectiveness/success of teams/groups/corporations today, as opposed to individuals working alone.
3) "Optimism" might not be the best description - not everyone thinks this is (necessarily) a good thing. To that end, the title of the post may be slightly misleading.
I'll quote Slate Star Codex (the passage at the top of this post); this idea is common, though, and I have seen it in many places.
(Source: Meditations On Moloch)
So, there are three assumptions here:
1) If A designs a machine (let's call it B) which is smarter than itself, B will be able to design a machine which is smarter than itself.
2) B is able to do it quickly.
3) This process, repeated enough times, will reach the physical limits of intelligence.
The first assumption is very questionable. If A designs a machine smarter than itself, the new machine will be able to design a machine smarter than A; but for B, designing a machine smarter than itself may be much harder than it was for A. (On the one hand, A is less smart, so designing something smarter than A is an easier target; on the other hand, A is less smart, so the task is harder for it. Nothing guarantees that these two effects balance out.)
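A toy numerical sketch of why the two effects need not balance (the capability, difficulty, and speed functions here are purely my own assumptions):

```python
# Capability of machine n is simply n, the difficulty of designing something
# smarter grows like (n + 1) ** 2, and design speed is proportional to the
# designer's capability. Each machine is smarter than its designer, yet each
# design step takes longer, so "A built something smarter" does not by itself
# imply the chain keeps going quickly.

def step_time(n):
    capability = n
    difficulty_of_next = (n + 1) ** 2   # assumed growth in design difficulty
    return difficulty_of_next / capability

for n in range(1, 6):
    print(f"machine {n} designs machine {n + 1} in {step_time(n):.2f} units of time")
```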
The second assumption seems to rest, perhaps unconsciously, on the idea that "since B is a machine, it does things quickly." Sometimes that is true, but sometimes it is not: there are plenty of tasks that take a computer a long time, and designing machines does not seem to be an exception.
The third assumption is mathematically incorrect. For example, consider the following sequence:
a(n) = arctan(n)
The first machine's intelligence is arctan(1), the second's is arctan(2), and so on. Since arctan(x) is an increasing function, every machine is smarter than the previous one; but every machine's intelligence is less than pi/2, while the physical limit on intelligence could be, for example, 42.
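A quick numerical check of this example in Python:

```python
import math

# a(n) = arctan(n) is strictly increasing, yet bounded above by
# pi/2 (about 1.5708), far below a hypothetical physical limit of 42.

for n in [1, 2, 10, 100, 10**6]:
    print(n, math.atan(n))
print("bound:", math.pi / 2)
```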