We seem to be getting into some potentially very important territory, and I would certainly like to continue this discussion, but I'm running out of time for now and may be busy for up to 24 hours.
Before I go, though, I should say at least one thing. It's certainly not an obvious error, and I could well be the one who's wrong. The discussions about rationality on Less Wrong are extremely useful for a basic reason: compensating for our maladapted hardware and software is an extremely difficult and intricate epistemic journey, and LW does it better than any other place at the moment (as far as I can see).
So yeah, your questions are certainly important, and they perhaps get to the essence of the issue. I look forward to trying to answer those questions, and seeing where it leads us in the discussion (assuming you think this is useful too). Feel free to write anything else in the meantime, or not.
Are there any essays anywhere that go in depth about scenarios where AIs become somewhat recursive/general in that they can write functioning code to solve diverse problems, but the AI reflection problem remains unsolved and thus limits the depth of recursion attainable by the AIs? Let's provisionally call such general but reflection-limited AIs semi-general AIs, or SGAIs. SGAIs might be of roughly smart-animal-level intelligence, e.g. have rudimentary communication/negotiation abilities and some level of ability to formulate narrowish plans of the sort that don't leave them susceptible to Pascalian self-destruction or wireheading or the like.
At first blush, this scenario strikes me as Bad; AIs could take over all computers connected to the internet, totally messing stuff up as their goals/subgoals mutate and adapt to circumvent wireheading selection pressures, without being able to reach general intelligence. AIs might or might not cooperate with humans in such a scenario. I imagine any detailed existing literature on this subject would focus on computer security and intelligent computer "viruses"; does such literature exist, anywhere?
I have various questions about this scenario, including: