We're certainly better at reflecting on some parts of ourselves than others. The ironic thing, though, is that when we look more closely and analyze just what it is that we are not reflecting on very well, we open up the can of worms that we had previously been avoiding.
In the context of the original post: suppose that an SGAI logs some of its internal state to a log file, then gains access to read that file, and reasons about it in the same way as it reasons about the world, noticing correlations between its feelings and state and the contents of the log file. Wouldn't that be the kind of reflection that we have? Is an SGAI even logically possible without hard-coding some blind spot inside the AI about itself?
If we could, we'd probably have gone extinct already.
Or maybe we're going to go extinct real soon now, because we lack the ability to reflect like this, and consequently didn't have a couple of thousand years to develop an effective theory of mind for FAI before we made the hardware.
Having the ability to design and understand AI for a couple of thousand years, but somehow lacking the ability to actually implement it, sounds just about perfect. If only!
Are there any essays anywhere that go in depth about scenarios where AIs become somewhat recursive/general in that they can write functioning code to solve diverse problems, but the AI reflection problem remains unsolved and thus limits the depth of recursion attainable by the AIs? Let's provisionally call such general but reflection-limited AIs semi-general AIs, or SGAIs. SGAIs might be of roughly smart-animal-level intelligence, e.g. have rudimentary communication/negotiation abilities and some level of ability to formulate narrowish plans of the sort that don't leave them susceptible to Pascalian self-destruction or wireheading or the like.
At first blush, this scenario strikes me as Bad; AIs could take over all computers connected to the internet, totally messing stuff up as their goals/subgoals mutate and adapt to circumvent wireheading selection pressures, without being able to reach general intelligence. AIs might or might not cooperate with humans in such a scenario. I imagine any detailed existing literature on this subject would focus on computer security and intelligent computer "viruses"; does such literature exist, anywhere?
I have various questions about this scenario, including: