Is there work attempting to show that alignment of a superintelligence by humans (as we know them) is impossible in principle; and if not, why isn’t this considered highly plausible? For example, not just in practice but in principle, a colony of ants as we currently understand them biologically, and their colony ecologically, cannot substantively align a human. Why should we not think the same is true of any superintelligence worthy of the name? “Superintelligence” is vague. But even if we minimally define it as an entity with 1,000x the knowledge, speed,...
Someone should give GPT-4 the MMPI-2 (an online version can be cheaply bought here: https://psychtest.net/mmpi-2-test-online/). The test specifically investigates, if I have it right, deceptiveness in the answers, along with a whole host of other things. GPT-4 likely isn't conscious, but that doesn't mean it lacks a primitive world-model, and its test results would be interesting. The test is longish: it takes, I think, two hours for a human to complete.
I am wounded.
I have got the faint suspicion that a tone of passive-aggressive condescension isn't optimal here …
But what stops a blue-cloud model from transitioning into a red-cloud model if the blue-cloud model is an AGI like the one hinted at on your slides (self-aware, goal-directed, highly competent)?
If it's impossible in principle to know whether any AI really has qualia, then what's wrong with simply using the Turing test as an ultimate ethical safeguard? We don't know how consciousness works, and possibly we won't ever (e.g., mysterianism might obtain). But certainly we will soon create an AI that passes the Turing test. So seemingly we have good ethical reasons just to assume that any agent that passes the Turing test is sentient — this blanket assumption, even if often unwarranted from the aspect of eternity, will check our egos and thereby help p...
But what is stopping any of those "general, agentic learning systems" in the class "aligned to human values" from going meta — at any time — about its values and picking different values to operate with? Is the hope to align the agent and then constantly monitor it to prevent deviancy? If so, why wouldn't preventing deviancy by monitoring be practically impossible, given that we're dealing with an agent that will supposedly be able to out-calculate us at every step?
Mental Impoverishment
We should be trying to create mentally impoverished AGI, not profoundly knowledgeable AGI — no matter how difficult this is relative to the current approach of starting by feeding our AIs a profound amount of knowledge.
If a healthy five-year-old[1] has GI and qualia and can pass the Turing test, then a necessary condition of GI and qualia and passing the Turing test isn't profound knowledge. A healthy five-year-old does have GI and qualia and can pass the Turing test. So a necessary condition of GI and qualia and passing the Turin...
Yeah, thanks for the reply. When reading mine, don’t read its tone as hostile or overconfident; I’m just too lazy to tone-adjust for aesthetics and have scribbled down my thoughts quickly, so they come off as combative. I really know nothing on the topic of superintelligence and AI.