[Cross-posted from the EA Forum. The EA Forum version of this post is for both half-baked EA ideas and half-baked AI Safety ideas, whereas this version of the post is for half-baked AI Safety ideas specifically.]
I keep having ideas related to AI safety, but I keep not having enough time to really think those ideas through, let alone try to implement them. Practically, my alternatives are either to post something half-baked or not to post at all. I don't want to spam the group with half-thought-through posts, but I also want to post these ideas, even in their current state, in case some of them do have merit and the post inspires someone to take them up.
Originally I was going to start writing up some of these ideas in my Shortform, but I figured that if I have this dilemma, other people likely do as well. So, to encourage others to at least post their half-baked ideas somewhere, I am putting up this post as a place where people can post their own ideas without worrying about formulating them to the point where they'd merit their own post.
If you have several ideas, please post them in separate comments so that people can consider each of them individually. Unless of course they're closely related to each other, in which case it might be best to post them together - use your best judgment.
[This post was also inspired by a suggestion from Zvi to create something similar to my AGI Safety FAQ / all-dumb-questions-allowed thread, but for ideas / potentially dumb solutions rather than questions.]
Mental Impoverishment
We should be trying to create mentally impoverished AGI, not profoundly knowledgeable AGI — no matter how difficult this is relative to the current approach of starting by feeding our AIs a profound amount of knowledge.
If a healthy five-year-old[1] (who clearly lacks profound knowledge) has general intelligence (GI) and qualia and can pass the Turing test, then profound knowledge isn't a necessary condition of GI, qualia, or passing the Turing test. A healthy five-year-old does have GI and qualia and can pass the Turing test. So profound knowledge isn't a necessary condition of GI, qualia, or passing the Turing test.
If GI and qualia and the ability to pass the Turing test don't require profound knowledge in order to arise in a biological system, then GI and qualia and the ability to pass the Turing test don't require profound knowledge in order to arise in a synthetic material [this premise seems to follow from the plausible assumption of substrate-independence]. GI and qualia and the ability to pass the Turing test don't require profound knowledge in order to arise in a biological system. So GI and qualia and the ability to pass the Turing test don't require profound knowledge in order to arise in a synthetic material.
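Both of these arguments share the same modus ponens shape, with the first argument's conclusion feeding into the second. For anyone who likes their syllogisms machine-checked, here is a minimal Lean 4 sketch of that shared structure; the proposition names are placeholders of my own, standing in for the informal claims above:

```lean
-- Placeholder propositions standing in for the informal claims above.
variable (FiveYearOldPasses : Prop) -- a healthy five-year-old has GI and qualia and passes the Turing test
variable (NotNeededInBio : Prop)    -- profound knowledge isn't needed for GI/qualia/Turing-passing in a biological system
variable (NotNeededInSynth : Prop)  -- ...nor for them to arise in a synthetic material

-- First argument: the conditional premise plus its antecedent yields the conclusion.
example (h1 : FiveYearOldPasses → NotNeededInBio) (h2 : FiveYearOldPasses) :
    NotNeededInBio := h1 h2

-- Second argument: substrate-independence as a conditional, applied the same way.
example (h3 : NotNeededInBio → NotNeededInSynth) (h4 : NotNeededInBio) :
    NotNeededInSynth := h3 h4
```

The formalization doesn't make the premises true, of course; it only shows that if the conditional premises and the factual claims hold, the conclusions follow.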
A GI with qualia and the ability to pass the Turing test which arises in a synthetic material and doesn't have profound knowledge is much less dangerous than one which arises in a synthetic material and does have profound knowledge. (This also seems to be true of (a) a synthetic-housed GI without qualia and without the ability to pass the Turing test that doesn't have profound knowledge, and (b) a synthetic-housed GI without qualia but with the ability to pass the Turing test that doesn't have profound knowledge.)
So we ought to be trying to create either (A) a synthetic-housed GI that can pass the Turing test without qualia and without profound knowledge, or (B) a synthetic-housed GI that can pass the Turing test with qualia and without profound knowledge.
Creating either (A) or (B) is preferable to our current path, no matter how long it delays the arrival of AGI. In other words, it is better that we create AGI in 100,000 years than in 20 if creating AGI in 20 years means humanity's loss of dominance or its destruction.
My arguable assumption is that what makes a five-year-old generally less dangerous than, say, an adult Einstein is a relatively profound lack of knowledge (even physical know-how seems to be a form of knowledge). All other things being equal, a five-year-old who knows how to create a pipe bomb is just as dangerous as an adult Einstein with the same knowledge, where "knowledge" means something like "accessible complete understanding of x."