Epistemic Status: I don't particularly endorse this argument. Just curious to see whether people have any interesting or novel counterarguments.
- Suppose "Solve the Alignment Problem" is in the set of problems that an intelligent, motivated, conscientious group of human beings can potentially solve.
- Then it is also in the set of problems that a Bureaucracy Of AIs can solve, since AIs trained to mimic such a group can reproduce its work and scale it up.
- Now suppose it isn't in that set.
- Then no achievable effort will solve alignment and we are doomed either way: "Let us eat and drink, for tomorrow we die." (A formal sketch of the dilemma follows this list.)
What if there are some groups of (intelligent, motivated, conscientious, etc.) humans that could solve the alignment problem while other groups cannot? As I understand it, your idea is to have the AIs mimic one of these groups in order to scale up its progress. But maybe the group you mimic is one of the ones that can't solve it, while other groups could.
Also, I don't find it obvious that an AI trained to mimic a human's actions will produce the same consequences that the human would, because the AI might not deploy those actions in a strategically appropriate way.