I would like pointers to prior art on the following question: how do we communicate about potential risks from AI in a net-positive way, taking into account the downside of giving people ideas? I can easily see how describing the steps by which an AI takeover could happen might influence someone who is malicious or simply cares less about safety. This bears directly on the research I intend to pursue, which relates specific capabilities AI systems have or could acquire to the risk factors associated with them.
As soon as someone managed to turn ChatGPT into an agent (AutoGPT), someone else created an agent, ChaosGPT, with the explicit goal of destroying humankind. That is the kind of person who might benefit from what I intend to produce: an overview of the AI capabilities required to end the world, how far along we are in obtaining them, and so on. I want this information to be used to prevent an existential catastrophe, not to precipitate one.