I endorse not widely sharing info that could destroy the world (or accelerate us more than it helps ensure aligned AGI). Starting with smaller private discussion seems great to me.
My personal course of action has been to avoid this topic. I have specific thoughts about what current AI is not doing that bars it from becoming AGI, specific thoughts about which lines of research are most likely to lead to AGI, and arguments for both, and I've decided to keep them to myself. My reasoning: if I'm right, sharing them marginally accelerates AGI development; if I'm wrong, whatever I say is likely neutral; so it's all downside. That said, I keep open the option to change my mind if I encounter safety-related work that hinges on these thoughts (and I've hinted at earlier versions of them in this space as part of previous safety writing, to make the case for why I think certain things matter to building aligned AGI).
In particular, it's not enough that they be useful, responsible, competent, and/or well-meaning.
Even if they want to make everyone ponies, there's a decision theory according to which they won't make you regret approaching them with a secret.
Are you an engineer?
I can't say I am one, but I am currently working on research and prototyping, and I will probably keep to that until I can prove some of my hypotheses, since I do have access to the tools I need at the moment.
Still, I didn't want this post to be relevant only to my case; as I stated, I don't think the probability of success is meaningful. But I am interested in the community's opinions on other, similar cases.
edit: It's kinda hard to answer your comment since it keeps changing every time I refresh. By "can't say I am one" I was referring to the "world-class engineer" in the original comment. I do appreciate the change of tone in the final (?) version, though :)
Building a scaled-down version is perfectly safe, provided you limit its compute (which happens by default) and don't put it in the path of any important systems (which also happens by default).
There are a bunch of ways that could go wrong. The most obvious would be somebody else seeing what you were doing and scaling it up, followed by it hacking its way out and putting itself in whatever systems it pleased. But there are others, especially since if it does work, you are going to want to scale it up, which you may or may not be able to do while keeping...
I've recently been thinking about, and writing a post on, a potential AGI architecture that seems possible to build with current technology in 3 to 5 years, and even faster if significant effort is put toward that goal.
It is a bold claim, and the architecture may very well not be feasible, but it got me thinking about the memetic hazard posed by similar posts.
It might very well be true that there is an architecture out there that combines current AI tech in such a way as to create AGI; in that case, should we treat it as a memetic hazard? If so, what is the right course of action?
I'm thinking that the best thing to do is to covertly discuss it with the AI safety crowd, both to understand its feasibility and to start working on how to keep this particular architecture aligned (which is a much easier task than aligning something when you don't even know what it will look like).
What are your thoughts on this matter?