Nothing can be alllll that dangerous if it's known to literally everyone how it works (this is probably the crux where I differ wildly from the larger alignment community, if I had to guess)
I agree that this is the crux, which makes it odd that you're not presenting any argument for it. Do you think it's always true in every possible world, or that it just happens to be true for every realistic example you can think of?
See my response to StartAtTheEnd. I think it's possible in every world where an infohazard has a rate-limiting step that can be effectively regulated. Since every process has some rate-limiting step, and you can regulate anything if you really want to, it seems very unlikely to me that there's some magical technology which will destroy humanity but which cannot be regulated.
If it’s “very unlikely” rather than “impossible even in principle”, then I think you should have entitled your post: “There are a lot of things that people say are infohazards, but actually aren’t.” And then you can go through examples. Like maybe you can write sentences like:
And then we could talk about the different examples and whether the consequences of publishing it would be net good or bad. (Obviously, I’m on the “bad” side.)
I guess maybe my argument boils down to, the only thing that can stop a bad person with an infohazard is a larger, more paranoid, and more deadly coalition of state-level actors who also have the infohazard.
I wish you would engage with specific cases, instead of speaking in generalities. Suppose a domain expert figures out an easy way to make deadly novel pandemics using only widely-available equipment. You would want them to first disclose it to their local government (who exactly?), then publish it on their blog, right? How soon? The next day? With nice clear illustrated explanations of each step? What if they take out a billboard in Times Square directing people to the blog post? Is that praiseworthy? If not, why not? What do you expect to happen if they publish and publicize that blog post, and what do you expect to happen if they don't?
OK, let me engage with this specific case. I think the domain expert has to follow the standard procedure for disclosing a security vulnerability. This means notifying the relevant authorities (for a pandemic, the CDC, I suppose) through secure channels, then waiting for permission from the relevant authority to disclose. There could be an optional cap on how long the relevant authority can delay the process, for cases where the infohazard would provide an offensive capability that the researcher feels should not be taken advantage of by their local government, or at least not taken advantage of "too much", whatever that means to the researcher and the local government.
In the case of AI, I have no idea which US government agency considers itself responsible for dealing with AI risks, but I have to assume that the NSA is a good guess.
The other wrinkle when dealing with AI and the NSA is that it is extremely unclear what constitutes a "secure channel".
OK. So I contact the CDC. They say "if any crazy ideologue in the world were able to easily make and release a deadly novel pandemic, that would obviously be catastrophic. Defense against deadly novel pandemics is a very hard problem that we've been working on for decades and will not solve anytime soon. Did you know that COVID actually just happened and pandemic prevention funding is still woefully inadequate? And your step-by-step instructions only involve widely-available equipment, there's no targeted regulation for us to use to stop people from doing this. So anyway, publishing and publicizing that blog post would be one of the worst things that anyone has ever done and you personally and all your loved ones would probably be dead shortly thereafter."
And then I think to myself: "Am I concerned that the CDC itself will make and release deadly novel pandemics?" And my answer is "no".
And then I think to myself: "What would be a reasonable time cap before I publish the blog post and take out the Times Square billboard pointing people to it?" And my answer is "Infinity; why on earth would I ever want to do that? WTF?"
So I guess you'd say: Shame on me, I'm an infohoarder!!!! Right?
If that's your opinion, then what would you do differently? How soon would you publish the blog post, and why?
(Again I suggest the podcast that I linked in my earlier comment.)
The industry standard for disclosing security vulnerabilities is 90 days, so that would be my anchor point if the CDC refused to engage with me in serious dialogue and good faith on the question of pandemic prevention. If the CDC and its analogues want to continue to maintain good relations with the mad scientists of the world, they'll have to engage with crazy ideologues with cool tech specs who contact them through secure channels about matters of importance.
If the CDC did engage with me in serious dialogue and good faith, I would wait as long as they told me to wait, unless they seemed to be dragging their feet in order to obtain an unwarranted first mover advantage. Unwarranted here refers to the "too much" criterion outlined earlier, which is an inherently subjective conversation between the infopandora researcher and the relevant local government institutions.
So I would wait 90 days, plus or minus a few decades, depending on what my conscience told me over that period.
What is the “first mover advantage”? Are you worried about the CDC itself creating and releasing deadly novel global pandemics? To me, that seems like a crazy thing to be worried about. Nobody thinks that creating and releasing deadly novel global pandemics is a good idea, except for crazy ideologues like Seiichi Endo. Regrettably, crazy ideologues do exist. But they probably don’t exist among CDC employees.
I would expect the CDC to “engage with me in serious dialogue and good faith”. More specifically, I expect that I would show them the instructions, and they would say “Oh crap, that sucks. There’s nothing to do about that, except try to delay the dissemination of that information as long as possible, and meanwhile try to solve the general problem of pandemic prevention and mitigation. …Which we have already been trying to solve for decades. We’re making incremental progress but we sure aren’t going to finish anytime soon. If you want to work on the general problem of pandemic prevention and mitigation, that’s great! Go start a lab and apply for NIH grants. Go lobby politicians for more pandemic prevention funding. Go set up wastewater monitoring and invent better rapid diagnostics. Etc. etc. There’s plenty of work to do, and we need all the help we can get, God knows. And tell all your friends to work on pandemic prevention too.”
If the CDC says that, and then goes back to continuing the pandemic prevention projects that they were already working on, would you still advocate my publishing the blog post after 90 days? Can you spell out exactly what bad consequences you expect if I don’t publish it?
I'm mostly on the same page there, although in the case of AI I do worry about the NSA or whoever getting a first-mover advantage in terms of manufacturing consent. (This last concept being from Noam Chomsky's book of the same name, if you or other readers are unfamiliar with the term.)
Is that a thing to be terrified about, or a thing to celebrate and advocate? I'm deeply unsure. I think America is mostly better than its competitors in terms of the kind of craziness it puts out into the world and the kind of atrocities it commits and supports. I.e., a mix of good and bad crazy, and a relatively limited but still way-too-high number of atrocities.
What bad consequences can we expect from an unbalanced ability to manufacture consent? The same consequences Chomsky was pointing to in his original book - deep and unwavering public support for things that are undeniably atrocities, like (in Chomsky's case) the various horrible crimes the US committed in Vietnam, such as the use of Agent Orange, or the My Lai massacre.
These are the predictable and nigh-inevitable consequences I foresee from allowing a government monopoly on the use of strong large language models. We are, of course, in no danger of having a literal government monopoly, but we're in significant danger of having all capable large language models in the hands of either governments or large profit-driven corporations, which still seems like it could generate atrocities if we allow it to.
So, again, I converge to "90 days plus or minus a few decades, as my conscience and my desire to stay out of jail dictate".
I'm still talking about the pandemic thing, not AI. If we're "mostly on the same page" that publishing and publicizing the blog post (with a step-by-step recipe for making a deadly novel pandemic virus) is a bad idea, then I think you should edit your post, right?
Unfortunately, I lack the necessary karma to edit this post (or post a new one) for the next five days. I feel like I stand by what I wrote, as defended and clarified in the comments. My planned edits would mostly just be a discussion of the standard responsible disclosure policy that computer security researchers follow, and how I think it should be adapted in the light of anticipated planetary-scale impact.
So it seems like I've done what I am able to do for the moment. If you really think it's setting a bad example or fundamentally misguided about something important, I would write your own post outlining points of agreement and disagreement, or I would wait five days for my karma to reset.
Thank you for your insightful comments, questions, and pushback. You've helped me clarify my internal models and reframe them into much more standard, sane-sounding arguments.
OK, I'll add the stuff about responsible disclosure to the post. I agree that it's a very important omission from my internal mental model of things.
And, to maybe put too fine a point on it, literal DIY viral gain-of-function research is well within the grasp of crazy ideologues and always has been.
Don't wash your hands after taking a dump. Don't wear a mask to large public gatherings during COVID season. Don't vaccinate your children against dangerous/deadly diseases that were previously reduced to nuisance levels. Have unprotected sex with people you just met. Eat random animals that have enough homologous proteins to act as animal reservoirs for nasty diseases, while making sure to confine those animals in horrible overcrowded conditions that make sure that they do act as animal reservoirs. Overuse antibiotics literally every chance you get.
Is the CDC supposed to suppress this information? Of course not, their whole mission in life is to spread all the information in the above paragraph to anybody who will listen to it. And, at least in the case of COVID, they are doing and have done a piss-poor job of it. Should the rationalist community have suppressed all knowledge of COVID between December 2019 and March 2020? Or is it a badge of honor for the community that they/we were the only major intellectual community to react with the appropriate level of paranoia and infopandoraing that I am advocating in this post and comments?
Your arguments here seem to be:
It is already possible for an individual to do something that has a 0.0000...0000001% chance of causing a deadly global pandemic after 6 months of work. Therefore, what's the harm in disseminating information about a procedure that has a 90% chance of causing a deadly global pandemic after 6 months of work?
If there is already a pandemic spreading around the world, it's good to publicly talk about it. Therefore, it is also good to publicly talk about how, step by step, an individual could create a deadly novel pandemic using widely-available equipment.
Do you endorse those? If those aren't the argument you were making in that comment, then can you clarify?
I endorse (I think?) a modified version of that argument. If it is possible for an individual to do something that has a 50% chance of causing a deadly global pandemic after 6 months of work, in their subjective and limited view of what is possible, and they are 99% sure that they aren't losing their damn marbles, and the relevant global pandemic is already in full swing, then they can and should disseminate the information, after a responsible disclosure process with their local government to make sure that the 90% chance can be prevented by timely regulation and, if necessary, legal and military interventions.
I think you’re misunderstanding something very basic about infectious disease, or else we’re miscommunicating somehow.
You wrote “…has a 50% chance of causing a deadly global pandemic after 6 months of work…”, and you also wrote “…and the relevant global pandemic is already in full swing…”. Those are contradictory.
If virus X is already uncontrollably spreading around the world, then I don’t care about someone knowing how to manufacture virus X, and nobody else cares either. That’s not the problem. I care about somebody knowing how to take lab equipment and manufacture a new different virus Y, such that immunity to X (or to any currently-circulating virus) does not confer immunity to Y. I keep saying “novel pandemic”. The definition of a “novel pandemic” is that nobody is immune to it (and typically, also, we don’t already have vaccines). COVID is not much worse than seasonal flu once people have already caught it once (or been vaccinated), but the spread of COVID was a catastrophe because it was novel—everyone was catching it for the first time, and there were no vaccines yet. And it’s possible for novel pandemics to be much much worse than COVID.
If somebody synthesizes and releases novel viruses A,B,C,D,E,F, each of which is highly infectious and has 70% mortality rate, then we have to invent and test and manufacture and distribute six brand new vaccines in parallel, everybody needs to get six shots, the default expectation is that you’re going to get deathly ill six times rather than once, etc. You understand this, right?
The other reason that I think you have some basic confusion is your earlier comment that basically said:
To me, going from the first bullet point to the second one is such a flagrant non sequitur that it makes my head spin to imagine what you could possibly be thinking. So again, I think there’s some very basic confusion here about what I’m talking about and/or how infectious disease works.
I think we're kind of talking past each other, and it's a matter of emphasis.
The details for how to do gain-of-function research have been reasonably public for decades. Biology class exists, and is in fact mandatory to graduate high school. Coursera has free biology classes for some subjects and cheap ones for others. I don't see how any of that is meaningfully different from taking out a billboard in Times Square and telling people that if they study the relevant literature real hard, they too can create deadly pandemics in their basement with kidnapped feral cats.
Lest you think I'm being flippant, I asked GPT-4 to describe the 2014 pause in gain-of-function research, and this is what it said:
"""
The pause in gain-of-function (GOF) research in biology, particularly in the United States, refers to a moratorium that was implemented in October 2014. This moratorium was a response to public concerns about the safety and security risks associated with such research. Gain-of-function research involves manipulating viruses or other organisms to increase their capabilities, like making a virus more transmissible or deadly, often to understand more about disease pathways and to develop treatments or vaccines.
Here's an overview of how the pause came about:
1. **Rising Concerns**: Prior to 2014, there had been increasing public and scientific debate over the risks and benefits of gain-of-function research. High-profile experiments, such as those involving H5N1 influenza (bird flu) that made it transmissible in mammals, raised alarm about the potential for accidental or deliberate misuse of these pathogens.
2. **The U.S. Government's Decision**: In response to these concerns, the U.S. government announced a funding pause for any new gain-of-function research that could make influenza, MERS, or SARS viruses more virulent or transmissible. This decision was made to allow time for a thorough assessment of the risks and benefits of this type of research.
3. **Deliberations and Recommendations**: During the moratorium, the National Science Advisory Board for Biosecurity (NSABB) and other bodies were tasked with evaluating the risks and developing a framework for future GOF research. This process involved extensive consultation with researchers, bioethicists, and the public.
4. **Lifting the Pause**: The pause was lifted in December 2017, following the establishment of the HHS P3CO (Potential Pandemic Pathogen Care and Oversight) Framework. This framework set out new guidelines and oversight mechanisms for gain-of-function research, aimed at minimizing risks while allowing important scientific research to continue.
The pause was significant as it highlighted the complex ethical, safety, and security issues surrounding gain-of-function research. It also underscored the need for robust regulatory frameworks to balance scientific advancement with risk management.
"""
To me, this sounds a lot more like my infopandora story than the typical rationalist infohoarder story about how one deals with infohazards. There was a clear danger, the public was alerted, the public was unhappy, changes were made, research was directed into narrower, safer channels, and society went back to doing its thing.
if they study the relevant literature real hard, they too can create deadly pandemics in their basement with kidnapped feral cats.
Here’s a question:
Question A: Suppose that a person really really wants to create a novel strain of measles that is bio-engineered to be resistant to the current measles vaccine. This person has high but not extreme intelligence and conscientiousness, and has a high school biology education, and has 6 months to spend, and has a budget of $10,000, and has access to typical community college biology laboratory equipment. What’s the probability that they succeed?
I feel extremely strongly that the answer right now is “≈0%”. That’s based for example on this podcast interview with one of the world experts on those kinds of questions.
What do you think the answer to Question A is?
If you agree with me that the answer right now is “≈0%”, then I have a follow-up question:
Question B: Suppose I give you a magic wand. If you wave the wand, it will instantaneously change the answer to Question A to be “90%”. Would you wave that wand or not?
(My answer is “obviously no”.)
There was a clear danger, the public was alerted, the public was unhappy, changes were made, research was directed into narrower, safer channels, and society went back to doing its thing.
Do you see the difference?
If you’re confused by the biology example, here’s a physics one:
Hm, perhaps you have convinced me that I am omitting something important from my description of my mental models. I don't think my internal mental models are as deeply messed up as our disagreements about these topics suggest they would be. But maybe they are?
I would indeed wave a slightly different magic wand, which is labeled "every government on Earth knows what gain-of-function research is and how to do it, if they didn't already, but my government gets first crack at it for as long as they like and will ask for". But that is a fundamentally different wand than the one you were asking about.
So probably the disconnect in emphasis comes down to my unusual political beliefs and the resulting wild divergence in political priorities.
This last wand being approximately the bargain that Oppenheimer struck with the American government in running the Manhattan Project.
Which argument, if you accept it, has the inevitable consequence that the safest and most moral pivotal act is to solve the alignment problem, solve AI, then post both of them to github in the same short span of a few days.
And to answer the obvious subtext: what if AI is a dangerous technology? This is the one thing that is agreed upon by all members of the alignment community, since that's what it means to be a member of the alignment community.
What's the rate-limiting step for AI? Data, compute, algorithms: all those barn doors have been open for all of human history, and the horse is no longer in the stable as far as I can tell.
The only rate-limiting step left is public trust that AI has any idea what it's talking about. So that's the resource we need the government to limit access to. This is what I plan to do with Peacecraft.ai. It's also the motivation behind my post on the Snuggle/Date/Slap protocol, another one that LessWrong hated.
I’m confused. How do you explain the fact that we don’t currently have human-level AI—e.g., AI that is as good as Ed Witten at publishing original string theory papers, or an AI that can earn a billion dollars with no human intervention whatsoever? Or do you think we do have such AIs already? (If we don’t already have such AIs, then what do you mean by “the horse is no longer in the stable”?)
I think that coordination problems are the only thing preventing that AI from existing, I guess. And coordination problems are solvable.
If I had to guess (which I don't, but I like to make guesses), I would say that the NSA is likely going to be the first entity to achieve an aligned superintelligence. And I don't see how they can be more than a day behind OpenAI with all their zero-days. So, you know. That horse is on its way out the stable, and the people who want the horse to go somewhere particular have a lot of guns.
I don't understand what you're trying to say. What exactly are the "coordination problems" that prevent true human-level AI from having already been created last year?
People squabbling about who gets to own it, as exemplified by the OpenAI board chaos.
People afraid of what it might do once it exists, as exemplified by the alignment community's paranoia and infohoarding.
If everyone just released all the knowledge available today (if we were all infopandoras), I predict we would have human-level AI within a matter of months. Do we want that? It is very unclear what people want, and they all disagree. Again, coordination problems.
Perhaps another crux is just one's political intuitions. Most people think of power as a mix of good and bad things. Mine are what I tend to call anarcho-realist - I think power is a mix of bad and good things. This slight change in emphasis causes wildly divergent intuitions and thus proposed solutions to various coordination problems.
I feel like examples would help here, if we can come up with some "safe" examples which are still valid.
One example which comes to mind for me is "Free will". Suppose that such a thing didn't exist and I had a proof, wouldn't this be both useless and dangerous knowledge?
An example which may be more in line with what you have in mind is "A huge solar-flare is coming mid-2025". Or perhaps "The recipe for (dangerous chemical)"? In any case, your argument seems to point to personally useful but collectively harmful information, such that people who hoard this information do so for their own advantage or because they don't trust other people to know it.
I Googled the topic hoping to find examples of infohazards, but instead I found an EA megapost titled "We summarized the top info hazard articles". You might already be familiar with them, but in case you're not, they might have addressed your (and likely my) points already?
I think free will is a great example. I think its existence is deeply questionable, in that it arguably does exist from a certain point of view (the intentional stance) and arguably not from other points of view (say, a purely physical point of view).
This knowledge seems both useless and dangerous, like you say. People can't do much to change it, so it's useless. People who know the infohazard can presumably use the lack of free will's existence to compel certain behaviors.
Except that's the world we already live in. Governments, authoritarian and otherwise, already compel people to do stuff all the time. Professional philosophers are quite clear with the public that free will is slippery and might well not exist. People generally go about their lives talking as if they have free will, but they certainly don't consistently act like they have free will. It all adds up to normality, as Yudkowsky says.
So I would argue that the existence or lack thereof of free will is actually a strong, strong example in favor of the infopandora approach.
The "recipe for dangerous chemical" example also seems like it doesn't stand up to scrutiny. Everyone knows what it means when your teenage son is taking hour-long showers every chance he gets and his socks are super stiff whenever they get washed. Everyone also knows what it means when the local conspiracy nut starts purchasing large amounts of fuel oil and ammonium nitrate and trying to rent a van. Everyone knows what it means when your non-nuclear nation state starts trying to purchase the equipment to machine high-precision aluminum tubes to build centrifuges.
Rather than hoarding knowledge of how to (respectively) masturbate, pull an Oklahoma City, or launch a Manhattan Project, we trust responsible adults and responsible governments to know these things and to control them by controlling access to the rate-limiting step. If the US doesn't want Iran to have nuclear weapons (which seems like a no-brainer for American foreign policy), then it's gonna be assassinating and coercing scientists and whatnot.
Again, it all adds up to normality.
Nothing can be alllll that dangerous if it's known to literally everyone how it works
I agree that seems like a likely point of divergence, and could use further elaboration. If some piece of information is a dangerous secret when it's known by one person, how does universal access make it safe?
As an example, if physics permitted the construction of a backpack-sized megatonne-yield explosive from a mixture of common household items, having that recipe be known by everyone doesn't seem intuitively to remove the inherent danger.
Universal knowledge might allow us to react and start regulating access to the necessary items, but suppose it's a very inconvenient world where the ingredients are ubiquitous and essential.
See my response to StartAtTheEnd. I agree that your case would be a counterexample if it were possible. Let's hope it's not possible in a way that's known to any infopandoras!
In this post, I give my rules of thumb for dealing with infohazards, which are as far as I can tell wildly divergent from the accepted views of the larger alignment community. I wrote this post less as a means of propagating my own views, and more because I'm curious as to why my views are so uncommon.
My Infohazard Rules of Thumb
Acknowledgments
I kind of expect this post to be wildly unpopular, just because of how wildly the models presented diverge from the accepted wisdom in the alignment community. So I struggled with the question of whether to cite the people who influenced this post. Ultimately, since I identify as an infopandora, I thought I would just bite the bullet and cite my main influence, which is @jessicata. I've never met her and she doesn't know I exist, but a lot of this mental framework is derived from reading her public writings and just kind of pondering the related questions for a few years.