The conjunction of "Llama-2 can give accurate instructions for making anthrax" and "Anthrax recipes are hard to obtain, apart from Llama-2," is almost certainly false.
We know that it's hard to make biological weapons with LLMs, because Dario Amodei testified before the US Congress that the most advanced models Anthropic has cannot yet reliably give instructions for making such biological weapons. But Anthropic's most advanced models are way, way better than Llama-2 -- so if the most advanced models Anthropic has can't do it, Llama-2 almost certainly cannot. (Either that or anthrax has accurate instructions for it scattered everywhere on the internet and is an unusually easy biological agent to make, such that Llama-2 did pick it up -- but again, that means Llama-2 isn't particularly a problem!)
I'm sure that if you asked a delobotomized version of Llama-2 for instructions, it would give you instructions that sound scary, but that's an entirely different matter.
Either that or anthrax has accurate instructions for it scattered everywhere on the internet and is an unusually easy biological agent to make, such that Llama-2 did pick it up -- but again, that means Llama-2 isn't particularly a problem!
Hard disagree. These techniques are so much more worrying if you don't have to piece together instructions from different locations and assess the reliability of comments on random forums.
Yeah, terrorists are often not very bright, conscientious, or creative.[1] I think rationalist-y types might systematically overestimate how much proliferation of non-novel information can still be bad, via giving scary ideas to scary people.
[1] No offense intended to any members of the terror community reading this comment.
We know that it's hard to make biological weapons with LLMs, because Dario Amodei testified before the US Congress that the most advanced models Anthropic has cannot yet reliably give instructions for making such biological weapons.
Fwiw I take this as moderate but not overwhelming evidence. (I think I agree with the rest of your comment; just flagging that this seemed slightly overstated.)
It's a bit ambiguous, but I personally interpreted the Center for Humane Technology's claims here in a way that would be compatible with Dario's comments:
"Today, certain steps in bioweapons production involve knowledge that can’t be found on Google or in textbooks and requires a high level of specialized expertise — this being one of the things that currently keeps us safe from attacks," he added.
He said today’s AI tools can help fill in "some of these steps," though they can do this "incompletely and unreliably." But he said today’s AI is already showing these "nascent signs of danger," and said his company believes it will be much closer just a few years from now.
"A straightforward extrapolation of today’s systems to those we expect to see in two to three years suggests a substantial risk that AI systems will be able to fill in all the missing pieces, enabling many more actors to carry out large-scale biological attacks," he said. "We believe this represents a grave threat to U.S. national security."
If Tristan Harris was, however, making the stronger claim that jailbroken Llama-2 could already supply all the instructions needed to produce anthrax, that would be much more concerning than my initial read.
Why was "Tristan" qualified to attend but not Eliezer? When is this community going to stop putting up with the denigration of its actual experts and the elevation of imposters?
Ah... well, one perspective is that the world runs substantially on prestige, and rationalists tend not to play that game. There are not many buttons for "but actually you should get serious about listening to the people who were repeatedly right about very important things way before anyone else was". That kind of track record is often barely any currency at all in the games that decide who gets a seat at the table.
From this perspective, if one gives up the pursuit of prestige in favor of actually getting things done, and does not focus on signaling that one is the person who got it done, then one often does not get to take credit or be listened to on those matters.
More broadly, getting angry or bitter about not having the respect and power one thinks one has earned seems to me like it can cause people to waste a lot of energy for no results. I would be more willing to lean into the parts of me that feel angry or bitter about this if I expected doing so to have a decent shot of paying off in terms of correcting the credit allocation in the long term. I currently expect it does not.
But, for instance, on the current margin it seems to me that people have few good ideas for AI policy at all. I would be proud for rationalists to figure out and share some actually good policies, even if they aren't the people who get the credit for coming up with them or implementing them.
[Disclaimer: there are multiple perspectives on the situation, and to be clear, if a rationalist saw an opportunity to wield more power in an honorable and truthful way, one that would not warp their epistemic environment and sanity, then I would heartily encourage them to do so.]
This comment confuses me.