I've done over 200 hours of research on this topic and have read basically all the sources the article cites. That said, I don't agree with all of the claims. I do not think the SARS-CoV-2 virus is very likely to have been created using the RaTG13 virus, because the genetic differences are spread throughout the genomes. However, there are many other paths that could have led to a lab escape, and I'm somewhat agnostic between several of them.
I don't have a lot of time to investigate this further, but if someone were going to spend serious time on it, I'd be happy to have several calls with them, discuss sources & share my notes with them. At this point I think a lab leak is more likely than not, with the strongest piece of evidence being the confluence of the location of the first known outbreak + location of the world's top lab studying SARS-like coronaviruses + absence of related viruses detected nearby + absence of evidence of any other plausible origin.
I highly recommend following Alina Chan on Twitter, who has done a lot of interesting work on this question & has appeared to me to be pretty discerning. https://twitter.com/Ayjchan
If I were going to spend a bunch more time on this, I'd try to conduct an estimate using a Bayesian model, probably starting here: https://www.rootclaim.com/analysis/what-is-the-source-of-covid-19-sars-cov-2 and creating my own estimates for each claim + writing out arguments for why.
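A minimal sketch of what such an estimate could look like, assuming the odds form of Bayes' rule and assuming the pieces of evidence are conditionally independent (a strong assumption that a real analysis would need to defend). The function name and all numbers are hypothetical placeholders, not estimates taken from Rootclaim or any other source:

```python
def posterior_probability(prior, likelihood_ratios):
    """Update a prior probability with a sequence of likelihood ratios.

    Each ratio is P(evidence | hypothesis) / P(evidence | alternative).
    Multiplying the ratios together assumes the pieces of evidence are
    conditionally independent given each hypothesis.
    """
    if not 0.0 < prior < 1.0:
        raise ValueError("prior must be strictly between 0 and 1")
    odds = prior / (1.0 - prior)  # convert probability to odds
    for ratio in likelihood_ratios:
        if ratio <= 0:
            raise ValueError("likelihood ratios must be positive")
        odds *= ratio  # one Bayesian update per piece of evidence
    return odds / (1.0 + odds)  # convert odds back to probability


# Hypothetical placeholder inputs, chosen only to show the mechanics.
example_prior = 0.5
example_ratios = [2.0, 0.5, 3.0]
example_posterior = posterior_probability(example_prior, example_ratios)
print(f"posterior = {example_posterior:.3f}")
```

Writing out an argument for each ratio, as suggested above, is the substantive part; the arithmetic only keeps the bookkeeping honest.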
A random observation I want to note here is the relative lack of good disagreement I've seen around questions of SARS-CoV-2 origin. I've mostly seen people arguing past each other or trying to immediately dismiss each other. This seems true of experts in the space in addition to non-experts. I'd love to see better structured disagreement, i.e. back and forth in journals or other public forums. This might be a good topic for adversarial collaboration.
For context, I have a background in evolutionary theory (though nothing specific to viruses or pathogens) and have recently transitioned from part time to full time research in the longtermist biosecurity space.
When investigating this question, I found researchers' arguments pretty easy to follow, but found some of the claims about ease of engineering to be hard to follow because they often relied on tacit knowledge like "how hard / expensive is it to make an infectious clone of a new coronavirus". And some of the more technical molecular phylogenetics was difficult as well (what can we infer from dN/dS of various parts of the SARS-CoV-2 vs. RaTG13 genomes, and how does selection for codon preference influence this analysis). I'd love to talk with someone who feels like they have a good grasp of either of these areas.
This may be a bit of a pedantic comment, but I'm a bit confused by how your comment starts:
I've done over 200 hours of research on this topic and have read basically all the sources the article cites. That said, I don't agree with all of the claims.
The "That said, ..." part seems to imply that what follows is surprising, as though the reader expects you to agree with all the claims. But isn't the default presumption, if you've done a whole bunch of research into some controversial question, that the evidence is mixed?
In other words, when I hear, "I've...
With regard to the rootclaim link, I agree that it would be good to try to adapt what they've done to our own beliefs. However, I want to urge some caution with regard to the actual calculation shown on that website. The event to which they give a whopping 81% probability, "the virus was developed during gain-of-function research and was released by accident," is a conjunction of two independent theses. We have to be very cautious about such statements, as pointed out in the Rationality A-Z, here https://www.lesswrong.com/s/5g5TkQTe9rmPS5vvM/p/Yq6aA4M3JKWaQepPJ
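To make the conjunction point concrete, here is a small sketch, assuming the two theses are treated as independent as described above (if they are not, the second factor should be the conditional probability of the second thesis given the first). The function names are hypothetical; the only figure taken from the thread is the 81%:

```python
import math


def conjunction_probability(p_first, p_second_given_first):
    """P(A and B) = P(A) * P(B | A); never larger than either factor."""
    return p_first * p_second_given_first


def equal_component_probability(p_joint):
    """Probability each of two equally likely, independent theses
    would need for their conjunction to reach p_joint."""
    return math.sqrt(p_joint)


joint = 0.81  # the probability quoted for the combined claim
needed_each = equal_component_probability(joint)
print(f"each thesis would need about {needed_each:.2f}")

# Sanity check: a conjunction cannot be more probable than its parts.
assert conjunction_probability(0.9, 0.7) <= min(0.9, 0.7)
```

So a joint figure of 81% implicitly commits to each component being at least that probable, which is worth checking claim by claim.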
How do you reconcile the hypothesis that it escaped from a lab in China with the reports that covid-19 antibodies were found in more than a dozen blood samples taken in Italy in early October 2019, and that it therefore must have been circulating in Italy in September 2019?
This makes a fairly convincing case that a lab origin is plausible and even likely:
https://nicholaswade.medium.com/origin-of-covid-following-the-clues-6f03564c038
Here is the link to a recent relatively thorough article using Bayesian analysis to argue for a laboratory release of the virus.
https://zenodo.org/record/4477081/files/SQuay_Bayesian%20Analysis%20of%20SARS-CoV-2%20FINAL%20V.2.pdf
"A Bayesian analysis concludes beyond a reasonable doubt that SARS-CoV-2 is not a natural zoonosis but instead is laboratory derived"
In this context, I'm an ordinary civilian with no specialized knowledge of genetics or virology. I read the linked article, and saw an interesting story built off relevant history and lots of conjecture. I've also read the comments up to this point. Here are the arguments as I see them right now:
SARS-CoV-2 (hereafter "the Virus") must have an origin. Viral origins can be natural or artificial. Natural viruses evolve from other strains, sometimes in animals. The Virus shares a lot of characteristics with a certain bat virus, which is evidence it evolved in bats and then transferred to humans. It is therefore (at least originally) natural. Nobody seems to have any issues up to this point.
The question of how the Virus got to humans has a number of possible answers that seem to boil down to:
In favor of the random mutation possibility:
In favor of engineering, either accidental or deliberate:
It all seems circumstantial to me except the examination of the genome. Even the article started off admitting that there is a distinct lack of hard evidence here. If anything, the balance of evidence looks (to my eye) to be (barely) in favor of a random mutation resulting from the massive exposure of workers to the bat virus. We know that the Virus causes lots of asymptomatic infections, so I see no reason to believe that those workers didn't pass it on before they were hospitalized for their own infections. We know samples of the bat virus spent time in a lab, but the lab time doesn't seem to be necessary for the Virus to reach the general population given the presence of the workers. Given the known behavior of the Virus, I claim it could easily have moved unchecked from Mojiang to Wuhan as a barely noticed, mostly asymptomatic infection the same way we saw it spread mostly untracked through a number of other countries while we were watching for it. It then popped up in Wuhan the same way hotspots have been popping up all over the world ever since.
To clarify, when you say 'originated in a laboratory' do you mean engineered in a laboratory; evolved in an intentionally infected lab animal; or transiently stored in a laboratory?
These are very different hypotheses, but are often conflated.
So far, there is strong evidence that it was not engineered [the spike protein is novel, and not something from the geneticist's toolbox], and I haven't seen any evidence that would favor lab storage or evolution over the much larger wild populations of bats and intermediary animal hosts.
I mean to include all the alternatives that involve the virus passing through a laboratory before spreading to humans; so all the options you list are included. There's nothing wrong with asking about the probability of a composite event.
I read this article in NY magazine, on the coronavirus lab-leak hypothesis. I found it fairly convincing, so I think the chance that SARS-CoV-2 passed through some human laboratory before spreading to people is better than even. But when I tried following up by reading other sources, I found myself falling into the trap of trying to just confirm my existing belief. How might we estimate a probability that SARS-CoV-2 passed through a human laboratory at some point before spreading to humans?