Exciting to see this up and running!
If I'm understanding correctly, the system looks for modifications to certain viruses. So if someone modified a virus that NAO wasn't explicitly monitoring for modifications, then that would go undetected?
if someone modified a virus that NAO wasn't explicitly monitoring for modifications, then that would go undetected?
That's correct. But it's extremely cheap to monitor an additional virus, so there's not much downside to casting a large net.
Is that 0.2% of people “contributing” to the wastewater? I.e., if deployed in an airport, approximately 0.2% of daily airport users being infected might be the threshold for detection? If so, at SeaTac, that would mean around 300 infected users per day would be required to trigger the NAO if I am understanding you correctly.
Technically it's 0.2% cumulative incidence not 0.2% prevalence, but depending on the assumptions you make about how long infections last and how quickly they spread they're usually in the same ballpark.
Many SeaTac travelers do not defecate, so your effective sample size is smaller. Possibly too small for this to work well. This modeling is generally assuming larger sewersheds, like municipalities.
Gotcha. Last I emailed Kevin he was suggesting this would be deployed in airports rather than municipalities. So the plan has changed?
It’s true only a fraction of travelers defecate, but it still seems like you’d need an average of about 300 infected travelers/day in an airport setting to get 0.2% of the wastewater being from them? Or in a city of 1 million people, you’d need something like 2,000 infected?
suggesting this would be deployed in airports rather than municipalities. So the plan has changed?
We're also exploring airport monitoring, but airplane blackwater tanks, not terminals. Preliminary data from pooled tank samples (you collect between the truck that sucks it out of the planes and the dumping point) looks very good.
infected travelers/day in an airport setting to get .2% of the wastewater being from them
Sorry to keep harping on this, but it's 0.2% of wastewater from people who've ever been infected (cumulative incidence), not currently infected (prevalence). While shedding is primarily about prevalence (though it varies over the course of the infection), for evaluating a system we generally think cumulative incidence is more informative because it tells you much more about how far along the pandemic is.
Preliminary data from pooled tank samples (you collect between the truck that sucks it out of the planes and the dumping point) looks very good.
Setting aside economics or technology, would it in principle be possible to detect a variant of concern in flight and quarantine the passengers until further testing could be done?
Sorry to keep harping on this, but 0.2% of wastewater from people who've ever been infected (cumulative incidence), not currently infected (prevalence).
I appreciate the harping! So you're saying that your prelim results show that 0.2% of the sampled population would need to have at some point in the past been infected for the variant of concern to be detectable?
Setting aside economics or technology, would it in principle be possible to detect a variant of concern in flight and quarantine the passengers until further testing could be done?
There are two pretty different scenarios:
Initial detection: if you don't already know whether there's something out there, you'll need to do metagenomic sequencing or something similar to identify the pathogen. This is the part of the problem that the NAO is trying to solve. While I haven't looked deeply into the absolute-minimum-sequencing-time portion of the space, my understanding is that if you want a reasonable cost per read you need to use a sequencing method that (counting both the preparation and the sequencing machine running) takes multiple days. So not a good fit for per-flight testing.
Containment: we've learned about a pathogen somehow (ex: someone with unusual symptoms, metagenomic sequencing) and we're trying to keep it from spreading. Now we can use a targeted method, such as qPCR, for which there are stand-alone, speed-optimized options that take under an hour (ex: KrakenSense). In this case, the question is, how do you get the samples to test? Ideally you'd get everyone to give a sample before boarding, which you could do a pooled test on while the plane was in flight, but that requires infrastructure and cooperation with the originating country.
you're saying that your prelim results show that 0.2% of the sampled population would need to have at some point in the past been infected for the variant of concern to be detectable?
That's correct. Detection is fundamentally based on the people who are currently shedding copies of the virus, but our modeling counts "time" in terms of the progress of the infection through the population.
This represents work from several people at the NAO. Thanks especially to Dan Rice for implementing the duplicate junction detection, and to @Will Bradshaw and @mike_mclaren for editorial feedback.
Summary
If someone were to intentionally cause a stealth pandemic today, one of the ways they might do it is by modifying an existing virus. Over the past few months we’ve been working on building a computational pipeline that could flag evidence of this kind of genetic engineering, and we now have an initial pipeline working end to end. When given 35B read pairs of wastewater sequencing data it raises 14 alerts for manual review: 13 are quickly dismissible false positives, while the fourteenth is a known genetically engineered sequence derived from HIV. While it’s hard to get a good estimate before actually going and doing it, our best guess is that if this system were deployed at the scale of approximately $1.5M/y it could detect something genetically engineered that shed like SARS-CoV-2 before 0.2% of people in the monitored sewersheds had been infected.
System Design
The core of the system is based on two observations:
Translating these observations into sufficiently performant code that does not trigger alerts on common sequencing artifacts has taken some work, but we now have this running.
While it would be valuable to release our detector so that others can evaluate it or apply it to their own sequencing reads, knowing the details of how we have applied this algorithm could allow someone to engineer sequences that it would not be able to detect. While we would like to build a detection system that isn't any easier to bypass once you know how it works, we’re unfortunately not there yet.
Evaluation
We have evaluated the system in two ways: by measuring its performance on simulated genetically engineered genomes and by applying it to a real-world dataset collected by a partner lab.
Simulation
We chose a selection of 35 viruses that Virus Host DB categorizes as human-infecting viruses, with special attention to respiratory viruses.
For each virus we generated 1,000 simulated engineered genomes by inserting a random string of 500 bases at a random location within its genome. Then we ran InSilicoSeq 2.0.1 to generate 100,000 read pairs for each virus. We configured InSilicoSeq to use a sequencing error model approximating a NovaSeq, and to generate reads from fragments with a mean length of 187 bases and a standard deviation of 50. We also generated a set of control sequences, using the same process but without inserting any bases.
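The insertion step can be sketched in a few lines. This is a minimal illustration, not the actual pipeline code; the function name and toy genome are hypothetical, and read simulation itself is handled by InSilicoSeq:

```python
import random

BASES = "ACGT"

def insert_random_segment(genome, insert_len=500, rng=random):
    """Insert a random base string of length `insert_len` at a uniformly
    random position, mirroring the simulated-engineering step above."""
    insert = "".join(rng.choice(BASES) for _ in range(insert_len))
    pos = rng.randrange(len(genome) + 1)
    return genome[:pos] + insert + genome[pos:]

# Toy example with a random 1,000-base "genome"
genome = "".join(random.choice(BASES) for _ in range(1000))
engineered = insert_random_segment(genome)
```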
After running the pipeline on these 7,000,000 read pairs (100,000 per virus × 35 viruses × treatment vs control) we checked what fraction of reads were flagged as containing a suspicious junction. While normally we filter the output of the pipeline to only junctions that appear multiple times, we didn’t do that for these tests: with a real genetically engineered virus you would expect to see the same junction each time you happened to sequence that portion of its genome, but in these tests we didn’t insert at a single consistent location.
The pipeline flagged 1.2% of simulated engineered reads and 0.003% of control reads. How does 1.2% compare to what the pipeline should be able to do? We made a simple baseline model, where sequencing reads are uniformly distributed along the genome, there are no read errors, and a junction can only be flagged if it is at least 30 bases from the end of a sequencing read. Of the sequencing reads that the baseline model classifies as flaggable, the pipeline flagged 71%. We haven’t yet looked into why this is lower than predicted, and aren’t sure whether it’s the model being overly optimistic or the pipeline missing cases it ought to be able to identify.
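The baseline model can be estimated with a short Monte Carlo sketch. The genome length (30kb, SARS-CoV-2-like) and 150-base reads are assumptions not stated above, and a mid-genome insert position is arbitrary, so the resulting fraction is only illustrative:

```python
import random

def flaggable_fraction(genome_len=30_000, insert_len=500, read_len=150,
                       margin=30, n_reads=200_000, seed=0):
    """Monte Carlo version of the baseline model: read starts are uniform
    along the engineered genome, there are no sequencing errors, and a
    read is flaggable only if it covers an insert junction with at least
    `margin` bases on each side."""
    rng = random.Random(seed)
    total_len = genome_len + insert_len
    # Place the insert mid-genome; its two junctions sit at these positions.
    junctions = (genome_len // 2, genome_len // 2 + insert_len)
    flagged = 0
    for _ in range(n_reads):
        start = rng.randrange(total_len - read_len + 1)
        if any(start + margin <= j <= start + read_len - margin
               for j in junctions):
            flagged += 1
    return flagged / n_reads

est = flaggable_fraction()
print(f"baseline flaggable fraction: {est:.2%}")
```

Under these assumptions roughly 0.6% of reads are flaggable; shorter genomes push the fraction higher, which is consistent with the 1.2% average across the 35 viruses.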
This chart shows, for each virus, the fraction of sequencing reads flagged as containing a suspicious junction and the pipeline performance relative to the simple model:
Note that we don’t label the viruses on this chart, because that could be useful to someone in selecting a virus for the kind of attack we are attempting to detect.
Real World Evaluation
We partnered with Marc Johnson and Clayton Rushford at the University of Missouri (MU) to sequence municipal wastewater, primarily RNA, and collected 35 billion paired-end sequences across four runs. The sequencing was performed at the University of Missouri Genomics Technology Core on a NovaSeq 6000 with an S4 flow cell, and covered samples collected between December 2023 and April 2024.
Applying the pipeline to these 35 billion read pairs flags fourteen collections of reads for manual evaluation. Of these:
It’s possible to tune the pipeline to be more or less sensitive by requiring a different coverage level. If we turn up the sensitivity by flagging things for manual review on the first observation, it flags 12k suspicious junctions, far too many for manual review. If instead we turn down the sensitivity and require three observations it no longer flags any chimeras between gastrointestinal viruses and ribosomal RNA, and only flags the lentiviral vector and three sequences missing from the databases we use to screen out false positives.
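The thresholding described above amounts to counting observations per junction and filtering. A minimal sketch, where the string junction identifiers are hypothetical stand-ins (the post doesn't describe the pipeline's actual junction representation):

```python
from collections import Counter

def filter_junctions(junction_observations, min_observations=2):
    """Keep only junctions observed at least `min_observations` times,
    the tunable sensitivity knob described above."""
    counts = Counter(junction_observations)
    return {j: n for j, n in counts.items() if n >= min_observations}

observations = ["junctA", "junctA", "junctB", "junctA", "junctC", "junctC"]
print(filter_junctions(observations, min_observations=2))
# prints {'junctA': 3, 'junctC': 2}
print(filter_junctions(observations, min_observations=3))
# prints {'junctA': 3}
```

Raising `min_observations` trades sensitivity for a manageable manual-review load, as with the 12k junctions flagged at threshold one.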
We also partnered with Jason Rothman in Katrine Whiteson’s lab at the University of California, Irvine (UCI), in an additional municipal wastewater RNA sequencing effort. From the 25 billion paired-end sequences we’ve analyzed from this group, the pipeline flags 64 collections of reads for evaluation. While this is slightly too many for manual evaluation, spot checks show that a large fraction are a kind of artifact called “strand-split artifact reads”: sequencing reads that switch strand and direction partway through the read. It should be possible to identify and exclude these automatically, but that work remains to be done.
System Sensitivity
We’ve previously estimated that, in a week where 1% of people in a sewershed have contracted Covid-19, with the sequencing and sampling approach in Rothman et al. (2021) one in 7.7M municipal wastewater RNA sequencing reads would be from the SARS-CoV-2 genome, a relative abundance of 1.3e-7.
As noted above, the pipeline flags an average of only 1.2% of simulated engineered reads, primarily because most reads from an engineered pathogen won’t happen to include both the original virus and the modified section. Estimating from SARS-CoV-2, this gives a relative abundance of flaggable reads of 1.6e-9.
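The back-of-envelope arithmetic behind these two numbers:

```python
# Reproducing the relative-abundance estimate above.
sars_relative_abundance = 1 / 7.7e6   # ~1.3e-7: one read in 7.7M
flagged_fraction = 0.012              # 1.2% of engineered reads flagged
flaggable_abundance = sars_relative_abundance * flagged_fraction
print(f"{flaggable_abundance:.1e}")   # prints 1.6e-09
```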
The most cost-effective sequencer right now, in terms of cost per base, is the Illumina NovaSeq X. It can produce approximately 10B read pairs with a $9,600 flow cell or 25B with a $16,000 flow cell. The flow cell is usually around 3/4 of the total cost, so we estimate $13,000 for 10B and $22,000 for 25B. The sequencer runs for just over a day, and you lose some time to sequencing preparation at the beginning and bioinformatics at the end, so we’ll estimate five days end-to-end.
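The per-run figures follow directly from the flow-cell prices and the assumption that the flow cell is about 3/4 of the total run cost:

```python
# Per-run cost estimate: flow cell price divided by its share of total cost.
for flow_cell_cost, label in [(9_600, "10B flow cell"), (16_000, "25B flow cell")]:
    total = flow_cell_cost / 0.75
    print(f"{label}: ~${total:,.0f} per run")
# prints:
# 10B flow cell: ~$12,800 per run
# 25B flow cell: ~$21,333 per run
```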
If you were to run a NovaSeq X 25B once a week (approximately $1.5M/y) and needed to see a junction in two different samples before alerting, how many people in your catchment would be sick before the system raised an alert? Running our simulator (blog post, simulator data) we can get some estimates:
With a hypothetical genetically engineered virus that shed similarly to SARS-CoV-2, in the median case the simulator predicts 0.2% of people would have been infected before an alert was raised. Note that we’ve seen speculation that SARS-CoV-2 is unusually well-suited for wastewater-based detection as respiratory pathogens go, and until we have estimates for additional pathogens we’d recommend taking this with a grain of salt.
We also generated estimates for the cheaper but lower-output 10B flow cell:
The limit of detection is a bit over twice as high, but the cost only decreases by 40%, so the deeper sequencing option is better if there is sufficient funding available.
Future Work
We see this as a minimum-viable system for detecting genetically engineered pathogens, but there are multiple places we are eager to expand and improve it: