I also had too-strong priors and "expert" ideas to be properly fox-like in my predictions, and was not quick enough to update about how things were actually going based on the data. Because I was slow to move from the base-rate, I underestimated the severity of COVID-19 for too long. I'm unsure how to fix that, since most of the time it's the right move, and paying attention to every new event is very expensive in terms of mental energy. (Suggestions welcome!)
Bottom line: in order to outperform base-rate, somebody somewhere has to do the expensive updates. No way around that. So the options are (a) do the expensive updates yourself, (b) give up and go with base rate, (c) find someone else who's doing the updates. (c) is the obvious answer, but bear in mind that recognizing real expertise is Hard, and you've already noticed just how questionable the "expertise" of many supposed experts actually is.
In the easy case, the nominal experts are in a position where we get reasonably-frequent feedback on their performance (which obviously is usually not the case for rare events). In principle, one might get around that by an expert specializing in noticing rare events across domains.
Thank you - and I strongly endorse this answer. And now that you point this out, I realize that it should have been clear. I have speculated in the past that a large part of the value of Superforecasting is that there are people actually motivated to investigate and do the expensive updating. I have also said that I'm unsure how worthwhile it is to pay for the time of the types of people who can superforecast. This seems like a clear case where it is worthwhile, if only it worked.
Given that, I think there's a strong case that we need large rewards for early correct updates away from consensus, especially for very rare events. (In a case like COVID, the value of faster information is in the tens or hundreds of billions of dollars. A tiny fraction of that would be more than enough.) But the typical time-weighted forecast scores don't account for heterogeneous update costs or give sufficient reward to figuring it out a day sooner than the average - though Metaculus's score and the scoring Ozzie Gooen has looked at are trying to do this better. This seems well worth more consideration.
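To make the point about time-weighted scores concrete, here is a minimal sketch, assuming a plain time-averaged Brier score (this is not Metaculus's actual rule, and the 60-day window and the 5% and 90% forecasts are made-up numbers for illustration). It compares a forecaster who updates on the same day as the consensus with one who updates a day earlier.

```python
def time_averaged_brier(daily_forecasts, outcome):
    """Mean squared error of the daily forecasts against the 0/1 outcome."""
    return sum((p - outcome) ** 2 for p in daily_forecasts) / len(daily_forecasts)


def forecasts_with_update(days, update_day, before=0.05, after=0.90):
    """Hypothetical forecaster: holds `before` until `update_day`, then `after`."""
    return [before if day < update_day else after for day in range(days)]


DAYS = 60  # assumed length of the question; the event resolves "yes" (1)
consensus = time_averaged_brier(forecasts_with_update(DAYS, update_day=21), 1)
one_day_early = time_averaged_brier(forecasts_with_update(DAYS, update_day=20), 1)

# Lower is better; the gap is the entire reward for being a day ahead.
print(f"consensus timing: {consensus:.4f}")
print(f"one day earlier:  {one_day_early:.4f}")
print(f"reward for the extra day: {consensus - one_day_early:.4f}")
```

Under these assumptions the extra day is worth only one day's share of the score, however expensive that update was to produce.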
people find it far easier to forgive others for being wrong than being right
Harry Potter and the Half-Blood Prince
First of all, I really appreciate this postmortem. Admitting times when you were wrong couldn't have been an easy task, particularly if/when you staked a lot of your identity and reputation to being right. As EA and rationalist individuals and institutions become older and more professionalized, I'm guessing that institutional pressures will increasingly push us further and further away from wanting to admit mistakes; so I sincerely hope we get in the habit of publicly recognizing mistakes early on. (Unfinished list of my own mistakes, incidentally[1]). I hope to digest your post further and offer more insightful thoughts, but here are some initial thoughts:
Addendum on masks:
Another consideration about masks is that masks turn out in practice to be very reusable, a fact we (or at least I) should have investigated a lot more in early March.
On hospital-based transmission:
I don't know how much you believed in it, but as presented, this appears to be merely (ha!) a forecasting error rather than a strategic error. In the absence of a clear counterfactual, I don't think you were obviously wrong here, since it's quite plausible that if a lot of people like you ignored/downplayed the role of hospital-based transmission, it'd have gotten a lot worse.
On being a jerk re Jim and Elizabeth's post:
For what it's worth, I also (privately) asked them to take it down because I had similar considerations to you and thought the thing they wrote about masks was unilateralist-y and a bit of an infohazard. I think I was wrong there. But I think I mostly was object-level wrong about the relative tradeoffs and harms. To the extent I updated now, a) I updated object-level on how much I should cooperate or desire others to cooperate with specific institutions, and b) I updated broadly (but not completely) in general favor of openness and against censorship.
I continue to maintain that if I (and possibly you) had the same object-level beliefs as before, it was not incorrect to consider it an info-hazard (but not all object-level info-hazards are worth suppressing! Particularly if release promotes the relevant meta-level norms more than it harms), though of course not an existential one.
On superforecasting:
You said you think superforecasting is
materially worse than [you] hoped it would be at noticing rare events early.
I don't know how high your hopes were, but for what it's worth, I think this proves too much. I'm not sure about the exact aggregation algorithms that the Open Phil Good Judgement covid-19 project was running, but I feel like all I can realistically gather is that "of this specific set of part-time superforecasters that were on the Open Phil-funded project, more than 50% of them were way too optimistic."
While it's certainly some evidence against superforecasters being good at noticing rare events early, I don't think it's sufficient evidence against superforecasters being able to do this, and I definitely don't think this is a lot of evidence against superforecasting as a process.
As you weakly allude to, if you were on the project and paying attention more, you would probably have done better. Likewise, I know other superforecasters who I think were much more pessimistic than the GJ median. I suspect superforecasters who regularly read LessWrong and the EA Forum would have done better; and if I were to design a better system for superforecasting on rare events, I'd a) prime people to pay attention to a lot of rare events first, and b) have people train and score on log-loss or some other scoring system that's more punishing of overconfidence than Brier.
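As a rough illustration of why log-loss punishes overconfidence more than Brier does, here is a minimal sketch using the standard definitions of the two scores for a single binary question. The forecast probabilities are hypothetical, chosen only to show the shape of the penalty when a rare event actually happens.

```python
import math


def brier_score(p: float, outcome: int) -> float:
    """Squared error between the forecast probability and the 0/1 outcome."""
    return (p - outcome) ** 2


def log_loss(p: float, outcome: int) -> float:
    """Negative log of the probability assigned to what actually happened."""
    return -math.log(p if outcome == 1 else 1.0 - p)


# Hypothetical forecasts for a rare event that did occur (outcome = 1).
for p in (0.5, 0.1, 0.01, 0.001):
    print(f"p={p:<6} Brier={brier_score(p, 1):.3f}  log-loss={log_loss(p, 1):.3f}")
```

Brier is capped at 1, so going from 1% to 0.1% on an event that then occurs costs almost nothing extra, while log-loss keeps growing without bound as the forecast approaches zero.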
(All that said, I think Metaculus did okay but not great on covid-y questions relative to what someone with high hopes for prediction aggregation algorithms might reasonably expect).
On US Gov't Institutions:
I think a bunch of the insights here were colored by your policy research experience. For example, you mention how you would have trusted the FDA to do a lot better under Scott Gottlieb. This might be obvious to you, but it's something I didn't even really think about until you highlighted this point. You also highlight a lot of useful specific uncertainties about whether the issue was political directors under Trump or nonpolitical directors of specific institutions. I think all of these things are very useful to know from the perspective of a policy researcher like yourself (and for students of US policy), since how to reform institutions is very decision-relevant to you and many other EAs.
That said, at a very coarse level, I think I'm a lot more cynical than you are implying with regard to how well US institutions would have handled this pre-Trump. It's possible we're not actually disagreeing, so I'm curious on your counterfactual probabilities on things being an order of magnitude better (<20,000 Americans dead of COVID-19 by now, say) in the following two worlds:
a) Clinton administration continuing all of Obama's policies?
b) Clinton administration continuing all of Obama's policies except for US CDC in China being equally understaffed as they are in our timeline.
My reasoning for why I'm generally pretty cynical (at least conditional upon this pandemic spreading at all, maybe a larger international presence could have helped contain it early) in those counterfactual worlds[2]:
1) There's sort of an existing counterfactual for preparedness of governments with a broadly American/Western culture but as competent at governance as a typical European country. It's called Europe. And I feel like every large geographically Western country was pretty bad at preparedness? People are praising Germany's response, but when it comes down to it, Germany has 9000+ confirmed covid-19 deaths in a population of 83 million, or >100 deaths/million, despite taking a large economic hit to suppress the pandemic. Japan had <1000 confirmed deaths in a population of 126.5 million. Now Japan was bad at testing, so maybe Japan actually had ~4000 deaths. But even at those numbers (~31 deaths/million), Japan still had <1/3 the number of deaths per capita as Germany. And object-level, Japan seemed to have screwed up a bunch of important things, so there's a simple transitivity argument where if a high-income country did worse than Japan, their policies/institutions couldn't have been that great.
Maybe I'm harping on this too much, but I really don't want us to succumb to the tyranny of low expectations here.
Now some culturally Western countries did fine (Australia, New Zealand). I'm not sure why they did well (maybe it's because they're islands, maybe seasonality is bigger than I think so Southern hemisphere had a huge initial advantage early on, maybe because they're around 10-15% East Asian so people had enough ties to China to be worrying earlier, maybe low population density, maybe their institutions are newer and better, maybe just luck), but regardless, I'd counterfactually bet on the response of Hillary's America looking more like a slightly less competent Europe or maybe Canada and less like Australia/NZ.
2) I didn't look at it that much, but at a high level, the US response to 2009 H1N1 looked more competent, but ultimately the response didn't seem sufficient to have achieved containment if the mortality rates were as high as people thought they'd be? (Not sure of this, willing to be convinced otherwise on this one).
3) Some inside-view reasoning about specific actors.
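The per-capita comparison in point 1) can be checked with a few lines of arithmetic. This is only a sketch using the round figures quoted above (9000 deaths as a lower bound for Germany, and the rough guess of ~4000 true deaths for Japan), not updated data.

```python
def deaths_per_million(deaths: float, population_millions: float) -> float:
    """Deaths divided by population expressed in millions."""
    return deaths / population_millions


# Figures as quoted in point 1); Germany's "9000+" is treated as a lower bound.
germany = deaths_per_million(9_000, 83)
japan_confirmed = deaths_per_million(1_000, 126.5)  # "<1000" confirmed, so an upper bound
japan_adjusted = deaths_per_million(4_000, 126.5)   # the rough ~4000 guess for undercounting

print(f"Germany:          {germany:.1f} deaths/million")
print(f"Japan (confirmed): {japan_confirmed:.1f} deaths/million")
print(f"Japan (adjusted):  {japan_adjusted:.1f} deaths/million")
print(f"Japan adjusted / Germany: {japan_adjusted / germany:.2f}")
```

Because Germany's count is a lower bound, the ratio printed here is if anything an overestimate of Japan's relative death rate under the ~4000 assumption.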
___
Anyway, all these gripes aside, thank you again for your thoughtful (and well-written!) post. That couldn't have been easy to write, and I really appreciate it.
[1] Your post actually got me thinking about how I should be more honest/introspective about my strategic and not just predictive mistakes, so thanks for that! I plan to update the list soon with some strategic mistakes as well. For example, I considered myself to be on the "right" side of masks epistemically but not strategically.
[2] I'm maybe 35% on a) and 30% on b). A lot of the probability mass is considerations on there being enough chaos/sensitivity to initial conditions that this pandemic maybe wouldn't have happened at all, rather than Obama's or Hillary's response being an order of magnitude better conditional upon there being an epidemic.
There is a lot here to reply to, and I'm only going to address a few points.
First, on forecasting, I think there is a lot to discuss, and I think John Wentworth's comment and my reply are all that I have to say about this for now.
Second, on Government response, I'm also unsure how much we disagree. I definitely think that I have a number of useful insights about institutions, but this is an area where expertise seems to be non-predictive. That means I'm less sure how valuable it is - but I discussed this in more depth here, on Ribbonfarm. That said, I'll make comments anyways.
I agree that many countries were underprepared, but they also historically relied on American leadership for many of these types of events. America was the acknowledged world leader in biodefense and preparation, has spent more time and money on the problem than elsewhere, and has much more money and expertise than most places - so the failure is much more noteworthy than it otherwise would be.
I also think the EU "failures" should be counted as partial successes, since they mostly have case counts declining, and are well prepared to avoid the worst of a possible second wave. That's a solid half credit in an absolute sense, since they seem poised to have gotten it under control before it ended up everywhere, though they didn't catch it enough to prevent spread at first, which would have been the goal. The US (and, to a lesser extent, the UK) didn't manage to control things enough to even get past the first wave, and they are poised to fail their way to herd immunity in most places - a shocking level of failure, especially given how well other countries have managed this.
For counterfactual predictions, on B, if the US did as well as Germany, Japan, France, and other G-7 nations, they would have kept deaths under 20,000, or at least around there. I'd give at least 50% to keeping it below 20k so far. (I'm unsure how bad the Republican Governors would have made this, or what the rest of the world looks like under Clinton. Would the Chinese have cooperated earlier? Counterfactual predictions this far back are basically about writing an alternative timeline - there are WAY too many potential issues to really consider well.) But the epidemic seems under control in the EU, contra the US. So that seems like the relevant counterfactual. (Aside: It seems non-coincidental, though a surprisingly strong effect, that right-wing populist leaders are especially bad at controlling infectious diseases - BoJo, Trump, and Putin all got this very, very wrong. I think the default reaction of trying to control the narrative over dealing with problems is a particularly dangerous approach with infectious diseases.) And for the A counterfactual, it's similar, but with 20+% probability mass on "this was stopped enough before it left China that there was no pandemic."
I agree with the following points:
I think it's likely our disagreements are somewhat more about framing than actual empirical differences. For example, "they seem poised to have gotten it under control before it ended up everywhere, though they didn't catch it enough to prevent spread at first, which would have been the goal" is a phrase I'd use to describe South Korea and Singapore, not Western Europe, where almost every locale had community transmission. I'd use "they caught it enough to prevent spread" to describe places like Mongolia with zero or close to zero community transmission, or places that contained community transmission to a single region.
I agree that Western European governments should get a lot of relative credit for managing to prevent more deaths, disability, and wanton economic destruction, despite being in an initially bad spot. But thousands of people nonetheless died, and those deaths appeared to be largely preventable (in a practical, humanly doable sense). So while I think we should also a) emphasize the relative successes (because in these dark times it's good to both hold on to hope and be grateful for what we have), and b) be unequivocally clear that the other Western governments mostly did better than the US, I do want to not lose sight of the target and also be clear that the relative failings of the US under Trump do not excuse the lesser failings of other institutions and governments.
I have lots of thoughts, but most of all I am just really grateful for you writing this retrospective. I think it's a really great public service and I've already been thinking a lot about this post since I read it the first time a few hours ago, and expect to continue thinking about it a lot more.
Good post David!
I applaud posts like this. It's a great tradition that we do them.
I believe the conversation about border closures was with me on Twitter?
Are we at a stage yet where we can answer counterfactual questions about what would have happened if all international flights had been shut off in January (Assuming away the political problem)?
Assuming away the political problem of making it stick, it seems clear that without universal border closures by countries, it would have made only a minor difference in spread - most cases that came to Europe, the US, and elsewhere didn't come from China.
If some set of countries were willing to completely shut down all borders, those countries might have avoided infections - might, but I'm skeptical. Even now, the countries that shut down international travel still have a fair amount of international travel, from diplomatic travel to repatriation of citizens to shipping and trucking. So it could plausibly have delayed spread by a month. In places that mounted a really effective response, a month might have made the difference between slow control and faster control. In most places, I think it would have shifted spread a couple weeks later.
Yes it has always been clear to me that shutting down travel from one specific place where you think the disease is is not going to work.
Here in Europe, I have had to postpone personal vacations because the borders have indeed been shut and are only now reopening. Given that Europe actually did close its borders, both external and internal, do you think that it would have been a good idea to go ahead and do that in January, assuming away the political problem?
Again, it didn't actually stop spread - it slowed it slightly. Borders haven't been actually closed. Flights have continued, you just need connections to get a visa. But people have been able to return home - and dual citizens have been able to travel both ways - the entire time.
So do you think that the actual travel restrictions that happened were just a waste of time, and we should have had fully open borders?
Or do you think that the restrictions that we had (late and partial) were the optimal disease-fighting policy (again, neglecting political considerations)?
In general, I think that earlier closures would potentially have delayed spread enough to save lives due to getting vaccines and testing further along than they were.
I'm also claiming that now, with a fully in place and adequate test-and-trace program, including screening for passengers and isolation for positives, border closures have low marginal benefit. Without such a test and trace program, travel modifies the spread dynamics by little enough that it won't matter for places that don't have spread essentially controlled. The key case where it would matter is if the border closures delayed spread by long enough to put in place such systems, in which case they would have been very valuable. And yes, border closures in place have allowed this in some places, but certainly not the US or UK.
So, conditional on the policy failures, I think border closures were effectively only a way to signal, and if they distracted from putting in place testing and other systems by even a small amount, they were net negative.
But what about the ~3 months of lockdown and massive economic disruption that we had to go through? Don't you think that could have been avoided by closing our borders tightly in January? Do we have evidence to either confirm or exclude that now?
I don't understand the hypothetical.
If every country in the world had closed their borders well enough to stop all movement before it left China, yes, spread would have been prevented. But that's unfeasible even if there was political will, since border closures are never complete, and there was already spread outside of China by mid-January.
Once there is spread somewhere, you can't reopen borders. And even if you keep them closed, no border closure is 100% effective - unless you have magical borders, spread will inevitably end up in your country. And at that point, countries are either ready to suppress domestic spread without closures, or they aren't, and end up closing later instead of earlier.
"Rely more on other people's views in the rationality community" is inherently a bias-variance tradeoff. It's also collectively self-defeating, with a potential to become a death spiral. As such, it may be a reasonable thing to do, but it is potentially very dangerous as a principle to state for others to follow. The selfishly-first-order-rational thing to do would be to do more of it yourself, and encourage others to do less of it; of course, this is free riding, and may not be meta-rational.
My point is: this may be wise, but you should beware, and you should definitely not encourage others to do this without cautioning them against the risks.
One way to beware about this is to ask the question: "which of the biggest differences between rationality community conventional beliefs, and general 'smart non-explicit-rationalist' conventional beliefs, have the weakest support?" I have personal answers to that question but in the spirit of this comment, I won't share them here.
I agree that it could be a death spiral, and think the caution is in general warranted. My personal situation was one where I had fairly little personal interaction with members of the community - though this is likely less true now - but that was why I decided that explicitly considering the consensus opinions was reasonable.
This postmortem is so impressive. Someone should collect all the pandemic related postmortems. I'd be particularly interested in those written by people in the field (broadly construed).
Strong-upvoted! I admire you for writing this.
I said we should be very concerned in January, albeit not very publicly.
The link says:
Initial (naive) estimates of CFR are always overstated because of selection bias for the most serious cases. So our bayesian prior should be a non-trivial proportion of asymptomatic cases. And ignoring this is why we routinely overestimate severity of new outbreaks.
That's not to say that we don't need to be very concerned, but policymakers and public health officials need to be cautious about damaging credibility by repeatedly crying wolf. But the balance between avoiding alarm and ensuring sufficient response is a very difficult one.
I definitely would not have read this as saying "we should be very concerned", if that's one of the things you meant to communicate.
I also followed the herd too much from expert circles, and my twitter feed from infectious disease epidemiology circles was behind even my slow self in recognizing that this was an incipient disaster back in March.
Woah, this is interesting and really alarming.
Because I was slow to move from the base-rate, I underestimated the severity of COVID-19 for too long. I'm unsure how to fix that, since most of the time it's the right move, and paying attention to every new event is very expensive in terms of mental energy. (Suggestions welcome!)
Scott Alexander writes:
Zeynep Tufekci is an even clearer example. She’s a sociologist and journalist who was writing about how it was “our civic duty” to prepare for coronavirus as early as February. She was also the first mainstream media figure to spread the word that masks were probably helpful.
Totally at random today, reading a blog post on the Mongol Empire like all normal people do during a crisis, I stumbled across a different reference to Zeynep. In a 2014 article, she was sounding a warning about the Ebola pandemic that was going on at the time. She was saying the exact same things everyone is saying now – global institutions are failing, nobody understands exponential growth, travel restrictions could work early but won’t be enough if it breaks out. She quoted a CDC prediction that there could be a million cases by the end of 2014. “Let that sink in,” she wrote. “A million Ebola victims in just a few months.”
In fact, this didn’t happen. There were only about 30,000 cases. The virus never really made it out of Liberia, Sierra Leone, and Guinea.
I don’t count this as a failed prediction on Zeynep’s part. First of all, because it could have been precisely because of people like her sounding the alarm that the epidemic was successfully contained. But more important, it wasn’t really a prediction at all. Her point wasn’t that she definitely knew this Ebola pandemic was the one that would be really bad. Her point was that it might be, so we needed to prepare. She said the same thing when the coronavirus was just starting. If this were a game, her batting average would be 50%, but that’s the wrong framework.
Zeynep Tufekci is admirable. But her admirable skill isn’t looking at various epidemics and successfully predicting which ones will be bad and which ones will fizzle out. She can’t do that any better than anyone else. Her superpower is her ability to treat something as important even before she has incontrovertible evidence that it has to be.
The whole article seems worth reading, especially if it's true that epidemiologists under-reacted to this. It's clearly correct that most people shouldn't follow every pandemic closely -- even most epidemiologists shouldn't follow every pandemic closely. But it's important that we get the base level of alarm correct -- it might be correct to overreact somewhat to the vast majority of pandemics, if that's what it takes to avoid underreacting to the big one. And it's important that people be very explicit about how carefully they've been looking into this or that specific pandemic, so that we can collectively know which epidemiologists and other observers to pay the most attention to.
I think Tyler is far more impressed by himself and his discipline than he should be. There's a saying about economists making fortune tellers look good that seems appropriate here. And he probably shouldn't be posting insulting things about epidemiologists in the same breath as saying most economists are just as bad - which he followed up with saying he wants to be rude by asking questions he could have spent half an hour googling - he hadn't even done basic research. I also think that people on LessWrong give too little credit to public health officials for being properly cautious about overreacting, especially given that even for COVID-19, many people are saying that we went too far, and that the benefits were not worth the economic harms.
Also see this thread: https://twitter.com/davidmanheim/status/1235274008142270466
Next, should academics and public servants in epidemiology simply be paid more? No, and no. If anything, there is not enough disincentive to enter academia, since there are so many more good applicants than spots, across disciplines. Something else needs to be fixed there first. (Everything, actually.) And government isn't set up well to pay people more in ways that get better candidates - doubling salaries wouldn't be enough to get anyone more competent to run for the Senate, much less be a senior government appointee, unless they already wanted to do that and didn't actually care about the money. (There are other ways we underpay and sabotage government that money could fix, but that's a different discussion.) And I'm surprised that an economist doesn't know enough about these structures to see why higher pay isn't a useful lever.
A thing I regret not thinking of is that ventilators aren't as crucial as was expected because they're dependent on the lung tissue being healthy.
I'm not an expert, but it's so obvious. I don't know how to avoid making that sort of mistake. Maybe being careful about tracking chains of causation.
Why should this have been obvious? Invasive mechanical ventilation is much more helpful for typical ARDS than for COVID-19-style ARDS and other COVID-19 dysfunction. What's the earliest evidence that should have strongly updated us in that direction?
It's true that we learned more about the type of lung damage as things went on, but I still feel like that ventilator conversation was really implausible in hindsight. I'm not an expert, but experts seemed suspiciously quiet, and it should have been obvious to many of them that there were major practical concerns. Accounts from other countries seemed to suggest that ventilators were a poor choice for a significant number of COVID-19 patients, but all our resources seemed to go in that direction, rather than the seemingly obvious fact that you have to keep it out of the nursing homes rather than send people to nursing homes to clear beds for ventilator patients.
The average nursing home resident will not survive ventilation. I remember reading an interview with an Italian doctor saying he'd never put his elderly father on one. We knew COVID-19 damaged lungs, and that lung damage complicates ventilation. I caught on relatively early that they were being overhyped only because I stumbled across two online accounts by technicians trained to operate ventilators, which is apparently a pretty delicate task that most healthcare workers aren't great at, especially in these severe and unpredictable cases. There clearly weren't enough of them to put vast numbers of people on ventilators, and ventilators are serious equipment, with serious effects and high fatality rates, to be used as a last resort, not the panacea they were portrayed as. It seemed like a distraction from more practical attempts we could have taken to improve the overall situation. The average person can be forgiven for not seeing this, but even just reading about SARS should have been enough to raise more questions in my mind.
As I understand it, the purpose of a ventilator is to make up for a person's inability to move sufficient air in and out of their lungs, but it assumes that the lungs, if given air, don't have a problem with getting oxygen into the bloodstream.
As mentioned in the post, I think it's personally helpful to look back, and is a critical service to the community as well. Looking back at looking back, there are things I should add to this list - and even something (hospital transmission) which I edited more recently because I have updated against having been wrong about in this post - but it was, of course, an interim postmortem, so both of these types of post-hoc updates seem inevitable.
I think that the most critical lesson I learned was to be more skeptical of information sources generally - even the most accurate, including superforecasters and the rationalist community, are fallible in ways which are somewhat predictable, and hard to evaluate prior to knowing the ground truth. This both highlights the value of staying uncertain and entertaining multiple hypotheses, and the importance of keeping diverse information sources available. The points made by John Wentworth in his comment about the need to do expensive updates were also very clear and valuable.
I certainly think additional posts of this type, by myself and by others, would provide value - and I could see it being its own genre. Unfortunately, there have been very few. I am happy to see several projects looking back at the community's reactions, successes, and failures, but they are still in progress. The 2020 Petrov Day postmortem and similar are also evaluating community behavior, and some have evaluated failures in companies, but I see fairly few, and I would think we could use more, and more individual posts. (I'd hoped to write another actual after-action report, but I have been busy - an insufficient excuse - and we're unfortunately still not post-COVID-19.)
Thanks for this!
Paragraph with confusing wording:
In retrospect, I think it would have been better, consequentially, to push for cloth masks earlier, but current modeling and our understanding of spread make it clear that mask wearing by itself is only marginally effective.
Do you mean our present day understanding, or our understanding at the time? Do you mean that you still think masks are only marginally effective, or thought so at the time?
I mean now - it's clear that masks are not particularly effective at preventing people from getting COVID, and are somewhat but not very effective at preventing people who have COVID from infecting others. That's enough to be incredibly important at a population level, which is obviously a key thing to do, but it's not anything similar to what proponents had been claiming.
Very interesting/useful.
I suggest it is important to separate the desirability of a course of action and its political feasibility e.g. in relation to border closures.
In epidemiology it is a basic fact in the 101 textbook that slowing long distance transmission (using quarantines / travel restrictions) is very important. Unfortunately this got caught up in claims of xenophobia etc. Countries that have been relatively successful have implemented such restrictions.
I would be interested in some justification of the claim that face masks are not very useful. From all my reading, this seems to be false.
One mistake I made was not to aggressively look for countries that were successful (like Taiwan) and to enquire what they did (border closures/tightly enforced quarantine, face masks, isolating people with cold/flu/fever symptoms - even though this is not a "valid" test for CV it gathers and uses much useful information).
Like many I got caught up in the false dichotomy of lockdown=ruined economy versus no-lockdown=many will die.
You said that "In epidemiology it is a basic fact in the 101 textbook that slowing long distance transmission (using quarantines / travel restrictions) is very important." The parentheses make the statement incorrect. Obviously there are discussions of this, but I just checked my copy of "Modern Infectious Disease Epidemiology: Concepts, Methods, Mathematical Models, and Public Health." It discusses travel and the contribution to spread, but mostly focuses on the way IHR limits the imposition of travel bans, and why such bans are considered problematic. It does mention quarantines and travel restrictions, but they aren't the key tools that are recommended.
Also, you said "I would be interested in some justification of the claim that face masks are not very useful." That isn't what I said. I said that "mask wearing by itself is only marginally effective." See this FHI paper, which estimated, albeit with very low confidence, that mask policies were almost entirely ineffective - far more pessimistic than my claim. That is because that paper is likely to be understating the impact, as they admit. It seems clear that mask wearing reduces spread somewhat, but note that this is because of reducing spread from infectious individuals, especially pre-symptomatic and asymptomatic people, not protecting mask wearers. The early skepticism was in part based on the assumption, which in March seemed to have been shared by both promoters and skeptics, that the benefits were that masks were individually protective, rather than that they helped population-level spread reduction. It turns out that (contra the FHI paper) there seems to be some impact helping spread reduction. Even so, it's not enough to bring R<1 without other interventions, either closures, or an effective test and trace program, as our forthcoming paper argues. (I will also note that one key change from that pre-print version came about because reviewers pointed out that we were likely too optimistic in our estimate of mask effectiveness, and the literature supports much smaller impacts.)
EDIT: I notice I am confused about why people downvote comments that make substantive points without replying. If the tone or substance is problematic, I certainly think downvotes are acceptable, but I think the norm is supposed to be that you also tell people what you think they did wrong.
It seems clear that mask wearing reduces spread somewhat, but note that this is because of reducing spread from infectious individuals, especially pre-symptomatic and asymptomatic people, not protecting mask wearers. The early skepticism was in part based on the assumption, which in March seemed to have been shared by both promoters and skeptics, that the benefits were that masks were individually protective, rather than that they helped population-level spread reduction.
The early *arguments* I saw were mainly about whether masks meaningfully reduced the wearer's chances of getting infected. But it was already conventional wisdom that masks did meaningfully reduce the wearer's chances of infecting others, people just weren't taking the next step of arguing for general mask use on these grounds. For example, the early March CDC recommendation (linked in the anti-CDC LW post) was:
CDC does not recommend that people who are well wear a facemask to protect themselves from respiratory diseases, including COVID-19.
Facemasks should be used by people who show symptoms of COVID-19 to help prevent the spread of the disease to others. The use of facemasks is also crucial for health workers and people who are taking care of someone in close settings (at home or in a health care facility).
By mid March, there were organized efforts to increase mask use on the grounds that it reduced the wearer's chances of infecting others. The Czech government (which mandated mask use on March 19) and the #Masks4All campaign were the most prominent ones that I saw - both encouraged people to make their own cloth masks and used the slogan "My mask protects you, your mask protects me" (they may also have talked about some risk-reduction benefits for the wearer). A quick search turns up this March 14 video (in Czech, with English closed captioning available) as the earliest source I could quickly find clearly making this case for widespread mask use.
Yes - it took me until mid or late March to be fully on board. See my comment here to a post arguing for pushing handwashing instead of suggesting masks, which I changed my mind about in mid to late March.
I know the conversation these days is (rightly) about preventing presymptomatic transmission from the wearer, but I'm personally still at ~80% that masks probably protect the wearer at least a little, though agree that the effect may not be huge.
Obviously there are discussions of this, but I just checked my copy of "Modern Infectious Disease Epidemiology: Concepts, Methods, Mathematical Models, and Public Health." It discusses travel and the contribution to spread, but mostly focuses on the way IHR limits the imposition of travel bans, and why such bans are considered problematic. It does mention quarantines and travel restrictions, but they aren't the key tools that are recommended.
Could you expand on what arguments they present?
Background / my current take:
The past year I have been reading a little bit about this received wisdom in epidemiology (Quarantines and travel restrictions do not work! Or people should not do that because they are too costly economically/because of human rights!), and in my view I have downgraded the profession's scientific credibility accordingly (that is, failing at rationality), as I have had difficulties finding the actual arguments with numbers and models instead of review articles which say this kind of thing as conclusive and cite something which does not appear all that conclusive to a sceptical reader (Usually: airport temperature-taking in Asia during SARS did not work, and Spanish flu eventually reached Australia after several months of not spreading there.)
In contrast, going by my understanding of basic maths, it seems a foregone conclusion that if one has limited test&trace capability, limiting introduction of new infectious cases will be helpful for the available capacity to contain new clusters. The amount of help depends on parameters of the measures taken and the disease itself, so it does not always help to great effect. NZ provides a plausible example that it was a helpful move for containing this particular disease. Likewise IIRC WHO and similar bodies apparently have pledges and such not to implement travel restrictions, and such universal policy decisions scream "ideological" to me. The cost-benefit calculus on these matters is not for some group of academics to dictate anyway. Nor is it their job to state what is politically impossible or possible. Yeah, right, surprisingly many things became politically possible this year. (Such things happen infrequently, but they do happen.)
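A minimal sketch of the capacity argument above, offered as a toy illustration rather than a calibrated model. It assumes discrete generations, a fixed number of imported cases per generation, and a tracing capacity below which each case causes fewer than one new case. All names and numbers (`capacity`, `r_traced`, `r_untraced`, the import levels) are hypothetical.

```python
def cases_after(generations, imports, capacity=100.0, r_traced=0.5, r_untraced=2.0):
    """Toy generation-by-generation model of tracing with limited capacity.

    Up to `capacity` cases per generation can be traced and isolated, and
    each of those causes `r_traced` new cases. Cases beyond capacity go
    untraced and each causes `r_untraced` new cases. `imports` new cases
    arrive from outside every generation. All parameters are hypothetical.
    """
    cases = float(imports)
    for _ in range(generations):
        traced = min(cases, capacity)
        untraced = cases - traced
        cases = imports + r_traced * traced + r_untraced * untraced
    return cases


# Compare a low and a high rate of introductions under the same capacity.
for imports in (20, 80):
    print(imports, round(cases_after(20, imports), 1))
```

With these made-up numbers, low importation settles at a level the tracers can handle, while high importation overwhelms capacity and grows without limit. Where that threshold sits depends entirely on the assumed parameters, which is the point made above about the amount of help varying with the measures and the disease.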
Other mind-boggling decisions by epidemiological elite here in Finland (that influence my position here) include the conclusion that "if we think clusters have become an uncontrolled epidemic, we will just cease all tracing and other similar efforts", and "we have this mathematical DE model where we assume we know exactly all the parameters. So if all restrictions influencing R are removed in November, it proves that we will have horribly deadly second wave in November/December unless we actually help the disease a little bit to spread in this R range, for herd immunity you see, trust me we are epidemiologists" (publicized in newspapers, "scientists say that we have horrible second wave in November if we stop the virus too well"). Presumably similar reasoning resulted in our central government department on at least one occasion outright forbidding some regional authorities from testing incoming travellers from Italy at the very moment the test personnel were going to the airport and they had made a media statement about starting testing.
edit. Clarification
First, in that comment, I wasn't arguing that quarantines aren't helpful. I said that the parentheses make the claim false; "In epidemiology it is a basic fact in the 101 textbook that slowing long distance transmission (using quarantines / travel restrictions) is very important." You seem to agree that this is the received wisdom.
And I agree that we should have done border closures earlier, but I would note that the simple counterfactual world, where people in general ignore epidemiologists more often, is far worse than our world in many ways. I think a world where border closures could be done at the drop of a hat would be worse in other ways as well. You can argue, correctly, that only doing closures when actually necessary is better, but I don't think breaking down the norm of not banning travel would be a net benefit. (See: Chesterton's fence, and for a concrete example, see China's ongoing internal and external travel restrictions, and how that enables concentration camps in Xinjiang.)
In my view I have downgraded the profession's scientific credibility
I agree with you that the current failure should make you downgrade your opinion of experts somewhat. But see above about what I think of ignoring epidemiologists more often in general.
"it seems a foregone conclusion that if one has limited test&trace capability, limiting introduction of new infectious cases will be helpful for the available capacity to contain new clusters"
Agreed, but there was no reason to have limited test and trace resources. More recent articles confirm that we could have done symptomatic tracing - loss of smell, coughing, etc - and isolation of just those cases, and shut down transmission completely without any testing. Shutting down borders helps, especially without sufficient tests, but it should not have been needed.
"Other mind-boggling decisions by epidemiological elite here in Finland..."
I can't comment on Finland specifically, but think that your local elite was probably less unanimous at the time, and the international consensus was different as well.
"if we think clusters have become an uncontrolled epidemic, we will just cease all tracing and other similar efforts"
Yes, if spread grows too large, tracing + quarantines is in fact not worthwhile, and shutdowns will be cheaper. (You can play with a basic DE model and put costs on tracing to convince yourself why this is true.)
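A minimal sketch of that exercise, offered as an illustration under loudly stated assumptions and not as anyone's actual model. It Euler-integrates a basic SIR model and compares a tracing cost that scales with the number of cases against a shutdown cost that is fixed per day. It assumes, for simplicity, that both interventions reduce transmission to the same level. Every name and number here (`cost_per_case`, `cost_per_day`, `beta_controlled`, the population size) is hypothetical.

```python
def total_cases(initial_infected, beta, gamma=0.1, population=1_000_000,
                days=120, steps_per_day=10):
    """Euler-integrate a basic SIR model and return cumulative cases,
    counting the initially infected. Parameters are illustrative only."""
    dt = 1.0 / steps_per_day
    susceptible = float(population - initial_infected)
    infected = float(initial_infected)
    cumulative = float(initial_infected)
    for _ in range(days * steps_per_day):
        new_infections = beta * susceptible * infected / population * dt
        recoveries = gamma * infected * dt
        susceptible -= new_infections
        infected += new_infections - recoveries
        cumulative += new_infections
    return cumulative


def tracing_cost(initial_infected, cost_per_case=5_000.0, beta_controlled=0.08):
    """Hypothetical cost of tracing: every case must be traced and isolated,
    so the bill scales with the number of cases."""
    return cost_per_case * total_cases(initial_infected, beta_controlled)


def shutdown_cost(cost_per_day=2_000_000.0, days=120):
    """Hypothetical cost of a shutdown: fixed per day, whatever the prevalence."""
    return cost_per_day * days


# Compare the two costs at increasing starting prevalence.
for start in (100, 1_000, 10_000, 100_000):
    cheaper = "tracing" if tracing_cost(start) < shutdown_cost() else "shutdown"
    print(start, cheaper)
```

The crossover point depends entirely on the made-up costs. The structural point is only that a per-case cost eventually overtakes a per-day cost as the number of cases grows.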
And yes, removing all restrictions does lead to a rebound and worse spread later. Just look at the US.
Yes, if spread grows too large, tracing + quarantines is in fact not worthwhile, and shutdowns will be cheaper. (You can play with a basic DE model and put costs on tracing to convince yourself why this is true.)
Yeah, I tried to imply the problem was in my eyes the flimsy evidence they had a correctly specified model for making that decision. In reality, they didn't stop tracing at any point (I am not sure but looking at news, the public pressure supported by non-epi computationally oriented scientists might have helped. I hope they will do a proper post-mortem afterwards.)
Otherwise, I think point by point response is not necessary. I would stress that I have downgraded my evaluation of epidemiology to the extent that instead of merely trusting that "this is what epidemiology profs or textbooks say", one should review the actual arguments and evidence.
This is usually correct, but here it was a mistake. (I now think that superforecasting is materially worse than I hoped it would be at noticing rare events early.)
This seems to be an interesting point. If you do believe that relying on GJP forecasters was a mistake, what's the problematic heuristic? And how can you tell in future whether to trust them?
See the back-and-forth with John Wentworth in the comments earlier - https://www.lesswrong.com/posts/B7sHnk8P8EXmpfyCZ/a-personal-interim-covid-19-postmortem?commentId=ntGR3rpnSW6yKRoAP
I think it's important to clearly and publicly admit when we were wrong. It's even better to diagnose why, and take steps to prevent doing so again. COVID-19 is far from over, but given my early stance on a number of questions regarding COVID-19, this is my attempt at a public personal review to see where I was wrong.
I have been pushing for better forecasting and preparation for pandemics for years, but I wasn't forecasting on the various specific questions about pandemics on most platforms until at least mid-March, and I failed in several ways.
Mea Culpa
I was late to update about a number of things, and simply wrong in some cases even on the basis of known information. The failures include initially being slow to recognize the extent of the threat, starting out dismissive about masks, being more concerned about hospital-based transmission than ended up being justified, being overconfident in the response of the US government, and in early March, over-confidently getting a key fact wrong about transmission being at least largely via aerosol droplet versus physical contact. I have a number of excuses, of course. Most other experts agreed with my views, my grandfather passed away in January, followed by his wife in early March, I was under a lot of stress, I was very busy with my personal life, I was trying to do a number of other high-priority projects, I was not paying attention to the details, and so on. But predictive accuracy doesn't care about WHY you were wrong, especially since there are always such excuses. And the impact of my poor judgement was also likely misleading to others in the community.
At the same time, I feel the perhaps egotistical need to note where I was correct early, and what I got right - followed by a clearer description of my failures. I started saying there would be PPE shortages due to COVID-19 by January, and was writing about the supply chain issues well before COVID. I submitted this paper November last year with Dave Denkenberger, which was largely finished last summer, and it was accepted in February, which then took 3 months to get published. The delay was in part due to other demands on my time, but in retrospect, if it had been available 3 months earlier, it would have been far, far more impactful.
I also understood the failure mode we ended up seeing, and in my 2018 paper, discussing overconfidence in claims that pandemics would be rare, I argued that among the most critical risks was failure to respond to emerging pandemics which could in theory be controlled quickly enough. On the other hand, my failure to realize that this is exactly what was happening is perhaps compounded by the fact that I understood the dynamics, and should have been able to identify what was going on.
Lastly, I maintain I was correct in warning about the poorly thought out and in some cases outright dangerous "preparation" in some quarters of the rationality community proposed in March, such as advocating use of bleach and ozone in closed areas for disinfection. Some people in the community were stockpiling N-95 masks and food and buying up second hand ventilators, and as I said at the time, were at best being selfish and defecting. On the other hand, as I mention below, I was insufficiently clear about the need for better preparation, and waited far too long to speak.
Some of My Mistakes, and Related Comments
Slow to recognize the extent of the threat.
I said we should be very concerned in January, albeit not very publicly. I took until early March to start suggesting that it was clear that the US would expect to see large numbers of deaths. I was skeptical of valuable efforts early on, and didn't start really publicly sounding the alarm and reacting until even later. I was later than most of this community in recognizing the risks.
Skeptical about Border Closures
In a conversation that started Jan 27th, I was asked about shutting down borders to prevent spread. I was dismissive, in large part based on the expert consensus. I'm unsure whether this was a mistake on the object level, since I think that at that early point, the facts were unclear enough, and trade wars really are bad. I also expected response to be better, based on previous cases.
I do not think that border shutdowns were feasible, and historically they have not been. Quarantines at borders were and are logistically impossible. And full border closures for COVID-19 were also not very effective in most places until very late in the spread. (Mongolia and Vietnam are the exceptions that disprove the rule.) Even late in the pandemic spread, lots of transmission occurred from places where there had been few or no cases at the time people entered. However, when discussing it, I excused my early claims that it was too economically damaging and would have been ineffective by substituting a different argument about political feasibility - one which I think is correct, but was not my original consideration. This was bad epistemic practice, and I should have been clearer that in retrospect, if they could have been put in place, travel bans would have been a much better idea. I still think my later excuse, that they were politically impossible, holds up - but I had not fully thought through the question until well after my early response.
Dismissive about masks.
The research on use of masks was unclear and I don't want to claim it was retrospectively obvious, but as a matter of decision making given uncertain risks, people should have started wearing homemade masks in public much earlier. We will still need to see how much impact promoting mask wearing in public has had, but at the very least it functioned as a clear and important public signal that COVID was serious, which promotes physical distance and other critical factors.
On the other hand, I said at the time, and still maintain that I was correct in suggesting that buying up N95 and surgical masks in February and March was defecting, since it was already clear that those supplies were needed desperately in hospitals. And Fauci has now said as much (as a level-1+2 sage, in my view). In retrospect, I think it would have been better, consequentially, to push for cloth masks earlier, but current modeling and our understanding of spread make it clear that mask wearing by itself is only marginally effective. I was instead focused on promoting handwashing, which I think is still undersold in importance, and thought that continued focus on masks would be a net negative. I was wrong, and others here were correct.
Not clear enough about the importance of preparation.
I've long said, following all of the experts, that people should have 2 weeks supply of food and basic supplies. Especially people in California, where earthquakes are far more common than severe pandemics. Further preparation should have been unneeded early on - but in fact, most people don't do this, and the people who were advocating making sure that you were prepared for a worse outcome were correct.
On the other hand, there is an argument I've seen here, and by others in the rationality community elsewhere, that encouraging people to buy critical supplies and hoard early in a crisis sends a price signal to get companies to produce. The argument is that this type of hoarding masks and other PPE will convince manufacturers to make more. I thought, and still think, that this is at least partly misunderstanding the way that price signals and supply chain delays propagate. Anyone who's familiar with MIT System Dynamics' Beer Game and the bullwhip effect would tell you that companies that ramped up production in response to demand quickly (rather than projections and an understanding of longer term demand) were being stupid, not prudent, and companies that tried this in exactly this area were burned in the past for doing so. If that isn't clear enough, notice that it took a couple months for the toilet paper and flour "shortages" to be worked out, despite the fact that there was sufficient supply, and there were not actual production supply shortages. Yes, markets are largely efficient, but they aren't magical ways to eliminate production and distribution delays, much less to insulate companies from actual market dynamics - and China and other southeast Asian countries had already stepped up mask production massively by mid-January. Most of the current supply comes from those factories, so the supposed benefits of price signals from buying masks in February seem not to have been actually effective in speeding anything up.
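A minimal sketch of the bullwhip effect mentioned above, as a stylized illustration and not a reproduction of the Beer Game itself. It assumes each tier of a supply chain forecasts demand by exponential smoothing and follows a simple order-up-to rule, and it models only the ordering signal passed upstream, not shipments or inventories. The function name, tier count, lead time, and demand series are all hypothetical.

```python
def simulate_bullwhip(consumer_demand, tiers=3, lead_time=2,
                      smoothing=0.5, baseline=100.0):
    """Return one order series per tier for a chain of order-up-to policies.

    Each tier sees the orders of the tier below it as its demand, updates
    an exponentially smoothed forecast, and orders the observed demand plus
    the change in its order-up-to target (forecast * (lead_time + 1)).
    All parameters are illustrative assumptions.
    """
    incoming = list(consumer_demand)
    all_orders = []
    for _ in range(tiers):
        forecast = baseline
        orders = []
        for demand in incoming:
            new_forecast = forecast + smoothing * (demand - forecast)
            order = demand + (lead_time + 1) * (new_forecast - forecast)
            orders.append(max(order, 0.0))  # orders cannot be negative
            forecast = new_forecast
        all_orders.append(orders)
        incoming = orders  # this tier's orders are the next tier's demand
    return all_orders


# A short-lived 50% jump in consumer demand, then a return to normal.
demand = [100.0] * 5 + [150.0] * 3 + [100.0] * 12
orders = simulate_bullwhip(demand)
peaks = [max(demand)] + [max(tier) for tier in orders]
print(peaks)
```

In this toy chain a brief spike at the consumer end turns into a larger peak in orders at each tier further upstream, followed by a collapse once the spike passes. That is the pattern that makes ramping production on raw short-term demand risky.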
Oversold Hospital-based transmission.
Part of my concern about hoarding of masks and other equipment was that I thought we would once again see a pattern of large transmission events being centered around hospitals. Thankfully, this didn't happen - hospitals have gotten far better at isolation of patients, and they shut down non-essential services early. We did still see many, many cases and deaths in hospital staff, and this was very clearly in large part due to a lack of supply of PPE. Still, it wasn't the critical locus of spread I expected it to be.
Overconfidence in the response of (certain agencies in) the US government.
This was a huge mistake on my part. I have been concerned about the current administration for years, and have repeatedly warned that it is destroying government agencies. Despite that, I was (in retrospect very unreasonably) still confident that the CDC was going to handle the situation well. They had handbooks on influenza pandemic preparedness, I had personally discussed pandemic preparedness plans with senior people at CDC just a few years ago, and I was overconfident in the ability to respond. Based on that, in turn, I was confident that the level of concern being voiced by the CDC was a reflection of their planning and ongoing preparation. The CDC has planned for preparation for this exact case for years, and I assumed they would carry out those plans. I was wrong.
It seems, though it is still somewhat unclear, that center directors were told by the director and the head of HHS that they needed not to speak out about the risks, specific recommendations were vetoed, and (easily the worst screw-up) they let the FDA ban private tests, seemingly at the direction of the administration, to hide the extent of the spread. I'm still confused by the level of non-reaction among non-political SES staff and GS-14s. We have seen many people in various agencies come forward with complaints during this administration, but CDC seems to have just dropped the ball on their response. We will likely see in the coming years how much this was due to central directives not to react, versus a lack of central directives to react, therefore failing due to passivity. I still want to assume the former, but that's in large part self-justification of my prior views.
I was wrong in trying to defend the CDC's overall response in March. It definitely isn't as clear as I thought at the time that they were, and would be, net positive. I do think that the emergence of Fauci as almost a national hero has been very helpful in getting people to listen to expert recommendations, even if this did come very late. This is a point on the side of getting most people to listen more and attack less. On the other hand, Lesswrong was overall better prepared because of their skepticism, so at the very least I was talking to the wrong crowd to defend them, and more likely should have been quicker to judge their actions as dangerous myself.
The FDA also surprised me with how badly they did, albeit the surprise was less severe because I had lower expectations. I thought they were getting less dangerous to US public health given the previous pushes to reduce regulation by the current administration. Scott Gottlieb was there for two years, and was probably the only Trump nominee I was actually super-happy about. Unfortunately, he left (a fact I wasn't paying attention to), and it turns out that the incompetence of a sequence of new directors and rapid changes left the FDA even less prepared than they would have been. I would have expected a doctrinaire Republican appointee to seize the opportunity of a crisis to reduce regulation, and instead it seems they did nothing but block critical testing work for months.
I've long considered myself skeptical of government agencies' abilities, and lean fairly heavily libertarian in many ways - albeit less than most others at lesswrong. I was still surprised by the level of ongoing, perhaps even malicious incompetence of the current administration. I'm still unclear if this is a Hanlon-dodge, or if they really have broken the US government so badly, so quickly. Other governments managed this far less poorly, so I'm unclear how generalizable the lesson is that governments are bad at everything. But I am glad I left the US.
Being a jerk commenting on a post attacking the CDC
Given that I'm posting a retrospective, there is a different type of mistake I made that I also need to address. In a lesswrong thread several months ago, there were a number of claims made about the CDC's response. I responded that I thought the post was an infohazard, would very plausibly lead to many more people dying, and as such, the posters should have asked for feedback from someone who could vet concerns about this, and that it should be taken down by site administrators. This was stupid, and I have apologized there, along with laying out what I hope is a fair analysis of what I know I did wrong, and what I still think I was correct about.
Speculation about Causes
There are lots of things I did wrong.
First, I think I was too close to the situation. I had spent a ton of time looking at the US's system specifically, and writing about the closely related topic of influenza pandemics in my dissertation, then doing work for Open Philanthropy on GCBRs. All of this was during the Obama administration. I left the US a bit after Trump was elected, partly for that reason, and worked on related topics that had less to do with US policy. I'd like to say that's why I didn't update, but to be honest, I think I was just being stupid in accepting my cached thoughts about the risk and best responses, instead of re-evaluating.
I also had too-strong priors and "expert" ideas to be properly fox-like in my predictions, and not quick enough to update about how things were actually going based on the data. Because I was slow to move from the base-rate, I underestimated the severity of COVID-19 for too long. I'm unsure how to fix that, since most of the time it's the right move, and paying attention to every new event is very expensive in terms of mental energy. (Suggestions welcome!)
I also gave too much weight to others' forecasts. Good Judgement's predictions were WAY optimistic about this early on, and I was not forecasting the question, but I was assuming that their aggregate guess was better than that of individuals, especially people who aren't forecasters. This is usually correct, but here it was a mistake. (I now think that superforecasting is materially worse than I hoped it would be at noticing rare events early.) I also followed the herd too much from expert circles, and my twitter feed from infectious disease epidemiology circles was behind even my slow self in recognizing that this was an incipient disaster back in March.
Conclusion
COVID-19 went badly in some places, and went disastrously in others. This was largely predictable, and I failed to notice early enough. (The US is in deep, deep trouble, and this will continue for quite a while longer, with myriad longer term effects on the global economy, and on global stability of other types.) I'm chastened about the poorly calibrated overconfidence of my expert opinion.
I'm also partly unsure what the best next steps are for better calibration. One key thing I did, several years ago, was explicitly try to rely more on other people's views in the rationality community to guide my decisions, and provide a clear source of feedback. I didn't do this as much as I should have in this case. (On the other hand, it was a large part of why I recognized the mistake as quickly as I did, albeit later than I could have - so it was at least a partial success.)
I'm hoping that this exercise is another way in which thinking through the situation gives me a valuable chance to reflect, and that I can get further feedback. I also hope that it's useful for others to perhaps learn from, but I'm unsure how transferable the lessons of my failures are.