All of Hauke Hillebrandt's Comments + Replies

Is OpenAI gaming user numbers?

Gdoc here https://docs.google.com/document/d/1os0WNmJ-O1eEGeKr543nkemnXbTmYkE2sC-t51c9OE4/edit?tab=t.0

Some have questioned OpenAI's recent weekly user numbers:[1]

Feb '23: 100M[2]

Sep '24: 200M[3] of which 11.5M paid, Enterprise: 1M[4]

Feb '25: 400M[5] of which 15M paid (15.5M per [6]), Enterprise: 2M

One can see:

  • Surprisingly, increasingly faster user growth
  • While OpenAI converted 11.5M out of the first 200M users, they only got 3.5M users out of the most recent 200M to pay for ChatGPT
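A quick sketch of the arithmetic behind the second bullet, using the figures cited above (a back-of-the-envelope, not OpenAI's own accounting):

```python
# Marginal paid-conversion rate of ChatGPT user cohorts,
# from the weekly-user and paid-subscriber figures cited above.
def marginal_conversion(users_before, paid_before, users_after, paid_after):
    """Share of *new* users in a period who became paying subscribers."""
    new_users = users_after - users_before
    new_paid = paid_after - paid_before
    return new_paid / new_users

# First 200M users: 11.5M paid overall.
first_cohort = 11.5e6 / 200e6
# Sep '24 -> Feb '25: 200M new users, 15M - 11.5M = 3.5M new paid.
second_cohort = marginal_conversion(200e6, 11.5e6, 400e6, 15e6)

print(f"first 200M users:  {first_cohort:.2%}")   # 5.75%
print(f"next 200M users:   {second_cohort:.2%}")  # 1.75%
```

So the marginal conversion rate fell by roughly two thirds between the two cohorts.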

Where did that growth come from? It's not from ... (read more)

5Daniel Kokotajlo
This user growth seems neither surprising nor 'increasingly faster' to me. Isn't it just doubling every year?  That said, I agree based on your second bullet point that probably they've got some headwinds incoming and will by default have slower growth in the future. I imagine competition is also part of the story here.

Yes, good catch, this is based on research from the World Value Survey - I've added a citation.

I checked. It's 0.67.

 

This seems to come from European countries.

7Algon
Good point. I grabbed the dataset of gdp per capita vs life expectancy for almost all nations from OurWorldInData, log transformed GDP per capita and got a correlation of 0.85.
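A minimal sketch of that procedure — log-transform GDP per capita, then compute the Pearson correlation with life expectancy. The data below are synthetic, for illustration only; the real exercise uses the OurWorldInData country-level dataset:

```python
import numpy as np

# Synthetic stand-in for the OWID data: GDP per capita spanning
# ~$1k-$100k, with life expectancy roughly linear in log GDP plus noise.
rng = np.random.default_rng(0)
log_gdp = rng.uniform(np.log(1_000), np.log(100_000), size=150)
life_exp = 40 + 4 * log_gdp + rng.normal(0, 3, size=150)

# Pearson correlation between log GDP per capita and life expectancy.
r = np.corrcoef(log_gdp, life_exp)[0, 1]
print(f"Pearson r (log GDP vs life expectancy): {r:.2f}")
```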

Yeah, I actually do cite that piece in the appendix 'GDP as a proxy for welfare', where I list more literature like this. So yeah, it's not a perfect measure, but it's the one we have ('all models are wrong but some are useful'), and GDP is quite a powerful predictor of all kinds of outcomes: 

In a 2016 paper, Jones and Klenow used measures of consumption, leisure, inequality, and mortality to create a consumption-equivalent welfare measure that allows comparisons across time for a given country, as well as across countries.[6] 

This measure of huma... (read more)

2Algon
I checked the paper and it looks like they're comparing welfare by asking "how much more would person X from the US have to consume to move to another country i?" Which results in equations that say what the factor λ_i^simple should be in terms of differences in life expectancy, consumption, leisure and inequality. So I suppose it isn't surprising that it's quite correlated with GDP, given the individual correlations at play here, but I am surprised that it is so strongly correlated, since I'd expect e.g. life expectancy vs GDP to correlate at maybe 0.8[1]. Which is a fair bit weaker than a 0.96 correlation! 1. ^ I checked. It's 0.67. 

You can compute where energy is cheap, then send the results (e.g. weights, inference outputs) wherever needed.

But Amazon just bought a data center drawing on half a nuclear power plant (~1GW) in Pennsylvania, so maybe it doesn't make sense now.

2Matt Goldenberg
i don't think the constraint is that energy is too expensive? i think we just literally don't have enough of it concentrated in one place but i have no idea actually

cf

The Bootleggers and Baptists effect describes cases where an industry (the bootleggers) sides with prosocial actors like regulators (the Baptists) in pushing for more regulation (here, banning alcohol during Prohibition) to maximize profits and deter entry. This seems to be happening in AI, where the industry lobbies for stricter regulation. Yet in the EU, OpenAI lobbied to water down EU AI regulation so that GPT would not be classified as 'high risk', exempting it from stringent legal requirements.[1] In the US, the FTC recently said that Big Tech intimidates competition... (read more)

Hanson Strawmans the AI-Ruin Argument

 

I don't agree with Hanson generally, but I think there's something to the claim that rationalist AI risk public outreach has overemphasized first-principles thinking, theory, and logical possibilities (e.g. evolution, gradient descent, the human-chimp analogy) over more concrete, tangible empirical findings (e.g. deception emerging in small models, specification gaming, LLMs helping to create WMDs, etc.).

1dr_s
Specifics are just that - specifics. They depend on the details of any given technology, and insofar as no AI for now has the power to self-improve or even come up with complex plans to achieve its goals, they're not particularly relevant to AGI, which may even use a different architecture altogether. To me it seems like the arguments remain solid and general, the way, say, the rocket equation is, even if you don't specifically know what your propellant will be. And like for that time Oppenheimer & co. had to worry about the possibility of igniting the atmosphere, you can't just go "oh well, can't possibly work this out from theory alone, let's roll the dice and see".
3DaemonicSigil
I tend to agree with this, I was trying to gesture at the various kinds of empirical evidence we have in the paragraph mentioning Bing, not sure how successful that was. The situation is quite interesting, since Eliezer was writing about alignment before a lot of this evidence came in. So first-principles reasoning worked for him, at least to the point of predicting that there would be alignment issues, if not to the point of predicting the exact form those issues would take. So many rationalists (probably including me) tend to over-focus on theory, since that's how they learned it themselves from Eliezer's writings. But now that we have all these examples, we should definitely be talking about them and learning from them more.

AI labs should escalate the frequency of tests for how capable their model is as they increase compute during training

Comments on the doc welcome.

Inspired by ideas from Lucius Bushnaq, David Manheim, Gavin Leech, but any errors are mine.

— 

AI experts almost unanimously agree that AGI labs should pause the development process if sufficiently dangerous capabilities are detected. Compute, algorithms, and data form the AI triad—the main inputs to produce better AI. AI models work by using compute to run algorithms that learn from data. AI progresses due t... (read more)

ARC's GPT-4 evaluation is cited in the FT article, in case that was ambiguous.

2Ben Pace
Thanks, I was confused that I couldn't find it.

Agreed, the initial announcement read like AI safety washing, and more political action is needed; hence the call to action to improve this.

But read the taskforce leader’s op-ed

  1. He signed the pause AI petition.
  2. He cites ARC’s GPT-4 evaluation and Lesswrong in his AI report which has a large section on safety.
  3. “[Anthropic] has invested substantially in alignment, with 42 per cent of its team working on that area in 2021. But ultimately it is locked in the same race. For that reason, I would support significant regulation by governments and a pr
... (read more)
4Ben Pace
I wanted to double-check this.  The relevant section starts on page 94, "Section 4: Safety", and those pages cite in their sources around 10-15 LW posts for their technical research or overviews of the field and funding in the field. (Make sure to drag up the sources section to view all the links.) Throughout the presentation and news articles he also has a few other links to interviews with people on LW (Shane Legg, Sam Altman, Katja Grace).

Ian Hogarth, who is leading the task force, is on record saying that AGI could lead to “obsolescence or destruction of the human race” if there’s no regulation of the technology’s progress. 

Matt Clifford is also advising the task force; he is on record having said the same thing and knows a lot about AI safety. He had Jess Whittlestone & Jack Clark on his podcast. 

If mainstream AI safety is useful and doesn't increase capabilities, then the taskforce and the $125M seem valuable.

If it improves capabilities, then it's a drop in the bucket in terms of o... (read more)

Those names do seem like at least a bit of an update for me.

I really wish that having someone EA/AI-Alignment affiliated who has expressed some concern about x-risk was a reliable signal that a project will not end up primarily accelerationist, but alas, history has really hammered it in for me that that is not reliably true. 

Some stories that seem compatible with all the observations I am seeing: 

  • The x-risk concerned people are involved as a way to get power/resources/reputation so that they can leverage it better later on
  • The x-risk concerned pe
... (read more)
8sanxiyn
While I agree being led by someone who is aware of AI safety is a positive sign, I note that OpenAI is led by Sam Altman who similarly showed awareness of AI safety issues.

a large part of those 'leaks' are fake

 

Can you give concrete examples?

5gwern
Here's another source: a MS Bing engineer/manager admonishing Karpathy for uncritically posting the 'leaked prompt' and saying it's not 'the real prompt' (emphasis added).
7gwern
An example of plausible sounding but blatant confabulation was that somewhere towards the end there's a bunch of rambling about Sydney supposedly having a 'delete X' command which would delete all knowledge of X from Sydney, and an 'update X' command which would update Sydney's knowledge. These are just not things that exist for a LM like GPT-3/4. (Stuff like ROME starts to approach it but are cutting-edge research and would definitely not just be casually deployed to let you edit a full-scale deployed model live in the middle of a conversation.) Maybe you could do something like that by caching the statement and injecting it into the prompt each time with instructions like "Pretend you know nothing about X", I suppose, thinking a little more about it. (Not that there is any indication of this sort of thing being done.) But when you read through literally page after page of all this (it's thousands of words!) and it starts casually tossing around supposed capabilities like that, it looks completely like, well, a model hallucinating what would be a very cool hypothetical prompt for a very cool hypothetical model. But not faithfully printing out its actual prompt.

[Years of life lost due to C19]

A recent meta-analysis looks at C19-related mortality by age group in Europe and finds the following age distribution:

< 40: 0.1%

40-69: 12.8%

≥ 70: 84.8%

In this spreadsheet model I combine this data with Metaculus predictions to get at the years of life lost (YLLs) due to C19.

I find C19 might cause 6m - 87m YLLs (highly depending on # of deaths). For comparison, substance abuse causes 13m YLLs, diarrhea causes 85m YLLs.
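A minimal sketch of how such a YLL estimate combines the age shares above with remaining life expectancy. The remaining-years weights below are placeholder assumptions for illustration, not the spreadsheet's actual inputs:

```python
# Years of life lost (YLL): split total deaths across age bands using the
# meta-analysis shares above, then weight each band by an assumed average
# remaining life expectancy (placeholder values, not the model's inputs).
age_share = {"<40": 0.001, "40-69": 0.128, ">=70": 0.848}
remaining_years = {"<40": 45, "40-69": 20, ">=70": 8}  # assumed averages

def years_of_life_lost(total_deaths):
    return sum(total_deaths * age_share[a] * remaining_years[a]
               for a in age_share)

# With these placeholder weights, death tolls of ~0.64M and ~9.3M give
# roughly the 6M and 87M YLL endpoints quoted above.
low, high = years_of_life_lost(640_000), years_of_life_lost(9_300_000)
print(f"{low / 1e6:.1f}M - {high / 1e6:.1f}M YLLs")
```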

Countries often spend 1-3x GDP per capita to avert a DALY, and so the world might want to spend $2-... (read more)

Very good analysis.

I also thought your recent blog was excellent and think you should make it a top level post:

https://entersingularity.wordpress.com/2020/03/23/covid-19-vs-influenza/

Cruise ship passengers are a non-random sample, with perhaps higher co-morbidities.

The cruise ships analysed are a non-random sample: "at least 25 other cruise ships have confirmed COVID-19 cases"

Being on a cruise ship might increase your risk because of dose response https://twitter.com/robinhanson/status/1242655704663691264

Onboard IFR: 1.2% (0.38-2.7%) https://www.medrxiv.org/content/10.1101/2020.03.05.20031773v2

Ioannidis: “A whole country is not a ship.”

Thanks Pablo for your comment and helping to clarify this point. I'm sorry if I was being unclear.

I understand what you're saying. However:

  • I realize that the Oxford study did not collect any new empirical data that in itself should cause us to update our views.
  • The authors make the assumption that the IFR is low and the virus is widespread and find that it fits the present data just as well as high IFR and low spread. But it does not mean that the model is merely theoretical: the authors do fit the data on the current epidemic.
  • This is not differ
... (read more)
It looks more like you listed all the evidence you could find for the theory and didn't do anything else.

That was precisely my ambition here - as highlighted in the title ("The case for c19 being widespread"). I did not claim that this was an even-handed take. I wanted to consider the evidence for a theory that only very few smart people believe. I think such an exercise can often be useful.

I don't think this is actually how selection effects work.

The professor acknowledges that there are problems with self-selection, but given that the... (read more)

I do not think that can be used as decisive evidence to falsify wide-spread.

This is a non-random village in Italy, so of course, some villages in Italy will show very high mortality just by chance.

That region of Italy has high smoking rates, very bad air pollution, and the oldest age structure outside of Japan.

6Lukas_Gloor
It's extremely implausible that it would be 10x or 15x higher than what's expected for the typical Italian village. Besides, other villages like Cremona or Bergamo also seem to be close to those numbers. Smoking or age structure or air pollution doesn't give you a 10x update. UPDATE: Wow, I was totally wrong about those being villages. As Stefan Schubert pointed out, those are cities and provinces with tens and hundreds of thousands of inhabitants!
By the end of its odyssey, a total of 712 of them tested positive, about a fifth.

Perhaps others on the ship had already cleared the virus and were asymptomatic. PCR only works for about a week. Also, there might have been false negatives. I disagree that the age and comorbidity structure can only skew results by a factor of two or three, because this assumes that there are few asymptomatic infections (I'm arguing here that the age tables are wrong).

In my post, I've argued why the data out of China might be wrong.

Iceland's data might be wrong because it is based on PCR not serology, which means that many people might have already cleared the infection, and it is also not random.

7Pablo
That's the Grand Princess, not the Diamond Princess.

That's true and that's what they were criticized for.

They argued that the current data we observe can also be explained by low IFR and widespread infection. They called for widespread serological testing to see which hypothesis is correct.

If in the next few weeks we see a high percentage of people with antibodies, then it's true.

In the meantime, I thought it might be interesting to see what other evidence there is for infection being widespread, which would suggest that IFR is low.

I really appreciate your attempt to summarize this literature. But it seems you still believe that the Oxford paper provides evidence in favor of very low IFR, when in fact others are claiming that this is merely an assumption of their model, and that this assumption was made not because the authors believe it is plausible but simply for exploratory purposes. If this is correct (I haven't myself read the paper, so I can only defer to others), then the reputation or expertise of the authors is evidentially irrelevant, and shouldn't cause you to update in the direction of the very low IFR. (Of course, there may be independent reasons for such an update.)

No. My ambition here was a bit simpler. I have presented a rough qualitative argument that infection is already widespread, and only a toy model. There are some issues with this and I haven't done formal modelling. For instance, this would be what would be called the "crude IFR", I think, but the time-lag-adjusted IFR (~30 days from infection to death) might increase the death toll.

Currently, also every death in Italy where coronavirus is detected is recorded as a C19 death.

FWIW, if UK death toll will surpass 10,000, then this wouldn't fit very well with this hypothesis here.

FWIW, if UK death toll will surpass 10,000, then this wouldn't fit very well with this hypothesis here.

The UK death toll currently stands at 10,612 according to:

https://www.worldometers.info/coronavirus/country/uk/

@Hauke Hillebrandt

FWIW, if UK death toll will surpass 10,000, then this wouldn't fit very well with this hypothesis here.

If this update works then I feel like just looking at how the numbers in Italy came together would change your mind about the low-IFR hypothesis.

Alternatively, if the Covid-19 deaths in NY state go above 3,333 in the first week of April, that seems like it would also falsify the hypothesis. (NY state has fewer than one third the population of the UK.) Unfortunately I think this is >80% to happen.

5Matthew Barnett
I think what ignoranceprior was originally asking was, given all the information you know, what is your best estimate of the infection fatality rate? Best estimate in this case implies adjusting for ways that some research can be wrong, and taking into account the rebuttals you've read here.
The point remains: given that some people have such a different theory, it's unclear how many supporting pieces of evidence you should expect to see, and it's important to compare the evidence against the theory to the evidence for it.

Yes, that's what I'm trying to do here. I feel this is a neglected take and on the margin more people should think about whether this theory is true, given the stakes.

Presumably some of these people are hypochondriacs or have the flu? Also, I bet people with symptoms are more likely to use the app.
With al
... (read more)
6DanielFilan
It looks more like you listed all the evidence you could find for the theory and didn't do anything else. I don't think this is actually how selection effects work. Those people are less famous so you wouldn't necessarily hear about them. That the asymptomatic rate isn't all that high, and in at least one population where everybody could get a test, you don't see a big fraction of the population testing positive.

I'm not impressed by the comment about this paper here on LW or the twitter link in it.

This paper was written by an international team of highly cited disease modellers who know about the Diamond Princess and have put their reputation on the line to make the case that the hypothesis of high infection rates and low infection fatality might be true.

I think it is a realistic range that this many people are already infected and are asymptomatic. Above I've tried to summarize and review the relevant evidence that fits with this hypothesis.

But I'm not ruling out the more common theory (that we have maybe only 10x the 500k confirmed cases). I just find it less likely.

This paper was written by an international team of highly cited disease modellers who know about the Diamond Princess and have put their reputation on the line to make the case that the hypothesis of high infection rates and low infection fatality might be true.

Yes, but when you actually read the paper (I read some parts), it says that their model is based on an assumption of low IFR, and in itself did not argue for low IFR (feel free to prove me wrong here).

There were a few dengue cases in Australia and Florida, where it is unusual

Dengue "popping up in unusual places", makes me think that it's more likely that massive Dengue outbreaks in Latin America might have a high proportion of C19.

One person had a persistent negative swab, but tested positive through fecal samples...
“Chinese journalists have uncovered other cases of people testing negative six times before a seventh test confirmed they had the disease.”

This is just to lend credence to the paper that shows there had been 2 million inf... (read more)

This seems pretty hard to evaluate because with a large number of published pre-prints on the outbreak, it's not very surprising that there would be many suggesting higher-than-expected spread.

No, this is different. I'm not just cherry picking the tail-end of a normal distribution of IFRs etc. The Gupta study in particular and some of the other studies suggest a fundamentally different theory of the pandemic.

Presumably some of these people are hypochondriacs or have the flu? Also, I bet people with symptoms are more likely to use the app.

Yes, but... (read more)

3DanielFilan
The point remains: given that some people have such a different theory, it's unclear how many supporting pieces of evidence you should expect to see, and it's important to compare the evidence against the theory to the evidence for it. With all due respect it's not that hard to get data that you yourself find convincing, even if you're a professor. They do meet more different populations of people though. So if a small number of cities have relatively widespread infection, people who visit many cities are unusually likely to get infected. Not likely. About 1% of Icelanders without symptoms test positive, and all the stats on which tested people are asymptomatic that I've seen (Iceland, Diamond Princess) give about 1/2 asymptomatic at time of testing (presumably many later get sick).

If the Gupta study is right, then a rough approximation (ignoring lag) would be:

IFR = Number of UK deaths (~750) / 36-68% of the UK population (66 million).

So 0.002% to 0.003%.

In Italy, with almost 10k deaths it would be 0.02%-0.04%
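The back-of-the-envelope above, as a sketch (population figures rounded, ~60M assumed for Italy; the infection-to-death lag is ignored):

```python
# Naive crude IFR implied by the widespread-infection scenario:
# deaths so far divided by the assumed share of the population infected.
def crude_ifr(deaths, population, attack_rate):
    return deaths / (population * attack_rate)

# UK: ~750 deaths, 66M people, 36-68% assumed infected.
uk_low  = crude_ifr(750, 66e6, 0.68)   # ~0.002% after rounding
uk_high = crude_ifr(750, 66e6, 0.36)   # ~0.003% after rounding

# Italy: ~10k deaths, ~60M people (assumed), same attack-rate range.
italy_low  = crude_ifr(10_000, 60e6, 0.68)  # ~0.02%
italy_high = crude_ifr(10_000, 60e6, 0.36)  # ~0.05%

print(f"UK:    {uk_low:.4%} - {uk_high:.4%}")
print(f"Italy: {italy_low:.3%} - {italy_high:.3%}")
```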

In the province of Lodi (part of Lombardy), 388 people were reported to have died of Covid-19 on 27 March. Lodi has a population of 230,000, meaning that 0.17% of _the population_ of Lodi has died. Given that hardly everyone has been infected, the IFR must be higher.

The same source reports that in the province of Cremona (also part of Lombardy), 455 people had died of Covid-19 on 27 March. Cremona has a population of 360,000, meaning that 0.126% of the population of Cremona has died, according to official data.

Note also that there are reports of substantial un... (read more)

3Lukas_Gloor
There's an Italian village where 0.1% of the population already died with a confirmed diagnosis of Covid-19. Inferring from typical monthly death rates it's also estimated that the twice as many people died from Covid-19 in that village without an official diagnosis. There's a bunch of uncertainty about those additional 0.2%, but it would put the fatality rate at 0.3% already. And those figures are from 4 days ago (edit: 6 days ago actually). Edit: It's a province and city(!), not a village.
7ignoranceprior
If the IFR is indeed .003% (the upper end of your range), then assuming the worst case scenario that 100% of the population of the UK gets infected eventually, only .003%*66.4 million = approx 2000 people will die total. Would you consider the theory falsified if the death toll in the UK surpasses 2000?
7ignoranceprior
I'm confused why you assume that 36-68% of the population in the UK is infected. I thought, based on comments here, that those numbers were the output of a model that made highly optimistic assumptions about IFR, not an attempt at estimating the actual proportion of infections. Do you think this is a realistic range for the proportion already infected in the UK?

from supplementary materials:

"DISCLAIMER: The following estimates were computed using 2010 US Census data with 2016 population projections and the percentages of clinical cases and mortality events reported in Mainland China by the Chinese Center for Disease Control as of February 11th, 2020. CCDC Weekly / Vol. 2 / No. 8, page 115, Table 1. The following estimates represent a worst-case scenario, which is unlikely to materialize. • Maximum number of symptomatic cases = 34,653,921 • Maximum number of mild cases = 28,035,022 • Maxim... (read more)

And yet another preprint estimating the R0 to be 26.5:

Quotes from paper:

"The size of the COVID-19 reproduction number documented in the literature is relatively small. Our estimates indicate that R0= 26.5, in the case that the asymptomatic sub-population is accounted for. In this scenario, the peek of symptomatic infections is reached in 36 days with approximately 9.5% of the entire population showing symptoms, as shown in Figure 3."

I think they estimate about 1 million severe cases in the US alone if left unchecked at the peak.

"It is unlike... (read more)

1Hauke Hillebrandt
from supplementary materials: "DISCLAIMER: The following estimates were computed using 2010 US Census data with 2016 population projections and the percentages of clinical cases and mortality events reported in Mainland China by the Chinese Center for Disease Control as of February 11th, 2020. CCDC Weekly / Vol. 2 / No. 8, page 115, Table 1. The following estimates represent a worst-case scenario, which is unlikely to materialize. • Maximum number of symptomatic cases = 34,653,921 • Maximum number of mild cases = 28,035,022 • Maximum number of severe cases = 4,782,241 • Maximum number of critical cases = 1,628,734 • Maximum number of deaths = 3,439,516" https://drive.google.com/drive/folders/18qaRKnQG1GoXamnzJwkHu2GG9xCe4w8_

And another preprint saying there were +700k cases in China on 13th of March:

"Since severe cases, which more likely lead to fatal outcomes, are detected at a higher percentage than mild cases, the reported death rates are likely inflated in most countries. Such under-estimation can be attributed to under-sampling of infection cases and results in systematic death rate estimation biases. The method proposed here utilizes a benchmark country (South Korea) and its reported death rates in combination with population demographics to correct the reported CO... (read more)

5Lukas_Gloor
From the paper: Note that South Korea's reported (naive) CFR is at >1% by now. It's possible that the authors adjusted for the fact that most of South Korea's cases were still active at the time of writing (about 55-60% of cases are still active now, I think), but I don't see this in this paper. It probably doesn't make a huge difference, but still relevant that this could cause the estimates to be a bit too low.
4Lukas_Gloor
From the paper: Am I right that they're not factoring in that patients had worse prospects in Wuhan than in South Korea? I feel like whatever the outcome of their adjustment process, that value would need to be multiplied by a factor >1 which represents hospital overstrain in Hubei, where at least 60% of China's numbers stem from (probably more but I haven't looked it up). I don't know how large that adjustment should be exactly, but I find it weird that there's no discussion of this. Am I missing something about the methodology (maybe it factors in such differences automatically somehow)? Ah, OK: They list this as an assumption: This is important to keep in mind when we try to derive implications from their estimate. Especially if we look at the hospitalization rates estimated here on page 5. For this disease in particular, where people sometimes have to stay in hospitals for several weeks, it's hard to imagine that treatment only makes a small difference.

New editorial about the asymptomatic rate in Nature - the authors of the preprint above are featured in this as well. They say the asymptomatic and mild case rate might be up to 50% of all infections and that these people are infectious.

1Hauke Hillebrandt
And yet another preprint estimating the R0 to be 26.5: Quotes from paper: "The size of the COVID-19 reproduction number documented in the literature is relatively small. Our estimates indicate that R0= 26.5, in the case that the asymptomatic sub-population is accounted for. In this scenario, the peek of symptomatic infections is reached in 36 days with approximately 9.5% of the entire population showing symptoms, as shown in Figure 3." I think they estimate about 1 million severe cases in the US alone if left unchecked at the peak. "It is unlikely that a pathogen that blankets the planet in three months can have a basic reproduction number in the vicinity of 3, as it has been reported in the literature (19–24). SARS-CoV-2 is probably among the most contagious pathogens known. Unlike the SARS-CoV epidemic in 2003 (25), where only symptomatic individuals were capable of transmitting the disease. Asymptomatic carriers of the COVID-19 virus are most likely capable of transmission to the same degree as symptomatic." "This study shows that the population of individuals with asymptomatic COVID-19 infections are driving the growth of the pandemic. The value of R0 we calculated is nearly one order of magnitude larger than the estimates that have been communicated in the literature up to this point in the development of the pandemic"
2Hauke Hillebrandt
And another preprint saying there were +700k cases in China on 13th of March: "Since severe cases, which more likely lead to fatal outcomes, are detected at a higher percentage than mild cases, the reported death rates are likely inflated in most countries. Such under-estimation can be attributed to under-sampling of infection cases and results in systematic death rate estimation biases. The method proposed here utilizes a benchmark country (South Korea) and its reported death rates in combination with population demographics to correct the reported COVID-19 case numbers. By applying a correction, we predict that the number of cases is highly under-reported in most countries. In the case of China, it is estimated that more than 700.000 cases of COVID-19 actually occurred instead of the confirmed 80,932 cases as of 3/13/2020." This also implies a lower CFR than previously thought: perhaps less than 0.5% (3k deaths in China / 700k actual cases).

As mentioned in a comment above, one of the (pretty highly credentialed) authors of this preprint has written two papers on the Diamond Princess, and so, excuse the appeal to authority, but any argument against this paper based on the Diamond Princess doesn't seem likely to invalidate the conclusions of this preprint.

Also this seemingly squares more with John Ioannidis take on Corona:

"no countries have reliable data on the prevalence of the virus in a representative random sample of the general population."

And that airborn-ish transmission ... (read more)

Also this seemingly squares more with John Ioannidis take on Corona:

Ioannidis makes this claim:

Projecting the Diamond Princess mortality rate onto the age structure of the U.S. population, the death rate among people infected with Covid-19 would be 0.125%.

I don't find a source for this. The adjustments I saw looked different. If he's right about those 0.125%, that would be an important update!

But it feels more plausible to me that the 0.125% thing went wrong somewhere because it just seems ruled out by South Korea, which unlike European countri... (read more)

1Lukas_Gloor
Interesting, I wasn't aware of that! Makes me upshift that I was wrong, but also upshift that one author is responsible for several studies that I found dubious. I looked through his list of publications and it seems he finished 2 papers on the prevalence of asymptomatic cases on the Diamond princess already (but not on fatality rates from there!). And the second one reports a point estimate that is outside the 95% confidence interval of the first paper, yet I don't see any addendum to the first paper. This seems kind of odd? I don't have strong views on that. The only thing I feel confident about is that an IFR of below 0.5% seems extremely implausible.

Not sure: the Diamond Princess is mentioned in this preprint and in fact one of the authors of this preprint wrote two papers on the Diamond Princess:

https://scholar.google.com/citations?hl=en&user=OW5PDVgAAAAJ&view_op=list_works&sortby=pubdate

So I think they thought about this.

1Lukas_Gloor
They don't mention Diamond Princess IFR estimates in their paper, though. In fact, the study doesn't cite other studies on IFR estimates for SARS-CoV-2 at all. I don't get what's going on when someone writes a paper with a conclusion that's 5-10x lower than all the other estimates before, but instead of including a discussion on why this might be the case or how it might fit with apparently contradictory data points (e.g., the cruise ship IFR or South Korea's IFR), they just move on to the next paper. Credentials or not, I find that process pretty dubious. I realize that there's an implicit hypothesis in the paper that "because transmission is stronger than we thought, others might have underestimated the number of mild or asymptomatic cases." Okay, but that hypothesis is contradicted by data points he must be aware of (as you say, he wrote papers on the cruise ship). Why is there no discussion on this?

The first paper that I cite has a very illustrative video and is a seminal paper in this field.

Table 8 in the review paper that you refer to shows a trend of estimation techniques getting better over time. In the latest study from 5 years ago the mean error was down to 6.47.

My broader point is:

  • the error rate might be brought down even further by better methods, video quality, and priors
  • this might make it a valid proxy for fever
  • this might be very cost-effective at the population level, given the zero marginal cost of software

However, I do agree that this is not trivial.

That's false. The accuracy isn't high. I learned from the last conversation I had with an EA who had a startup that did this that the accuracy isn't high enough to be useful medically.

Interesting data point - there are several papers on this that say it's a reliable way to measure heart rate (error under 10 bpm; see "Heart rate estimation using facial video"). Perhaps this could be brought down much further by throwing more engineering brains, computation, and priors at it.
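For intuition, a minimal sketch of the frequency-domain idea behind these papers (remote photoplethysmography): the mean green-channel intensity of the face region oscillates faintly with the pulse, so the dominant frequency in the plausible heart-rate band gives beats per minute. A synthetic 72 bpm trace stands in for a real face recording here:

```python
import numpy as np

# Synthetic green-channel trace: 20 s of video at 30 fps with a faint
# 72 bpm pulse component buried in sensor noise.
fps, seconds = 30, 20
t = np.arange(fps * seconds) / fps
true_bpm = 72
trace = 0.05 * np.sin(2 * np.pi * (true_bpm / 60) * t)
trace += 0.01 * np.random.default_rng(1).normal(size=t.size)

def estimate_bpm(x, fps, band=(0.7, 4.0)):
    """Return the dominant frequency (in bpm) within the heart-rate band."""
    freqs = np.fft.rfftfreq(x.size, d=1 / fps)
    power = np.abs(np.fft.rfft(x - x.mean())) ** 2
    mask = (freqs >= band[0]) & (freqs <= band[1])
    return 60 * freqs[mask][np.argmax(power[mask])]

print(f"estimated: {estimate_bpm(trace, fps):.1f} bpm")
```

Real videos need a face detector, motion compensation, and illumination normalization on top of this, which is where the accuracy debate above comes in.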

Where do those ≥38°C come from? From what I read
... (read more)
2ChristianKl
The first paper you cite for measuring heart rate is of such low quality that it didn't pass peer review. They had only 18 subjects, did PCR and did their prediction on their training data. Table 8 in "Heart rate estimation using facial video" suggests that all of the reviewed studies had a mean error higher than the 5 bpm that the authors call an acceptable error margin in a dynamic scenario.

I had this idea below and pitched it to OpenAI - they said "we looked into this and don't think we can do a great job with it :(" - but perhaps people here might be interested in exploring it further.

Idea for a zero-marginal-cost digital thermometer to help contain coronavirus:

  1. Heart rate can be estimated via (webcam or smartphone) video of someone’s face with high accuracy (even with poor video quality).[1],[2]
  2. This heart rate might then be used to detect fever[3] (perhaps even to estimate core temperature).[4]  priors such as d
... (read more)
3ChristianKl
That's false. The accuracy isn't high. I learned from the last conversation I had with an EA who had a startup that did this that the accuracy isn't high enough to be useful medically. I'll send you the contact in a message given that it's likely who you want to talk to when you want to pursue this further. Where do those ≥38°C come from? From what I read, the Chinese are using 37.3°C as a cut-off for medical decision making with COVID-19.