# Why COVID-19 death predictions will always be wrong

Following is a transcript of the video.

Narrator: At the end of March, the White House announced that it was predicting somewhere between 100,000 and 240,000 US deaths from COVID-19, a huge drop from a couple of weeks earlier, when the predictions were more like a couple million deaths. This led to a lot of confusion about where the White House's numbers came from and why the predictions shifted.

So let's break down where those numbers are coming from. To make any prediction about how many people will die from a virus, you have to first know how many people will get it, and that's why this number keeps coming up.

Emily Ricotta: The basic reproductive number, or R0, is, in a totally susceptible population, how many people would one person go on to infect.

Narrator: This is Emily Ricotta, a research fellow with the US National Institutes of Health.

Ricotta: So, if you drop one infected person into a totally susceptible population, how many more cases are you gonna see?

Narrator: As you can probably imagine, R0 is pretty important for predicting how bad an outbreak will be. If R0 is less than one, over time you'll end up with fewer and fewer new cases and the disease will die out. If R0 is more than one, that's when you can start to run into some problems, depending on how severe the illness is. Even an R0 of 2 gets more than 1,000 people infected only nine links down the chain.
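The arithmetic behind that last figure can be sketched in a few lines of Python. This is only an illustration, under simplifying assumptions that are not from the video: every case infects exactly R0 others, nobody runs out of susceptible contacts, and infections happen in tidy generations. The function names are hypothetical.

```python
def infections_by_generation(r0, generations):
    """New infections at each link of the chain, starting from one case.

    Assumes (for illustration only) that every case infects exactly
    `r0` others and that the susceptible population never runs out.
    """
    return [r0 ** g for g in range(generations + 1)]


def cumulative_infections(r0, generations):
    """Total people infected so far, including the first case."""
    return sum(infections_by_generation(r0, generations))


if __name__ == "__main__":
    new_cases = infections_by_generation(2, 9)
    print(new_cases[-1])                  # 512 new cases at the ninth link
    print(cumulative_infections(2, 9))    # 1023 people infected in total

    # With R0 below one, each link is smaller than the last, so the
    # running total stays below 2 cases no matter how long the chain is.
    print(cumulative_infections(0.5, 50))
```

With an R0 of 2, the ninth link alone adds 512 new cases, and the running total including the first case is 1,023, which is where "more than 1,000" comes from.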

Ricotta: So, the common cold. If it has an R0 of 2, how does that change policy? It doesn't, right? But if I have a pathogen that has an R0 of 2 and it's killing 1%, 10%, 50% of the people it infects, I'm going to respond much differently.

Narrator: R0 should be pretty simple to calculate. It's based on three main things: transmissibility, or how likely it is that you'll be infected through contact with someone who has the disease; average rate of contact, or how many people the average infected person will come in contact with over time; and finally duration of infectiousness, which is just how long the person spreading the disease is contagious for. Getting those factors involves a whole bunch of calculus that takes into account things like how many people at any given time are susceptible to infection and how many are actually infected. This is what some of the simplest equations look like. Of course, just having the equations isn't enough.
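As a rough sketch of how those three factors combine, the Python below multiplies them to get R0 and then feeds the result into a basic SIR (susceptible, infected, recovered) model, one of the simplest compartmental models of this kind. The video does not say which equations it shows, so the choice of SIR, the function names, and every input value here are assumptions made up for illustration, not estimates for COVID-19.

```python
def basic_reproduction_number(transmissibility, contact_rate, infectious_days):
    """R0 as the product of the three factors described above.

    transmissibility: chance of infection per contact (0 to 1)
    contact_rate:     contacts per infected person per day
    infectious_days:  how long a person stays contagious
    """
    return transmissibility * contact_rate * infectious_days


def simulate_sir(population, initial_infected, beta, gamma, days, dt=0.1):
    """Step a basic SIR model forward with simple Euler updates.

    beta  = transmissibility * contact_rate (new infections per case per day)
    gamma = 1 / infectious_days (recoveries per case per day)
    Returns final (susceptible, infected, recovered) and the peak infected.
    """
    s = float(population - initial_infected)
    i = float(initial_infected)
    r = 0.0
    peak = i
    for _ in range(int(days / dt)):
        new_infections = beta * s * i / population * dt
        new_recoveries = gamma * i * dt
        s -= new_infections
        i += new_infections - new_recoveries
        r += new_recoveries
        peak = max(peak, i)
    return s, i, r, peak


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to make the arithmetic easy to follow.
    transmissibility = 0.03   # assumed chance of infection per contact
    contact_rate = 10         # assumed contacts per day
    infectious_days = 8       # assumed days of contagiousness

    r0 = basic_reproduction_number(transmissibility, contact_rate, infectious_days)
    print(f"R0 = {r0:.2f}")

    s, i, r, peak = simulate_sir(
        population=1_000_000,
        initial_infected=1,
        beta=transmissibility * contact_rate,
        gamma=1 / infectious_days,
        days=365,
    )
    print(f"share ever infected after a year: {r / 1_000_000:.0%}")
```

Halving any one input halves R0, which is why early guesses about each factor matter so much for the final prediction.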

Ricotta: The thing that I want to emphasize about R0 is that it is very specific to the time of the outbreak, the place, the population. So there's never really just one R0 for a pathogen.

Narrator: And those three factors? You just don't know that information with a new outbreak.

Ricotta: The earlier that you are building these models in an outbreak, the harder it is, because you have more educated guesses and data that's not as specific to the outbreak as you'd want.

Narrator: Normally, scientists can estimate things like transmissibility based on data from previous outbreaks, but for the early predictions of the spread of COVID-19, all scientists had were the numbers from Wuhan, China, along with the data we've collected about other types of coronaviruses that infect humans.

Ricotta: What we do with it, especially at the early stages of an outbreak, is that we take data from the endemic coronaviruses, and we take data from what we saw in SARS, we take data from what we saw in MERS, and we say, OK, let's see what happens if we make COVID, you know, spread the same as an endemic coronavirus. How many people does that infect? And then as we progress and we get more modern data, we start feeding that into the model and updating it as we go.

Narrator: The early estimates for R0, based on the initial outbreak in Wuhan, were 2.2 to 2.7, so more than the flu. Which brings us to the more detailed models for COVID-19 that we were talking about earlier.

Ricotta: R0 is gonna be different depending on where you are. So if I drop an infected person into the middle of New York City and I drop an infected person into the middle of rural America, R0s are gonna be two very different things because the number of people that are gonna come in contact with each other are very different.

Narrator: The report that predicted millions of deaths in the US alone, which was put together by researchers at Imperial College London, used 2.4 as the average R0 for the coronavirus. That was based on transmission rates reported early on in China. From other data, they estimated things like the percentage of cases where the patient needed to be hospitalized. They predicted that 30% of those hospitalized would need critical care, like a ventilator, based on the rates among early cases.