Was Sturgis a Covid-19 Superspreader Event?: Evidence Suggests That It May Well Have Been

A.  Introduction

The Sturgis Motorcycle Rally is an annual 10-day event for motorcycle enthusiasts (in particular of Harley-Davidsons), held in Sturgis, a normally small town in far western South Dakota.  It was held again this year, from August 7 to August 16, despite the Covid-19 pandemic, and drew an estimated 460,000 participants.  Motorcyclists gather from around the country for lots of riding, lots of music, and lots of beer and partying.  And then they go home.  Cell phone data indicate that fully 61% of all the counties in the US were visited by someone who attended Sturgis this year.

Due to the pandemic, the town debated whether to host the event this year.  But after some discussion, it was decided to go ahead.  And it is not clear that town officials could have stopped it even if they wanted.  Riders would likely have shown up anyway.

Despite the ongoing Covid-19 pandemic, masks were rarely seen.  Indeed, many of those attending were proud in their defiance of the standard health guidelines that masks should be worn and social distancing respected, and especially so in such crowded events.  T-shirts were sold, for example, declaring “Screw Covid-19, I Went to Sturgis”.

Did Sturgis lead to a surge in Covid-19 cases?  Unfortunately, we do not have direct data on this because the identification of the possible sources of someone’s Covid-19 infection is incredibly poor in the US.  There is little investigation of where someone might have picked up the virus, and far from adequate contact tracing.  And indeed, even those who attended the rally and later came down with Covid-19 found that their state health officials were often not terribly interested in whether they had been at Sturgis.  The systems were simply not set up to incorporate this.  And those attending who were later sick with the disease were also not always open about where they had been, given the stigma.

One is therefore left only with anecdotal cases and indirect evidence.  Recent articles in the Washington Post and the New York Times were good reports, but could only cover a number of specific, anecdotal, cases, as well as describe the party environment at Sturgis.  One can, however, examine indirect evidence.  It is reasonable to assume that those motorcycle enthusiasts who had a shorter distance to get to Sturgis from their homes would be more likely to go.  Hence near-by states would account for a higher share (adjusted for population) of those attending Sturgis and then returning home than would be the case for states farther away.  If so, then if Covid-19 was indeed spread among those attending Sturgis, one would see a greater degree of seeding of the virus that causes Covid-19 in the near-by states than would be the case among states that are farther away.  And those near-by states would then have more of a subsequent rise in Covid-19 cases as the infectious disease spread from person to person than one would see in states further away.

This post will examine this, starting with the chart at the top of this post.  As is clear in that chart, by early November states geographically closer to Sturgis had far higher cases of Covid-19 (as a share of their population) than those further away.  And the incidence fell steadily with geographic distance, in a relationship that is astonishingly tight.  Simply knowing the distance of the state from Sturgis would allow for a very good prediction (relative to the national average) of the number of daily new confirmed cases of Covid-19 (per 100,000 of population) in the 7-day period ending November 6.

A first question to ask is whether this pattern developed only after Sturgis.  If it had been there all along, including before the rally was held, then one cannot attribute it to the rally.  But we will see below that there was no such relationship in early August, before the rally, and that it then developed progressively in the months following.  This is what one would expect if the virus had been seeded by those returning from Sturgis, who then may have given this infectious disease to their friends and loved ones, to their co-workers, to the clerks at the supermarkets, and so on, and then each of these similarly spreading it on to others in an exponentially increasing number of cases.

To keep things simple in the charts, we will present them in a standard linear form.  But one may have noticed in the chart above that the line in black (the linear regression line) that provides the best fit (in a statistical sense) for a straight line to the scatter of points, does not work that well at the two extremes.  The points at the extremes (for very short distances and very long ones) are generally above the curve, while the points are often below in the middle range.  This is the pattern one would expect when what matters to the decision to ride to the rally is not some increment for a given distance (of an extra 100 miles, say), but rather for a given percentage increase (an extra 10%, say).  In such cases, a logarithmic curve rather than a straight (linear) line will fit the data better, and we will see below that indeed it does here.  And this will be useful in some statistical regression analysis that will examine possible explanations for the pattern.

It should be kept in mind, however, that what is being examined here are correlations, and with correlations alone one cannot say with certainty that the cause was the Sturgis rally.  And we obviously cannot run this experiment over repeatedly in a lab, under varying conditions, to see whether the result would always follow.

Might there be some other explanation?  Certainly there could be.   Probably the most obvious alternative is that the surge in Covid-19 cases in the upper mid-west of the US between September and early November might have been due to the onset of cold weather, where the states close to Sturgis are among the first to turn cold as winter approaches in the US.  We will examine this below.  There is, indeed, a correlation, but also a number of counter-examples (with states that also turned colder, such as Maine and Vermont, that did not see such a surge in cases).  The statistical fit is also not nearly as good.

One can also examine what happened across the border in the neighboring provinces of Canada.  The weather there also turned colder in September and October, and indeed by more than in the upper mid-west of the US.  Yet the incidence of Covid-19 cases in those provinces was far less.

What would explain this?  The answer is that it is not cold weather per se that leads to the virus being spread, but rather cold weather in situations where socially responsible behavior is not being followed – most importantly mask-wearing, but also social distancing, avoidance of indoor settings conducive to the spread of the virus, and so on.  As examined in the previous post on this blog, mask-wearing is extremely powerful in limiting the spread of the virus that causes Covid-19.  But if many do not wear masks, for whatever reason, the virus will spread.  And this will be especially so as the weather turns colder and people spend more time indoors with others.

This could lead to the results seen if states that are geographically closer to Sturgis also have populations that are less likely to wear masks when they go out in public.  And we will see that this was likely indeed a factor.  For whatever reason (likely political, as the near-by states are states with high shares of Trump supporters), states geographically close to Sturgis have a generally lower share of their populations regularly wearing masks in this pandemic.  But the combination of low mask-wearing and falling temperatures (what statisticians call an interaction effect) was supplemental to, and not a replacement of, the impact of distance from Sturgis.  The distance factor remained highly significant and strong, including when controlling for October temperatures and mask-wearing, consistent with the view that Sturgis acted as a seeding event.

This post will take up each of these topics in turn.

B.  Distance to Sturgis vs. Daily New Cases of Covid-19 in the Week Ending November 6

The chart at the top of this post plots the average daily number of confirmed new cases of Covid-19 over the 7-day period ending November 6 in a state (per 100,000 of population), against the distance to Sturgis.  The data for the number of new cases each day was obtained from USAFacts, which in turn obtained the data from state health authorities.  The data on distance to Sturgis was obtained from the directions feature on Google Maps, with Sturgis being the destination and the trip origin being each of the 48 states in the mainland US (Hawaii and Alaska were excluded), plus Washington, DC.  Each state was simply entered (rather than a particular address within a state), and Google Maps then defaulted to a central location in each state.  The distance chosen was then for the route recommended by Google, in miles and on the roads recommended.  That is, these are trip miles and not miles “as the crow flies”.

When this is done, with a regular linear scale used for the mileage on the recommended routes, one obtains the chart at the top of this post.  For the week ending November 6, those states closest to Sturgis saw the highest rates of Covid-19 new cases (130 per 100,000 of population in South Dakota itself, where Sturgis is in the far western part of the state, and 200 per 100,000 in North Dakota, where one should note that Sturgis is closer to some of the main population centers of North Dakota than it is to some of the main population centers of South Dakota).  And as one goes further away geographically, the average daily number of new cases falls substantially, to only around one-tenth as much in several of the states on the Atlantic.

The model is a simple one:  The further away a state is from Sturgis, the lower its rate (per 100,000 of population) of Covid-19 new cases in the first week of November.  But it fits extremely well even though it looks at only one possible factor (distance to Sturgis).  The straight black line in the chart is the linear regression line that best fits, statistically, the scatter of points.  A statistical measure of the fit is called the R-squared, which varies between 0% and 100% and measures what share of the variation observed in the variable shown on the vertical axis of the chart (the daily new cases of Covid-19) can be predicted simply by knowing the regression line and the variable shown on the horizontal axis (the miles to Sturgis).

The R-squared for the regression line calculated for this chart was surprisingly high, at 60%.  This is astonishing.  It says that if all we knew was this regression line, then we could have predicted 60% of the variation in Covid-19 cases across states in the week ending November 6 simply by knowing how far the states are from Sturgis.  States differ in numerous ways that will affect the incidence of Covid-19 cases in their territory.  Yet here, if we know just the distance to Sturgis, we can predict 60% of how Covid-19 incidence will vary across the states.  Regressions such as these are called cross-section regressions (the data here are across states), and such R-squares are rarely higher than 20%, or at most perhaps 30%.
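For readers who want to see the mechanics, the following is a minimal sketch of how a regression line and its R-squared are computed.  The distances and case rates in it are made up for illustration and are not the data behind the chart; the function names are likewise just illustrative.

```python
# Illustrative only: the (miles, cases) pairs below are hypothetical,
# not the state data used in the post.

def fit_line(xs, ys):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def r_squared(xs, ys, intercept, slope):
    """Share of the variation in ys accounted for by the fitted line."""
    mean_y = sum(ys) / len(ys)
    ss_total = sum((y - mean_y) ** 2 for y in ys)
    ss_resid = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    return 1.0 - ss_resid / ss_total


miles = [100, 400, 700, 1000, 1300, 1600, 1900]  # hypothetical trip miles
cases = [150, 90, 70, 40, 35, 20, 25]            # hypothetical daily cases per 100,000

intercept, slope = fit_line(miles, cases)
r2 = r_squared(miles, cases, intercept, slope)
print(f"slope = {slope:.4f} cases per mile, R-squared = {r2:.1%}")
```

An R-squared of 100% would mean every point sits exactly on the line, while 0% would mean the line predicts no better than the simple average across states.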

But as was discussed above in the introduction, trip decisions involving distances often work better (fit the data better) when the scale used is logarithmic.  On a logarithmic scale, what enters into the decision to make the trip or not is not some fixed increment of distance (e.g. an extra 100 miles) but rather some proportional change (e.g. an extra 10%).  A statistical regression can then be estimated using the logarithms of the distances, and when this estimated line is re-calculated back on to the standard linear scale, one will have the curve shown in blue in the chart:

The logarithmic (or log) regression line (in blue) fits the data even better than the simple linear regression line (in black), including at the two extremes (very short and very long distances).  And the R-squared rises to 71% from the already quite high 60% of the linear regression line.  The only significant outlier is North Dakota.  If one excludes North Dakota, the R-squared rises to 77%.  These are remarkably high for a cross-section analysis.
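As a rough sketch of the comparison between the two fits, one can regress the same case rates first on the miles themselves and then on the natural logarithm of the miles, and compare the R-squared of each.  The numbers below are hypothetical, chosen only to fall steeply at short distances and flatten out at long ones; they are not the state data.

```python
import math


def simple_r_squared(xs, ys):
    """R-squared of a one-variable least squares fit (the squared correlation)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    return sxy ** 2 / (sxx * syy)


# Hypothetical data: steep decline near the origin, flattening further out.
miles = [100, 200, 400, 800, 1200, 1600, 2000]
cases = [160, 125, 95, 60, 45, 32, 25]

r2_linear = simple_r_squared(miles, cases)
r2_log = simple_r_squared([math.log(m) for m in miles], cases)
print(f"linear scale: {r2_linear:.1%}   log scale: {r2_log:.1%}")
```

When the points curve in this way, the fit on the log scale comes out higher, which is the same qualitative pattern as the rise from 60% to 71% reported here.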

This simple model therefore fits the data well, indeed extremely well.  But there are still several issues to consider, starting with whether there was a similar pattern across the states before the Sturgis rally.

C.  Distance to Sturgis vs. Daily New Cases of Covid-19 in the Week Ending August 6, and the Progression in Subsequent Months

The Sturgis rally began on August 7.  Was there possibly a similar pattern as that found above in Covid-19 cases before the rally?  The answer is a clear no:

In the week ending August 6, the relationship of Covid-19 cases to distance from Sturgis was about as close to random as one can ever find.  If anything, the incidences of Covid-19 cases in the 10 or so states closest to Sturgis were relatively low.  And for all 48 states of the Continental US (plus Washington, DC), the simple linear regression line is close to flat, with an R-squared of just 0.4%.  This is basically nothing, and is in sharp contrast to the R-squared for the week ending November 6 of 60% (and 71% in logarithmic terms).

One should also note the magnitudes on the vertical scale here.  They range from 0 to 40 cases (per 100,000 of population) per day in the 7-day period.  In the chart for cases in the 7-day period ending on November 6 (as at the top of this post), the scale goes from 0 to 200.  That is, the incidence of Covid-19 cases was relatively low across US states in August (relative to what it was later in parts of the US).  That then changed in the subsequent months.  Furthermore, one can see in the charts above for the week ending November 6 that the states further than around 1,400 miles from Sturgis still had Covid new case rates of 40 per day or less.  That is, the case incidence rates remained in that 0 to 40 range between August and early November for the states far from Sturgis.  The states where the rates rose above this were all closer to Sturgis.

There was also a steady progression in the case rates in the months from August to November, focused on the states closer to Sturgis, as can be seen in the following chart:

Each line is the linear regression line found by regressing the number of Covid-19 cases in each state (per 100,000 of population) for the week ending August 6, the week ending September 6, the week ending October 6, and the week ending November 6, against the geographic distance to Sturgis.  The regression lines for the week ending August 6 and the week ending November 6 are the same as discussed already in the respective charts above.  The September and October ones are new.

As noted before, the August 6 line is essentially flat.  That is, the distance to Sturgis made no difference to the number of cases, and they are also all relatively low.  But then the line starts to twist upwards, with the right end (for the states furthest from Sturgis) more or less fixed and staying low, while the left end rotated upwards.  The rotation is relatively modest for the week ending September 6, is more substantial in the month later for the week ending October 6, and then the largest in the month after that for the week ending November 6.  This is precisely the path one would expect to find with an exponential spread of an infectious disease that has been seeded but then not brought under effective control.
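A stylized sketch of why the lines rotate in this way: if two states start with different numbers of seeded cases but the same unchecked growth rate, the gap between them in absolute terms widens with every generation of infections.  The seed sizes and the growth factor below are hypothetical and chosen only for illustration.

```python
def cases_after(seed_cases, growth_per_generation, generations):
    """Cases in the latest generation under simple, unchecked exponential growth."""
    return seed_cases * growth_per_generation ** generations


# Hypothetical: a near-by state seeded with 50 cases and a distant state with 5,
# with each case leading to 1.3 new cases per generation in both.
GROWTH = 1.3
for g in (0, 5, 10, 15):
    near = cases_after(50, GROWTH, g)
    far = cases_after(5, GROWTH, g)
    print(f"generation {g:2d}: near = {near:7.0f}, far = {far:6.0f}, gap = {near - far:7.0f}")
```

The ratio between the two states stays the same throughout, but the absolute gap keeps growing, so on a chart the end of the line for the heavily seeded states pulls away while the other end stays comparatively low.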

D.  Might Falling Temperatures Account for the Pattern?

The charts above are consistent with Sturgis acting as a seeding event that later then led to increases in Covid-19 cases that were especially high in near-by states.  But one needs to recognize that these are just correlations, and by themselves cannot prove that Sturgis was the cause.  There might be some alternative explanation.

One obvious alternative would be that the sharp increase in cases in the upper mid-west of the US in this period was due to falling temperatures, as the northern hemisphere winter approached.  These areas generally grow colder earlier than in other parts of the US.  And if one plots the state-wide average temperatures in October (as reported by NOAA) against the average number of Covid-19 cases per day in the week ending November 6 one indeed finds:

There is a clear downward trend:  States with lower average temperatures in October had more cases (per 100,000 of population) in the week ending November 6.  The relationship is not nearly as tight as that found for the one based on geographic distance from Sturgis (the R-squared is 35% here, versus 60% for the linear relationship based on distance), but 35% is still respectable for a cross-state regression such as this.

However, there are some counterexamples.  The average October temperatures in Maine and Vermont were colder than all but 7 or 10 states (for Maine and Vermont, respectively), yet their Covid-19 case rates were the two lowest in the country.

More telling, one can compare the rates in North and South Dakota (with the two highest Covid-19 rates in the country in the week ending November 6) plus Montana (adjacent and also high) with the rates seen in the Canadian provinces immediately to their north:

The rates are not even close.  The Canadian rates were all far below those in the US states to their south.  The rate in North Dakota was fully 30 times higher than the rate in Saskatchewan, the Canadian province just to its north.  There is clearly something more than just temperature involved.

E.  The Impact of Wearing Masks, and Its Interaction With Temperature

That something is the actions followed by the state or provincial populations to limit the spread of the virus.  The most important is the wearing of masks, which has proven to be highly effective in limiting the spread of this infectious disease, in particular when complemented with other socially responsible behaviors such as social distancing, avoiding large crowds (especially where many do not wear masks), washing hands, and so on.  Canadians have been far more serious in following such practices than many Americans.  The result has been far fewer cases of Covid-19 (as a share of the population) in Canada than in the US, and far fewer deaths.

Mask wearing matters, and could be an alternative explanation for why states closer to Sturgis saw higher rates of Covid-19 cases.  If a relatively low share of the populations in the states closer to Sturgis wear masks, then this may account for the higher incidence of Covid-19 cases in those near-by states.  That is, perhaps the states that are geographically closer to Sturgis just happen also to be states where a relatively low share of their populations wear masks, with this then possibly accounting for the higher incidence of cases in those states.

However, mask-wearing (or the lack of it), by itself, would be unlikely to fully account for the pattern seen here.  Two things should be noted.  First, while states that are geographically closer to Sturgis do indeed see a lower share of their population generally wearing masks when out in public, the relationship to this geography is not as strong as the other relationships we have examined:

The data in the chart for the share who wear masks by state come from the COVIDCast project at Carnegie Mellon University, and were discussed in the previous post on this blog.  The relationship found is indeed a positive one (states geographically further from Sturgis generally have a higher share of their populations wearing masks), but there is a good deal of dispersion in the figures and the R-squared is only 27.5%.  This, by itself, is unlikely to explain the Covid-19 rates across states in early November.

Second, and more importantly:  While the states closer to Sturgis generally have a lower share of mask-wearing, this would not explain why one did not see similarly higher rates of Covid-19 incidence in those states in August.  Mask-wearing was likely similar.  The question is why Covid-19 incidence rose in those states between August (following the Sturgis rally) and November, and not simply why it was high in those states in November.

However, mask-wearing may well have been a factor.  But rather than accounting for the pattern all by itself, it may have had an indirect effect.  With the onset of colder weather, more time would be spent with others indoors, and wearing a mask when in public is particularly important in such settings.  That is, it is the combination of both a low share of the population wearing masks and the onset of colder weather which is important, not just one or the other.

These are called interaction effects, and investigating them requires more than can be depicted in simple charts.  Multiple regression analysis (regression analysis with several variables – not just one as in the charts above) can allow for this.  Since it is a bit technical, I have relegated a more detailed discussion of these results to a Technical Annex at the conclusion of this post for those who are interested.

Briefly, a regression was estimated that includes miles from Sturgis, average October temperatures, the share who wear masks when out in public, plus an interaction effect between the share wearing masks and October temperatures, all as independent variables affecting the observed Covid-19 case rates of the week ending November 6.  And this regression works quite well.  The R-squared is 75.4%, and each of the variables (including the interaction term) is either highly significant (miles from Sturgis) or marginally so (a significance level of between 6 and 8% for the other variables, which is slightly worse than the 5% level commonly used, but not by much).

Note in particular that the interaction term matters, and matters even while each of the other variables (miles to Sturgis, October temperatures, and mask-wearing) is taken into account individually as well.  In the interaction term, it is not simply the October temperatures or the share wearing masks that matter, but the two acting together.  That is, relatively low temperatures in October will matter more in those states where mask-wearing is low than they would in states where mask-wearing is high.  If people generally wore masks when out in public (and followed also the other socially responsible behaviors that go along with it), the falling temperatures would not matter as much.  But when they don’t, the falling temperatures matter more.
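A small sketch of what the interaction term implies, using the two temperature-related slopes reported in the "All with Interaction" column of the Technical Annex table.  It assumes that the mask share enters the regression in percent (e.g. 80 for 80%), which the post does not state outright, and the three mask shares tried are hypothetical.

```python
# Slopes taken from the "All with Interaction" column of the Technical Annex table.
B_TEMP = -516.8        # slope on ln(average October temperature)
B_INTERACTION = 5.44   # slope on ln(temperature) * share wearing masks


def temperature_effect(mask_share_pct):
    """Change in predicted daily cases per unit rise in ln(temperature).

    Assumes the mask share is measured in percent (an assumption, see above).
    """
    return B_TEMP + B_INTERACTION * mask_share_pct


for share in (70, 80, 90):  # hypothetical mask-wearing shares, in percent
    print(f"mask share {share}%: effect of ln(temperature) = {temperature_effect(share):.1f}")
```

The effect of temperature is negative at each of these shares (colder means more cases), but it shrinks toward zero as the share wearing masks rises, which is the sense in which the two act together.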

From this overall regression equation, one can also use the coefficients found to estimate what the impact would be of small changes in each of the variables.  These are called elasticities, and based on the estimated equation (and computing the changes around the sample means for each of the variables):  a 1% reduction in the number of miles from Sturgis would lead to a 1.0% rise in the incidence of Covid-19 cases; a 1% reduction (not a 1 percentage point reduction, but rather a 1% reduction from the sample mean) in the share of the population wearing masks when out in public would lead to a 1.7% rise in the incidence of Covid-19 cases; and a 1% reduction in the average October temperature across the different states would lead to a 1.2% rise in the incidence of Covid-19 cases.  All of these elasticity estimates look quite plausible.
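The arithmetic behind such elasticities can be sketched as follows, using the slopes from the "All with Interaction" column of the Technical Annex table.  The sample means are not reported in this post, so the three means used below are placeholders picked to be in a plausible range, and the mask share is again assumed to be in percent.  The results land near the figures quoted above only because of those choices.

```python
import math

# Slopes from the "All with Interaction" column of the Technical Annex table.
B_MILES, B_TEMP, B_MASKS, B_INTERACTION = -36.6, -516.8, -22.4, 5.44

# Hypothetical sample means, for illustration only (the post does not report them).
MEAN_CASES = 36.6   # daily new cases per 100,000
MEAN_TEMP = 54.0    # average October temperature, degrees Fahrenheit
MEAN_MASKS = 87.0   # percent wearing masks when out in public


def elasticities(mean_cases, mean_temp, mean_masks):
    """Percent change in cases per 1% change in each variable, at the given means."""
    # Miles and temperature enter in logs, so the slope is already the change
    # in cases per unit change in ln(x); dividing by cases gives the elasticity.
    miles = B_MILES / mean_cases
    temp = (B_TEMP + B_INTERACTION * mean_masks) / mean_cases
    # The mask share enters in levels, so scale its slope by masks / cases.
    masks = (B_MASKS + B_INTERACTION * math.log(mean_temp)) * mean_masks / mean_cases
    return miles, temp, masks


e_miles, e_temp, e_masks = elasticities(MEAN_CASES, MEAN_TEMP, MEAN_MASKS)
print(f"miles: {e_miles:.2f}   temperature: {e_temp:.2f}   masks: {e_masks:.2f}")
```

A negative elasticity of, say, -1.0 means that a 1% reduction in the variable goes with a 1.0% rise in cases.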

These results are consistent with an explanation where the Sturgis rally acted as a significant superspreader event that led to increased seeding of the virus in the home locales of those attending, in near-by states especially.  This then led to significant increases in the incidence of Covid-19 cases in the different states as this infectious disease spread to friends and family and others in the subsequent months, and again especially in the states closest to Sturgis.  Those increases were highest in the states that grew colder earlier than others and where the share of the population wearing masks regularly was relatively low.  That is, the interaction of the two mattered.  But even with this effect controlled for, along with controlling also for the impact of colder temperatures and for the impact of mask-wearing, the impact of miles to Sturgis remained and was highly significant statistically.

F.  Conclusion

As noted above, the analysis here cannot and does not prove that the Sturgis rally acted as a superspreader event.  There was only one Sturgis rally this year, one cannot run repeated experiments of such a rally under various alternative conditions, and the evidence we have are simply correlations of various kinds.  It is possible that there may be some alternative explanation for why Covid-19 cases started to rise sharply in the weeks after the rally in the states closest to Sturgis.  It is also possible it is all just a coincidence.

But the evidence is consistent with what researchers have already found on how the virus that causes Covid-19 is spread.  Studies have found that as few as 10% of those infected may account for 80% of those subsequently infected with the virus.  And it is not just the biology of the disease and how a person reacts to it, but also whether the individual is then in situations with the right conditions to spread it on to others.  These might be as small as family gatherings, or as large as big rallies.  When large numbers of participants are involved, such events have been labeled superspreader events.

Among the most important of the conditions that matter is whether most or all of those attending are wearing masks.  It also matters how close people are to each other, whether they are cheering, shouting, or singing, and whether the event is indoors or outdoors.  And the likelihood that at least one infectious attendee is present rises rapidly with the number of attendees, so the size of the gathering very much matters.
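A back-of-the-envelope sketch of how the size of a gathering affects the chance that someone infectious is present.  It assumes each attendee is infectious independently with the same probability, and the prevalence used (1 in 200) is hypothetical.

```python
def prob_at_least_one_infectious(attendees, prevalence):
    """Chance that at least one attendee is infectious, assuming independence."""
    return 1.0 - (1.0 - prevalence) ** attendees


PREVALENCE = 0.005  # hypothetical: 1 in 200 people infectious at the time

for n in (10, 150, 1_000, 460_000):
    p = prob_at_least_one_infectious(n, PREVALENCE)
    print(f"{n:>7,} attendees: {p:.1%}")
```

Under these assumptions the chance is small for a family dinner, better than even for a gathering of 150, and a practical certainty for a rally the size of Sturgis.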

A number of recent White House events matched these conditions, and a significant number of attendees soon after tested positive for Covid-19.  In particular, about 150 attended the celebration on September 26 announcing that Amy Coney Barrett would be nominated to the Supreme Court to take the seat of the recently deceased Ruth Bader Ginsburg.  Few wore masks, and at least 18 attendees later tested positive for the virus.  And about 200 attended an election night gathering at the White House.  At least 6 of those attending later tested positive.  While one can never say for sure where someone may have contracted the virus, such clusters among those attending such events are very unlikely unless the event was where they got the virus.  It is also likely that these figures are undercounts, as White House staff have been told not to let it become publicly known if they come down with the virus.  Finally, as of November 13 at least 30 uniformed Secret Service officers, responsible for security at the White House, have tested positive for the coronavirus in the preceding few weeks.

There is also increasing evidence that the Trump campaign rallies of recent months led to subsequent increases in Covid-19 cases in the local areas where they were held.  These ranged from studies of individual rallies (such as 23 specific cases traced to three Trump rallies in Minnesota in September), to a relatively simple analysis that looked at the correlation between where Trump campaign rallies were held and subsequent increases in Covid-19 cases in that locale, to a rigorous academic study that examined the impact of 18 Trump campaign rallies on the local spread of Covid-19.  This academic study was prepared by four members of the Department of Economics at Stanford (including the current department chair, Professor B. Douglas Bernheim).  They concluded that the 18 Trump rallies led to an estimated extra 30,000 Covid-19 cases in the US, and 700 additional deaths.

One should expect that the Sturgis rally would act as even more of a superspreader event than those campaign rallies.  An estimated 460,000 motorcyclists attended the Sturgis rally, while the campaign rallies involved at most a few thousand at each.  Those at the Sturgis rally could also attend for up to ten days; the campaign rallies lasted only a few hours.  Finally, there would be a good deal of mixing of attendees at the multiple parties and other events at Sturgis.  At a campaign rally, in contrast, people would sit or stand at one location only, and hence only be exposed to those in their immediate vicinity.

The results are also consistent with a rigorous academic study of the more immediate impact of the Sturgis rally on the spread of Covid-19, by Professor Joseph Sabia of San Diego State University and three co-authors.  Using anonymous cell phone tracking data, they found that counties across the US that received the highest inflows of returning participants from the Sturgis rally saw, in the immediate weeks following the rally (up to September 2), an increase of 7.0 to 12.5% in the number of Covid-19 cases relative to the counties that did not contribute inflows.  But their study (issued as a working paper in September) looked only at the impact in the immediate few weeks following Sturgis.  They did not consider what such seeding might then have led to.  The results examined in the analysis here, which is longer-term (up to November 6), are consistent with their findings.

It is therefore fully plausible that the Sturgis rally acted as a superspreader event.  And the evidence examined in this post supports such a conclusion.  While one cannot prove this in a scientific sense, as noted above, the likelihood looks high.

Finally, as I finish writing this, the number of deaths in the US from this terrible virus has just surpassed 250,000.  The number of confirmed cases has reached 11.6 million, with this figure rising by 1 million in just the past week.  A tremendous surge is underway, far surpassing the initial wave in March and April (when the country was slow to discover how serious the spread was, due in part to the botched development in the US of testing for the virus), and far surpassing also the second, and larger, wave in June and July (when a number of states, in particular in the South and Southwest, re-opened too early and without adequate measures, such as mask mandates, to keep the disease under control).  Daily new Covid-19 cases are now close to 2 1/2 times what they were at their peak in July.

This map, published by the New York Times (and updated several times a day) shows how bad this has become.  It is also revealing that the worst parts of the country (the states with the highest number of cases per 100,000 of population) are precisely the states geographically closest to Sturgis.  There is certainly more behind this than just the Sturgis rally.  But it is highly likely the Sturgis rally was a significant contributor.  And it is extremely important if more cases are to be averted to understand and recognize the possible role of events such as the rally at Sturgis.

Average Daily Cases of Covid-19 per 100,000 Population

7-Day Average for Week Ending November 18, 2020

Source:  The New York Times, “Covid in the US:  Latest Map and Case Count”.  Image from November 19, with data as of 8:14 am.

 


Technical Annex:  Regression Results

As discussed in the text, a series of regressions were estimated to explore the relationship between the Sturgis rally and the incidence of Covid-19 cases (the 7-day average of confirmed new cases in the week ending November 6) across the states of the mainland US plus Washington, DC.  Five will be reported here, with regressions on the incidence of Covid-19 cases (as the dependent variable) as a function of various combinations of three independent variables: miles from Sturgis (in terms of their natural logarithms), the average state-wide temperature in October (also in terms of their natural logarithms), and the share of the population in the respective states who reported they always or most of the time wore masks when out in public.  Three of the five regressions are on each of the three independent variables individually, one on the three together, and one on the three together along with an interaction effect measured by multiplying the October temperature variable (in logs) with the share wearing masks.  The sources for each variable were discussed above in the main text.

The basic results, with each regression by column, are summarized in the following table:

Regressions on State Covid-19 Cases – November 6

     Miles to Sturgis and Temperatures are in natural logs

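| Variable | Statistic | Miles only | Temp only | Masks only | Miles, Temp, & Masks | All with Interaction |
|---|---|---|---|---|---|---|
| Miles to Sturgis | Slope | -54.9 | | | -41.9 | -36.6 |
| | t-statistic | -10.7 | | | -5.2 | -4.3 |
| Avg Temperature | Slope | | -133.3 | | -45.5 | -516.8 |
| | t-statistic | | -5.5 | | -2.0 | -1.9 |
| Share Wear Masks | Slope | | | -3.1 | -0.8 | -22.4 |
| | t-statistic | | | -3.9 | -1.3 | -1.8 |
| Interaction Temp & Masks | Slope | | | | | 5.44 |
| | t-statistic | | | | | 1.8 |
| Intercept | Estimate | 425.5 | 572.5 | 309.4 | 582.5 | 2,422.5 |
| | t-statistic | 11.9 | 6.0 | 4.5 | 7.1 | 2.3 |
| R-squared | | 71.0% | 39.4% | 24.2% | 73.7% | 75.4% |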

In the regressions with each independent variable taken individually, all the coefficients (slopes) found are highly significant.  The general rule of thumb is that a significance level of 5% is adequate to call the relationship statistically “significant” (i.e. that an estimated coefficient this far from zero would be unlikely to arise just due to random variation in the data).  A t-statistic of 2.0 or higher in absolute value, in a large sample, would signal significance at least at the 5% level (that is, if the true coefficient were zero, an estimate this far from zero would arise by chance less than 5% of the time), and the t-statistics are each well in excess of 2.0 in absolute value in each of the single-variable regressions.  The R-squared is quite high, at 71.0%, for the regression on miles from Sturgis, but more modest in the other two (39.4% and 24.2% for October temperature and mask-wearing, respectively).

The estimated coefficients (slopes) are also all negative.  That is, the incidence of Covid-19 goes down with additional miles from Sturgis, with higher October temperatures, and with higher mask-wearing.  The actual coefficients themselves should not be compared to each other for their relative magnitudes.  Their size will depend on the units used for the individual measures (e.g. miles for distance, rather than feet or kilometers; or temperature measured on the Fahrenheit scale rather than Centigrade; or shares expressed as, say, 80 for 80% instead of 0.80).  The units chosen will not matter for the substance of the results.  Rather, what is of interest is how the predicted incidence of Covid-19 changes when there is, say, a 1% change in any of the independent variables.  These are elasticities and will be discussed below.

In the fourth regression equation (the fourth column), where the three independent variables are all included, the statistical significance of the mask-wearing variable drops, with a t-statistic of just 1.3 in absolute value.  The t-statistic of the temperature variable also falls, to 2.0 in absolute value, which is at the borderline for the general rule of thumb of a 5% significance level.  The miles from Sturgis variable remains highly significant (its t-statistic also fell, but remains extremely high).  If one stopped here, it would appear that what matters is distance from Sturgis (consistent with Sturgis acting as a seeding event), coupled with October temperatures falling (so that the thus seeded virus spread fastest where temperatures had fallen the most).

But as was discussed above in the main text, there is good reason to view the temperature variable acting not solely by itself, but in an interaction with whether masks are generally worn or not.  This is tested in the fifth regression, where the three individual variables are included along with an interaction term between temperatures and mask-wearing.  The temperature, mask-wearing, and interaction variables now all have a similar level of significance, although each falls just short of the 5% threshold (with significance levels of 6% to 8% for each).  While not quite 5%, keep in mind that the 5% is just a rule of thumb.  Note also that the positive sign on the interaction term (the 5.44) is an indication of curvature.  The positive sign, coupled with the negative signs for the temperature and mask-wearing variables taken alone, indicates that the curves are concave facing upwards (the effects of temperature and mask-wearing diminish at the margin at higher values for the variables).  Finally, the miles to Sturgis variable remains highly significant.
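The link between t-statistics of 1.8 to 1.9 and significance levels of roughly 6% to 8% can be checked with a short calculation.  The sketch below is only illustrative: it assumes 49 observations (the 48 mainland states plus Washington, DC) and 5 estimated parameters, hence 44 degrees of freedom, which is an inference from the description of the sample rather than a figure stated in the analysis.  It uses only the Python standard library, integrating the Student's t density numerically.

```python
import math

def t_pdf(x, df):
    """Density of Student's t distribution with df degrees of freedom."""
    norm = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return norm * (1 + x * x / df) ** (-(df + 1) / 2)

def two_sided_p(t_stat, df, steps=2000):
    """Two-sided p-value, integrating the density from 0 to |t| by Simpson's rule."""
    t_abs = abs(t_stat)
    h = t_abs / steps  # steps must be even for Simpson's rule
    total = t_pdf(0.0, df) + t_pdf(t_abs, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_area = 2 * total * h / 3  # probability of landing within +/- |t|
    return 1 - central_area

# Assumption: 49 observations and 5 estimated parameters -> 44 degrees of freedom.
DF = 49 - 5
for t in (1.8, 1.9, 2.0):
    print(f"t = {t}: two-sided p = {two_sided_p(t, DF):.3f}")
```

Under that assumption the p-values for 1.8 and 1.9 come out at roughly 8% and 6%, and a t-statistic of 2.0 lands very close to the 5% threshold.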

Based on this fifth regression equation, with the interaction term allowed for, what would be the estimated response of Covid-19 cases to changes in any of the independent variables (miles to Sturgis, October temperatures, and mask-wearing)?  These are normally presented as elasticities, with the predicted percentage change in Covid-19 cases when one assumes a small (1%) change in any of the independent variables.  In a mixed equation such as this, where some terms are linear and some logarithmic (plus an interaction term), the resulting percentage change can vary depending on the starting point chosen.  The conventional starting point taken is normally the sample means, and that will be done here.

Also, I have expressed the elasticities here in terms of a 1% decrease in each of the independent variables (since our interest is in what might lead to higher rates of Covid-19 incidence):

Elasticities from Full Equation with Interaction Term

      Percent Increase in Number of Covid-19 Cases from a 1% Decrease Around Sample Means

| Variable | Elasticity |
|---|---|
| Miles to Sturgis | 1.02% |
| October Temperature | 1.16% |
| Share Wearing Masks | 1.69% |

All these estimated elasticities are quite plausible.  If one is 1% closer in geographic distance to Sturgis (starting at the sample mean, and with the other two variables of October temperature and mask-wearing also at their respective sample means), the incidence of Covid-19 cases (per 100,000 of population) as of the week ending November 6 would increase by an estimated 1.02%.  A 1% lower October temperature (from the sample mean) would lead to an estimated 1.16% increase in Covid-19 cases.  And the impact of the share wearing masks is important and stronger, where a 1% reduction in the share wearing masks would lead to an estimated 1.69% increase in cases, with all the other factors here taken into account and controlled for.
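The mechanics of such an elasticity calculation can be sketched as follows.  The coefficients are those reported for the regression with the interaction term, but the starting point used below is a hypothetical placeholder and not the actual sample means, which are not reported here.  The function names are also introduced only for illustration.  Because the published coefficients are rounded and the prediction is a small difference between large terms, the output should not be expected to reproduce the elasticities reported above.

```python
import math

# Coefficients from the regression with the interaction term (rounded as reported).
B0, B_MILES, B_TEMP, B_MASKS, B_INTER = 2422.5, -36.6, -516.8, -22.4, 5.44

def predicted_cases(miles, temp, masks):
    """Predicted cases per 100,000; masks is a share in percent (e.g. 87 for 87%)."""
    ln_t = math.log(temp)
    return (B0 + B_MILES * math.log(miles) + B_TEMP * ln_t
            + B_MASKS * masks + B_INTER * ln_t * masks)

def pct_rise_from_1pct_cut(start, name):
    """Percent rise in predicted cases when one variable is cut by 1%."""
    base = predicted_cases(**start)
    shocked = dict(start, **{name: start[name] * 0.99})
    return 100 * (predicted_cases(**shocked) - base) / base

# Hypothetical starting point, NOT the actual sample means from the analysis.
start = {"miles": 1000.0, "temp": 54.0, "masks": 87.0}
for name in start:
    print(name, round(pct_rise_from_1pct_cut(start, name), 2))
```

Even with placeholder inputs, the signs behave as described: a 1% reduction in distance, in October temperature, or in the share wearing masks each raises the predicted incidence.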

These results are consistent with a conclusion that the Sturgis rally led to a significant seeding of cases, especially in near-by states, with the number of infections then growing over time as the disease spread.  The cases grew faster in those states where mask-wearing was relatively low, and in states with lower temperatures in October (leading people to spend more time indoors).  When the falling temperatures were coupled with a lower share (than elsewhere) of the population wearing masks, the rate of Covid-19 cases rose especially fast.

More Evidence on the Effectiveness of Masks in Limiting the Spread of Covid-19

A.  Introduction

States where a high share of the population normally wear face masks when out in public also have a significantly lower transmission of the virus that causes Covid-19.  The chart above shows the relationship between the wearing of face masks and the prevalence of Covid-19 in the community (measured in ways that will be discussed below).  It is remarkable how tight that relationship is, as well as how steep.  Wearing masks has a large effect.  States differ between each other in dozens of different ways that can significantly affect the transmission of Covid-19.  Yet the share of the population who report that they wear face masks most or all of the time when they go out in public can explain by itself most of the variation in the prevalence of Covid-19 across the states.

The data also show a remarkably strong consistency between the share of the population in a state that wear masks and whether that state voted for Clinton or Trump in 2016.  That there is such a relationship is not surprising.  But what is surprising is that the relationship is close to perfect.  All but one of the states that voted for Clinton in 2016 report a mask-wearing share of 88% or above.  The one exception is Colorado, with a share of 87.4%.  And every single Trump-voting state has a reported share that is below 88%.  Furthermore, several of the states where the vote margin was close (and where current polling indicates Biden would receive the most votes) are on the borderline.  Such states include Pennsylvania, Michigan, and Wisconsin, each with a share between 87 and 88%.

This post will explain where this data comes from, the statistical significance of the relationships, and how one can appropriately interpret the results – for the chart above and two more below.  And I should note that the idea for a chart similar to that above, using this data set, came from an article by the Washington Post reporter Christopher Ingraham that appeared on October 23 at the Washington Post website.  The analysis here extends what Ingraham had.

B.  A Higher Share of People Wearing Masks is Associated With A Lower Incidence of Covid-19 in the Community

The chart at the top of this post shows a remarkably tight relationship between the share of the population who say they normally or always wear a mask when out in public, and the prevalence of Covid-19 in those states (or more precisely, the share of the population who are personally aware of someone in the local community with Covid-19 like symptoms – this will be discussed below).  With a higher share wearing masks, the prevalence is lower.  There are qualifiers that need to be considered on the source of the data and how one should interpret the apparent relationship, but that there is such an association is clear.

The data underlying the analysis comes from a new set assembled as part of the COVIDcast project at Carnegie Mellon University.  With the onset of the Covid-19 crisis, this group at Carnegie Mellon designed a simple survey that participants could sign on to via Facebook, to provide data on the spread of Covid-19.  While the questionnaire has evolved over time, the most recent version (that they call Wave 4) was launched on September 8, and includes questions on mask usage.  What makes the survey particularly interesting is that they receive a huge number of responses daily (averaging over 40,000 per day from September 8 to October 7).  This allows for a statistically significant sample at not just the state level (which I focus on here), but also for most counties in the US.

There are, of course, potential biases in such a sample that must be corrected for.  Those using Facebook, and in particular those willing to participate in such a survey seen via Facebook, will not necessarily be representative of the population.  But the Carnegie Mellon analysts use various methods, including adjusting for the demographic characteristics of the respondents, to correct for this.  It cannot be perfect, but is likely to be reasonable.

One should also recognize that the behavior respondents record and what they actually do (such as on mask usage) may differ.  Respondents may exaggerate the consistency with which they in fact use masks.  But the Carnegie Mellon researchers have compared their results with those found from other sources, and have concluded they are consistent.  Furthermore, if there is a bias, one might expect that bias to be similar across states.  Perhaps all the responses (on, say, mask usage) are biased upwards – we may all say that we use masks more frequently than we in fact do.  But if that bias is similar (on average) across all of us, then the variation across states would remain.  They would just all be shifted upwards.  Still, one should remain cognizant that the findings are based on self-reported responses, and may be biased.

The Wave 4 questionnaire had questions on a variety of topics.  The specific question on mask usage was whether, in the past five days, the respondent had worn a mask when in public:  all of the time, most of the time, some of the time, a little of the time, or none of the time.  A mask wearer was classified as one who said that they wore a mask all or most of the time.

For whether the respondent might have Covid-19, the questionnaire asked whether they or someone in their immediate household suffer from Covid-like symptoms – specifically, whether they have a fever of 100℉ or more plus at least one of several additional possible conditions (sore throat, cough, shortness of breath, or difficulty breathing).  Thus, while they also ask later whether the person has had a formal test for Covid-19 (they may or may not have), the response reported here is for whether they have Covid-like symptoms.  Similarly, the figure for the share reporting possible cases of Covid-19 in the community (as in the chart at the top of this post), is based on whether the respondent was aware of others in their local community – who they know personally – who are suffering from Covid-19 like symptoms (with the conditions as defined for the individual).

The survey was designed this way in part because one purpose was to see whether such self-reported conditions could help local health authorities determine whether Covid-19 might be spreading in their communities, and to know this even before testing might find it.  And the results were encouraging.  The Carnegie Mellon researchers found that the daily and highly localized monitoring that was possible with the extremely large sample size of their survey generally performed well in tracking what was later found, via confirmed tests, on the spread of Covid-19 in that locality.

The resulting relationship between the respondents reporting that they wore masks when out in public all or most of the time (in the past five days), and the share reporting that they were personally aware of people in their community exhibiting Covid-19 like symptoms, is what is plotted (in terms of state averages) in the chart above.  To smooth out possible day to day statistical noise in the data (and also to be consistent with 7-day averages for reported confirmed cases of Covid-19, to be discussed below), the data shown in the chart is for the 7-day average covering October 15 to October 21 (the most recent days available when I downloaded this).

The straight line in black in the chart is the ordinary least squares regression line – the line that best fits the scatter of observations.  And from this one can calculate the statistical measure commonly referred to as the R-squared, which can vary between 0 and 1 (or 0% to 100%).  The R-squared indicates what share of the variation in the scatter of observations would be predicted by simply knowing where this straight regression line passes.  If the scatter points are all close to that line, the R-squared will be high.  In the limit, if they all lie precisely at that line, the R-squared will equal 1.  At the other extreme, if the scatter is all over and basically random, then the R-squared will be close to 0.
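For readers who want to see how the regression line and R-squared are obtained, the following is a minimal sketch using only the Python standard library.  The function name and the numbers in the usage example are made up for illustration; they are not the survey data.

```python
def ols_r_squared(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares; return slope, intercept, R-squared."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Variation left unexplained by the line, relative to the total variation.
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return slope, intercept, 1 - ss_res / ss_tot

# Toy data only: points lying exactly on a downward line give an R-squared of 1.
mask_share = [70, 80, 90, 95]
on_the_line = [30, 20, 10, 5]
scattered = [30, 18, 12, 4]
print(ols_r_squared(mask_share, on_the_line))
print(ols_r_squared(mask_share, scattered))
```

The first call illustrates the limiting case described above, where all points lie on the line; the second shows how scatter around the line pulls the R-squared below 1.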

R-squared values are normally low for what are termed cross-section analyses (such as this, i.e. across the different states).  There are numerous reasons states differ from each other, and just knowing one factor (in this case the share who wear masks) will normally produce only a loose correlation with the result of interest (in this case the share reporting they are personally aware of people with Covid-19 like symptoms in the community).  Economists and other analysts would normally be happy to find a R-squared of 20% or so in such cross-state analyses, and elated if it is 30%.

In the chart here, the R-squared was 66%.  This is remarkable.  It indicates that if all one knows is the share of those wearing masks, we could predict 66% of the variation in the share reporting that they are aware of Covid-19 like symptoms in the community.  Despite the many reasons why states may differ in their incidence of Covid-19, this one factor (the share of those wearing masks) will by itself predict two-thirds of the variation.  Furthermore, one state (Wyoming) is an outlier.  If one runs the regression over the full sample but with this single case removed, the R-squared rises to an astonishing 76%.

There are further reasons to be surprised that such a strong statistical relationship comes through.  One is that the data come from a survey.  Poor (possibly misunderstood) responses, or lack of knowledge on whether others in the community are suffering from Covid-19 like symptoms (due, perhaps, to not knowing many in the community, or not being in touch with them) will normally add statistical noise.   But it appears that the extremely large sample sizes here have offset that.  We still see a clear and strong relationship.

One should also recognize that states in the US are not isolated from each other.  There is a substantial amount of travel from one to the other.  Thus even if mask-wearing is common in one state, with infection rates then low, there may be a continual “re-seeding” of the infection brought in by travelers from states that are not as conscientious in wearing masks.  This would weaken the relationship between local mask-wearing and local infection rates.  Yet despite this, we still see a strong and highly significant effect.

One must also always note that what is being examined is a correlation between two variables, and that correlation does not necessarily indicate causation.  One must examine whether it may in each individual analysis.  In the case here, however, one can readily see a mechanism where a higher share of the population wearing masks will lead to a lower share of the population in the community being infected with the virus that causes Covid-19.  But what would be the mechanism where a higher incidence of Covid-19 in the community would affect the share wearing masks?  There might well be such a causal relationship, but one would then expect it to act in precisely the opposite way to the relationship found in the data:  When a high share of the local community is infected with Covid-19, one would expect a high share of the population then to wear masks.  It would be rational to be extra careful.  But the relationship seen in the data is the opposite:  The data show that a high share of the community being infected is associated with a low share of the population wearing masks.  The line slopes downwards.  It is reasonable to conclude that the causation goes from the wearing of masks to the share infected, not the reverse.

There is, however, a factor in the statistical analysis which may well be quite important.  The data here show a high degree of correlation (negative correlation, as the line slopes downwards) between the wearing of masks and the incidence of Covid-19 in the locality.  But the data on the wearing of masks may itself be, and indeed likely will be, highly correlated with other actions that may be taken to limit the spread of Covid-19.  Responsible individuals who wear masks likely also are careful to practice social distancing, to wear gloves when shopping, to avoid crowded bars and nightclubs, and to avoid crowded events where many of the attendees do not wear masks (such as Trump rallies).  Thus it may not simply be the wearing of masks that explains why a high share of the local population wearing masks in an area is correlated with a more limited spread of Covid-19:  It may well be the whole set of socially responsible behaviors that matter.

This is true and should be recognized.  While the direct measure here is the share of the population that mostly or always wear masks, such behavior likely goes together with a full set of socially responsible behaviors that together lead to a lower spread of Covid-19.  While we will often refer to the wearing of masks as the factor that is associated with a limited spread of Covid-19, we should recognize that the wearing of masks likely goes together with a broader set of behaviors that together are important.

C.  A Higher Share of People Wearing Masks is Associated With A Lower Incidence of Self-Reported Cases of Covid-19, and a Lower Official Count of Confirmed Cases of Covid-19 

Two other charts are of interest.  The first examines the association between the share reporting they mostly or always wear masks, and whether they (or someone in their household) is exhibiting the symptoms of Covid-19:

One again sees a strong (negative) association between the wearing of masks and cases of those with symptoms consistent with Covid-19 (in this case of the survey respondents themselves).  And the R-squared measures of the degree of correlation are even higher:  70% for the full sample, and 78% if the single case of Wyoming is removed.  This again suggests that the wearing of masks (along with other responsible behaviors such as social distancing, etc.) is associated with a more limited spread of Covid-19.  Furthermore, the impact is not simply statistically significant, but also large.  Based just on the values on the regression line, a state with a reported 69% who wear masks (such as South Dakota) compared to a state (or locale) with a reported 97% who wear masks (such as Washington, DC) would be expected to have more than 6.1 times the share of cases.  (The actual South Dakota vs. DC ratio is even higher, at over 7, as South Dakota is above the regression line and DC a bit below).

The findings are also consistent with the official counts of new confirmed cases of Covid-19 per 100,000 of population:

The data on the official counts were downloaded from the COVIDcast site, but they in turn were obtained from compilations at USAFacts.  And USAFacts obtained the figures from state public health agencies.

The relationship between those reporting that they wear masks most or all of the time, and the number of confirmed new cases by state (per 100,000 of population, and a seven-day average covering the October 15 to October 21 week), remains significant, negative, and strong.  The states where a higher share of the population routinely wear masks (as reported in the surveys) see a significantly lower incidence of confirmed new cases of Covid-19.  The statistical relationship is not as strong as before (the R-squared is 47%), but this is not surprising.  The average number of daily new confirmed cases over the 7-day period (October 15 to 21) counts only those with a test result, for a new case, reported over those seven days.  The number of people who are sick with Covid-19 will include not just those newly-tested individuals, but also others who have been sick for some time plus individuals with Covid-19 like symptoms who may have the disease but have not (or not yet) been tested.  It is not surprising that the correlation of mask-wearing with just a slice of the population who are sick with Covid-19 will be weaker.  But the R-squared of 47% is still quite high.

D.  Conclusion:  The Effectiveness of Wearing Masks

Masks work by reducing the transmission of an infectious disease to and from others.  They are not perfect.  But neither do they need to be perfect, as one can see from the simple arithmetic of the spread of an infectious disease.

Infectious diseases such as Covid-19 are caused by viruses, which cannot survive on their own but can only survive by spreading from person to person.  Any individual will have a disease such as Covid-19 for a finite period of time (a few weeks, normally, in the case of Covid-19) beyond which they would either have recovered or (in a small percentage of the cases) have died.  And they will normally only be able to infect others for about a week (starting one week after they themselves had become infected), although possibly for up to two weeks.

Any such infectious disease will therefore spread when, on average, each individual with the disease spreads the disease on to more than one other person.  And given the arithmetic of compounding, that number can grow to be very large very quickly.  If each individual on average infects 2 other individuals in each cycle, then after just 10 cycles the one individual with the disease would have led to the infections of over 1,000.  It doubles in each cycle.  If each cycle is, on average, a week and a half (one week for the virus to multiply in the individual, and then one week during which the person can be infectious, so on average will infect others at the mid-point of the second week), those 10 cycles will require only 15 weeks.
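The doubling arithmetic in this paragraph is easy to verify.  The short sketch below simply restates the example in code; the function and constant names are introduced here for illustration.

```python
def new_infections(cycles, per_person, start=1):
    """New infections in the final cycle if each case infects per_person others per cycle."""
    return start * per_person ** cycles

CYCLE_WEEKS = 1.5  # the average cycle length used in the example above

print(new_infections(10, 2))  # 1024, i.e. over 1,000 after 10 doublings
print(10 * CYCLE_WEEKS)       # 15.0 weeks
```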

But if the wearing of masks (along with other socially responsible behaviors, such as social distancing) reduces the average number of people that an individual with the disease will infect to less than one, then the disease will die out.  And again, with the arithmetic of compounding, this can be quite quick.  Suppose one starts out with 100 individuals with the disease in some locality.  If, on average, each infected individual spreads the disease to another person only half the time, then 100 individuals will spread it to 50 during the first cycle, to 25 in the next, and so on.  One can calculate that if this continues at such a rate, then less than one new person would become infected after just 7 cycles (or 10 1/2 weeks if each cycle is on average a week and a half).  And the disease would have been stopped.

Masks work because they can bring down that reproduction rate (what epidemiologists call Rt) from something above 1.0 to something below.  The example here is that masks (along with other socially responsible behaviors) reduced the Rt to 0.5.  This would be a 75% reduction if the Rt is 2.0 when nothing is done to stop the spread of the disease.  That is not perfect, but it does not need to be perfect to stop the spread.  And 70 to 80% is a reasonable estimate of how effective masks are.  If the US were to reduce the Rt to 0.5 going forward, then the daily number of new cases (currently, as I write this, about 80,000 each day) would fall to less than 100 in just 10 cycles (15 weeks).
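The same compounding works in reverse once Rt is below 1.0.  The sketch below reproduces the figures used in the examples above (100 cases falling below one new case, 80,000 daily cases after 10 cycles, and the 75% reduction from an Rt of 2.0 to 0.5); the function name is an illustrative choice.

```python
def cycles_until_below(start, rt, threshold=1.0):
    """Count cycles until new infections per cycle fall below threshold (requires 0 < rt < 1)."""
    if not 0 < rt < 1:
        raise ValueError("the count only terminates when rt is between 0 and 1")
    cycles, current = 0, float(start)
    while current >= threshold:
        current *= rt
        cycles += 1
    return cycles

print(cycles_until_below(100, 0.5))   # 7 cycles, or 10.5 weeks at 1.5 weeks per cycle
print(80_000 * 0.5 ** 10)             # 78.125 daily cases after 10 cycles
print((2.0 - 0.5) / 2.0)              # 0.75, the reduction from an Rt of 2.0 to 0.5
```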

This is of course just arithmetic, but the power of compounding is extremely important to recognize when addressing how to bring an infectious disease under control.  Masks do not need to be 100% effective – they merely need to bring the Rt down to less than 1.0.  And in this they are similar to vaccines.  No vaccine is 100% effective.  For the virus that causes Covid-19, the FDA has issued guidelines stating that a vaccine that is safe and has a minimum effectiveness of just 50% would be approved.  It is hoped that the vaccines currently being tested will have a greater degree of effectiveness, but the expectation is that they might at most be perhaps 80% effective, and probably 70% or less is more likely.

That does not mean such vaccines would not be valuable.  As just noted, a vaccine that brought the Rt down to 0.5 would lead to the disease dying out in a relatively short time.  But as Dr. Robert Redfield, the head of the CDC, noted in testimony before Congress on September 16, the effectiveness of masks is similar if not greater than what is expected for a vaccine.  In that testimony he stated, as he has in other fora in recent months (see here and here, for example), that if Americans wore these simple masks, that in “six, eight, 10, 12 weeks we’d bring this pandemic under control.”  And further in that testimony: “I might even go so far as to say this face mask is more guaranteed to protect me against COVID than when I take a COVID vaccine, because the immunogenicity might be 70%, and if I don’t get an immune response the vaccine’s not going to protect me. This face mask will.”

But there is an important proviso.  These effectiveness percentages, whether for masks or for vaccines, reflect how likely they will protect an individual who is exposed to the virus.  But their effectiveness in reducing Rt will then depend on what share of the population wears a mask or is vaccinated.  Usage of masks or vaccinations will never cover 100% of the population, and the reduction in Rt will then be less.  If not enough people follow responsible social behaviors – most importantly wearing masks – or choose not to be vaccinated once a vaccine becomes available, the virus will continue to spread.
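One way to make this proviso concrete is a deliberately simple model in which the reduction in Rt is proportional to both the effectiveness of the measure and the share of the population using it.  That proportional form is an assumption introduced here for illustration, not something derived in the discussion above, and real transmission dynamics are more complicated.  The function names are likewise hypothetical.

```python
def effective_rt(rt_unmitigated, effectiveness, coverage):
    """Assumed simple model: Rt falls in proportion to effectiveness times coverage."""
    return rt_unmitigated * (1 - effectiveness * coverage)

def coverage_needed(rt_unmitigated, effectiveness):
    """Share of the population that must be covered to bring Rt down to exactly 1.0."""
    return (1 - 1 / rt_unmitigated) / effectiveness

# Using the example figures from the text: Rt of 2.0 and a 75% effective measure.
print(effective_rt(2.0, 0.75, 1.0))   # full coverage reproduces the Rt of 0.5
print(effective_rt(2.0, 0.75, 0.5))   # at half coverage Rt stays above 1.0
print(coverage_needed(2.0, 0.75))     # about two-thirds coverage is the break-even point
```

Under this assumed model, a measure that is 75% effective would bring an Rt of 2.0 below 1.0 only if roughly two-thirds or more of the population adopted it, which is the sense in which participation matters as much as effectiveness.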

Political leadership is therefore critical, but Trump has been unwilling.  Despite the uniform advice of medical professionals in the field, Trump has been unwilling to call on all Americans, and in particular all of his supporters, to wear masks.  He rarely wears masks himself, makes a big show of pulling it off when he has had to wear one (such as when he returned to the White House from Walter Reed Hospital, where he had been treated for Covid-19), and continues to organize large political rallies where few wear masks (but with participants required to sign legal waivers saying that should they become infected as a result, they cannot sue the Trump campaign).  And Trump continues to mock Joe Biden and others who are conscientious in wearing masks when in public.

Why?  Wearing a mask makes it obvious that an infectious disease is circulating.  It makes it obvious that Trump and his administration have failed to bring this terrible disease under control.  Trump continues to assert instead, as he has from the start as well as more recently (during, for example, the second, October 22, debate with Joe Biden), that all is under control and that while there have been “spikes” they are all either “gone” or “will soon be gone”.  From the start in January, Trump has repeatedly asserted that it was “totally under control”, that “It’s going to be just fine”, that it was just a hoax (indeed, a “new hoax” of the Democrats), and that it would soon (Trump asserted in February) just disappear (“like a miracle”).  And Trump’s repeated assertion that “it’s going away” is well-documented in this Washington Post video compilation.

But cases are in fact rising as I write this, and rising rapidly.  Confirmed cases hit over 83,000 on October 23 and then over 83,000 again on October 24 – they had never before exceeded 77,300 in a single day in the US.  Hospitalizations are rising as well, and the surge in hospitalizations is starting again to overwhelm hospitals in parts of the country.  It is absurd to say, as Trump repeatedly insists, that cases are rising only because more testing is being done.  (As one wag put it:  “I stopped gaining weight as soon as I stopped weighing myself.”)

The number of dead in the US from this disease now exceeds (as I write this) 228,000.  That exceeds the number of soldiers who died in battle in the US Civil War (Union plus Confederate together) of 214,938.  It is 70% greater than the 134,575 Americans who died in battle in World War I plus the Korean War plus the Vietnam War, combined.  This has been the worst public health crisis in the US in more than a century.  Yet Trump claims he has been a great success.

The widespread wearing of masks would be an obvious signal of Trump’s failure.  It is understandable (but not defensible) that he would want to hide such overt signs of his failure before the upcoming election.  But to put short-term politics above public health concerns is deplorable.

Death Rates due to Covid-19: An International Comparison

A.  Introduction

In an interview in early August, when over 1,000 Americans were dying each day due to Covid-19, President Trump was asked how he could consider the disease to be then under control.  He responded “They are dying, that’s true”, and then went on to say “it is what it is.  But that doesn’t mean we aren’t doing everything we can.  It’s under control as much as you can control it.”

If it were true that the disease was “under control as much as you can control it”, then deaths in the US would be similar (as a share of population) to what they are in other countries around the world.  It is the same disease everywhere.  And it would especially be true now, more than nine months into this pandemic.  While much was still not known in the early months on how best to bring this terrible disease under control, we now know what has worked in other countries plus we have results from numerous scientific studies.

In particular, it has become clear that a highly effective measure to contain the virus is also the simplest:  Everyone should just wear a mask when out in public.  The experience of East Asian countries, which will be examined below and where mask-wearing was common even before Covid-19, is consistent with this.  There are also now scientific studies backing this up, as discussed in an editorial published on July 14 in JAMA – the Journal of the American Medical Association.  Dr. Robert Redfield, the head of the CDC, was a co-author of that editorial, and in interviews and press conferences since he has made clear that if everyone simply wore a mask when in public, the disease would be brought under control in as little as four to eight weeks.

Dr. Redfield said the same in testimony to Congress on September 16 (although with a more cautious time scale, allowing between 6 and 12 weeks for the pandemic to be brought under control).  Indeed, Dr. Redfield noted in that testimony that wearing of masks could be more effective than even a vaccine, as any vaccine that is developed will likely have an effectiveness of 70% or less.  A mask, if worn, can do better.

But getting most of the population to wear a mask requires political leadership, and that has been sorely lacking under President Trump.  Indeed, under Trump the wearing of masks has been turned into an issue of political identity, and he has even mocked Joe Biden and Democrats generally for wearing them.  Trump also asserted, on the same day as Dr. Redfield’s congressional testimony, that the doctor was wrong in his medical advice on masks.

The sad result is that death rates from Covid-19 in the US are now not simply higher than in many other countries around the world, but higher by a large multiple.  There is no basis for asserting that this disease is “under control as much as you can control it”.

We will examine here what other countries have been able to achieve in comparison to what the US has, basically through a series of charts.  A word on the data:  The figures were all calculated from the reported deaths by country from Covid-19 downloaded from the site maintained by the Center for Systems Science and Engineering at Johns Hopkins University.  The data were downloaded on the afternoon of September 15, with the country data current through September 14.

B.  US Compared to Canada and Europe

The chart at the top of this post shows the number of deaths from Covid-19 per day per million of population (based on a rolling seven-day average ending on the date shown), from January 29 through to September 14, in the US, Canada, and Western and Eastern Europe (with Eastern Europe covering the Baltics through to Albania).
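For readers who wish to reproduce the measure used in these charts, the calculation can be sketched as follows.  This is only a minimal illustration in Python, not the code used to prepare the charts.  The function name and the sample figures are hypothetical, and it assumes the input is a list of cumulative death totals, one per day, as in the Johns Hopkins data.

```python
def daily_rate_per_million(cumulative, population, window=7):
    """Rolling average of daily deaths per million, from cumulative totals.

    Each value is the average over the `window` days ending on that date.
    """
    # Daily deaths are the day-to-day change in the cumulative total.
    daily = [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]
    rates = []
    for end in range(window, len(daily) + 1):
        average = sum(daily[end - window:end]) / window
        rates.append(average / population * 1_000_000)
    return rates


# Hypothetical cumulative totals for a country of 10 million people.
cumulative = [100, 110, 125, 135, 150, 160, 175, 185, 200]
rates = daily_rate_per_million(cumulative, population=10_000_000)
for rate in rates:
    print(f"{rate:.2f} deaths per day per million")
```

Expressing the figures per million of population is what allows a small country and a large one to be placed on the same chart.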

Starting with the US, deaths rose rapidly in late March and early April, peaked in mid-April, and then fell.  This continued until early July.  But then, as a number of states rushed to re-open their economies in May and especially June (with the strong encouragement of Trump), death rates rose again, doubling from their not-so-low early-July lows.  They then came down modestly in August and the first half of September, but remain far higher than elsewhere.

The profiles in Europe and Canada are different in an important way.  While death rates rose early in Western Europe (and to rates higher than what came later for the US), when much was still not known about the virus and how it was spread, they were then brought down to very low rates – well below those of the US.  And they have remained low (at least so far).  This is in contrast to the US, where death rates rose in July as lessons on how to manage the virus were ignored.

Canada followed a similar profile to that of Western Europe, although with an initial peak that came later (and with a substantially lower peak – only half that of Western Europe), with then a decline to low levels that have remained low.  In Eastern Europe, early rates in the spring never rose that high, but then still came down by June.  Since then they have risen some, but to rates that remain well below those of the US (at less than a third of the US rate, as of mid-September).

Breaking this down for some of the major countries of Western Europe:

Rates peaked early and at high levels in Italy, France, and the UK, but then all came down and remained down.  The peak in Germany came at roughly the same time as that of the US (but at well less than half the US rate), and then came down to an extremely low level.  As of mid-September, the death rate in Germany is only 2% of the US rate.  If it’s “under control as much as you can control it” in the US, as Trump asserted, why is it that the death rate, per million of population, can be 98% less in Germany?

There are two special cases in Western Europe that are worth examining – Spain and Sweden:

Rates rose rapidly and to quite high levels in Spain early in the crisis.  Its hospital system was overwhelmed and many died.  But then Spain brought down the rates to very low levels by June and July.  They have, however, trended up since mid-August, as it appears Spain opened up its seasonal tourism industry too rapidly (tourism as a share of GDP is far higher in Spain than in any other OECD member country).  But even with the recent increase, the number of deaths per million in Spain remains less than half (45%) of what the rate is in the US as of mid-September.

(One might also note the negative numbers recorded for the number of deaths in Spain due to Covid-19 for a period in late May, as well as an odd spike up in late June.  The reason for this is that Spain revised its counts of the number who had died from Covid-19 as they later reviewed what had been submitted during the peak of the crisis.  A focus on the statistics was not the highest priority earlier – saving lives was.  It is of course impossible for there to be a negative number of deaths.  But figures are recorded each day for the cumulative number of deaths due to Covid-19, and when that total was revised down on May 25, the daily change in the total (which is the basis for the daily death count) will be negative (and will be negative for a week, as the numbers are seven-day averages).  And a later upward revision in late June will look like a spike up.)
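The arithmetic behind such a negative reading can be illustrated with a short sketch.  The numbers here are invented for the illustration (a steady 10 deaths a day, with the cumulative total revised down by 200 on a single day) and are not Spain's actual figures.

```python
def rolling_daily_average(cumulative, window=7):
    """Seven-day (by default) rolling average of the daily change in a cumulative total."""
    daily = [later - earlier for earlier, later in zip(cumulative, cumulative[1:])]
    return [sum(daily[i - window:i]) / window for i in range(window, len(daily) + 1)]


# Hypothetical series: 10 deaths each day, with a one-time downward
# revision of 200 to the cumulative total on day 10.
cumulative = []
total = 1000
for day in range(20):
    total += 10
    if day == 10:
        total -= 200
    cumulative.append(total)

averages = rolling_daily_average(cumulative)
negative_days = sum(1 for value in averages if value < 0)
print(negative_days)
```

In this example the single revision makes the rolling average negative on each of the seven days whose window includes the revision date, after which the series returns to 10 a day.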

Sweden is also an interesting case as, early in the crisis, it deliberately decided not to mandate closures of restaurants, offices, and other non-essential work locations, but rather left this to be decided by each entity.  But the policy failed:  Deaths from Covid-19 rose to rates well above US levels (and were especially far above the rates of its Nordic neighbors of Norway, Finland, and Denmark, although below the peak levels seen in Italy, Spain, France, and the UK).  The rates then fell relatively slowly in Sweden.  Sweden eventually moved to policies more in line with the rest of Europe, and eventually saw similarly low rates.

D.  US Compared to East Asia, Australia, and New Zealand

As an earlier post on this blog on the number of Covid-19 cases discussed, the countries of East Asia, as well as Australia and New Zealand, show what is possible if serious measures are taken to control the spread of the virus (and possible in a region with more travel and business exposure to China than any other region).  The measures required are not exotic.  Nor did they require resources that others did not have.  All that was required were the standard public health measures used to control the spread of any infectious disease – extensive testing with follow-up tracing of contacts and quarantining of those exposed, plus the normal and widespread use of simple masks.  With such measures, Taiwan was able, for example, to keep open its schools basically throughout (in February it extended its regular Chinese New Year holiday by an extra two weeks, but has since followed its regular schedule).

The result was few cases of Covid-19, and few deaths:

 

The rates for all the countries listed on the chart were plotted.  But they were all so close to zero that, other than for the few names shown, one could not distinguish one from the other.

There was an increase in the rates since mid-July in Australia, and to a lesser extent in Hong Kong (and a far lesser extent in Japan), as some of the earlier controls were eased.  But these have all now been brought back under control.  And even with these outbreaks, the rates never approached the US rates.

E.  Who are the Comparables for the US?

Who, then, might have a record comparable to that of the US?  Among the larger countries:

Donald Trump can be proud to say that death rates in the US have, since June, been lower than the rates in Mexico and Brazil.  The US has not performed as poorly as they have.  The pattern in South Africa is somewhat odd in that its rates were higher than those of the US between mid-July and mid-August, but are now substantially less.  And Russia as well as India have had lower rates throughout.

All this assumes the tracking statistics on deaths from Covid-19 are accurate, and one might question this for some of these countries.  As was discussed above for the case of Spain, such numbers can be difficult to assemble even with resources that the countries here do not have.  But for the ranges in the numbers seen here, the conclusions would still hold even if the rates were substantially higher.  As of mid-September, the South African rate would have needed to have been twice as high, and the Indian and Russian rates three times as high, to reach the US rate.

Note that I have not included China.  If it were added, it would show extremely low death rates per million throughout, with a peak of just 0.1 in mid-February.  But while the deaths from Covid-19 may well have been low compared to others (particularly when expressed per million, given its population), I am not confident they were in fact that low.  Restrictions on the news media and what they can report do not engender confidence.

But overall, to find countries with records on management of Covid-19 comparable to that of the US, one needs to look at countries with per capita incomes that are far below that of the US.  The US has thought of itself as belonging in the top rank of countries.  But for this, the only countries with comparable death rates from Covid-19 are countries that, before Trump, the US had not normally been grouped with.

F.  What Deaths in the US Would Have Been at the Rates Other Countries Have Been Able to Achieve

As noted at the top of this post, President Trump claimed that the disease is “under control as much as you can control it.”  But as we have seen, it is not.  Other countries, facing the same disease, have been able to manage it with far lower death rates than the US has had.  How much of a difference would this have made?

Little was known about the disease early in the crisis, and one can argue that countries were searching then for what best to do.  And after the high early peaks, the rates did come down in the US as well as in Europe and Canada.  But then the US reversed course while rates continued to fall elsewhere.  It is thus this more recent period that most clearly shows the consequences of the choices the US made compared to others.  For the purposes of this exercise, we will therefore look at the period since August 1.

From August 1 to September 14, a period of 45 days, US deaths totaled 40,459.  This is a bit over a fifth (21%) of the total US deaths as of September 14 of 194,493.  It is still a substantial figure:   The number of US soldiers who died in battle in the Korean War totaled 33,739, and the number who died in the Vietnam War totaled 47,434.  But based on the numbers of deaths per million in other countries and regions, how many would have died for a population equal to that of the US?:

If the US had had the number of deaths per million that Romania had over this same period, then 31,700 would have died, or about three-quarters of the number of Americans who died.  If the US had the rate of Albania, about 20,800 would have died, or about half the number of Americans who died.  One might ask that if “it is what it is”, and that “It’s under control as much as you can control it”, why is it that Romania could control it so that there would only be three-quarters as many deaths, and Albania could control it so that there would only be half as many deaths?  Neither Romania nor Albania has the resources the US has, plus they are small and open.

Other cases are more extreme.  If the US had the rate over this period of the EU as a whole, there would have been 5,465 deaths.  Instead, it was 7.4 times higher.  At the rate of Canada, there would have been 2,184 deaths.  Instead, it was 18.5 times higher.  And Singapore and Taiwan both had zero deaths over this period.  The most recent death (as of this writing) was on July 14 in Singapore and on May 11 in Taiwan.  If the US had their rates, there would have been no deaths.
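The multiples and shares cited here follow directly from the figures given above, and can be checked with a few lines of Python.  This is a sketch only:  the death totals are those quoted in this post, while the helper function `scaled_deaths` and the example country in its usage are hypothetical, included to show how a death count is scaled to a population of a different size.

```python
US_DEATHS = 40_459          # US deaths, August 1 to September 14 (from the text)
US_TOTAL_DEATHS = 194_493   # US deaths through September 14 (from the text)

# Deaths the US would have had at each country's or region's rate (from the text).
at_other_rates = {"Romania": 31_700, "Albania": 20_800, "EU": 5_465, "Canada": 2_184}


def scaled_deaths(deaths, population, target_population):
    """Deaths a population of `target_population` would have had at the same rate."""
    return deaths * target_population / population


# How many times higher actual US deaths were than at each other rate.
multiples = {name: round(US_DEATHS / deaths, 1) for name, deaths in at_other_rates.items()}
# The same comparison expressed as a percentage of actual US deaths.
shares = {name: round(deaths / US_DEATHS * 100) for name, deaths in at_other_rates.items()}
recent_share = round(US_DEATHS / US_TOTAL_DEATHS * 100)

print(multiples["EU"], multiples["Canada"])    # 7.4 18.5
print(shares["Romania"], shares["Albania"])    # 78 51
print(recent_share)                            # 21

# Hypothetical usage: 50 deaths in a country of 5 million, scaled to 330 million.
print(scaled_deaths(50, 5_000_000, 330_000_000))
```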

There is of course a wide range here.  Plus things may change.  Infection rates have been rising in Europe in recent days, and increases in death rates may soon follow.  The US has also today (on September 22, as I write this) passed a significant milestone:  More than 200,000 have now died in the US from this disease.  And there are widespread concerns that rates will increase this fall and winter across the Northern Hemisphere in a “second wave”, as more people remain inside and as they become less vigilant as time goes on. One has seen this with prior infectious diseases, particularly those that spread through the air.  There is also increasing pressure to reopen schools for in-class teaching and to fully reopen businesses.

So there is uncertainty on how this will progress.  But based on what we know for the last month and a half, a question to address is why the Trump administration has not been able to do as good a job of reducing deaths from this virus as have the governments of Romania, Albania, Bulgaria, Russia, Spain, Australia, Croatia, Serbia, Luxembourg, Portugal, Poland, France, Greece, Hong Kong, Italy, Sweden, Czechia, Slovenia, the Netherlands, Belgium, the United Kingdom, Canada, Switzerland, Hungary, Austria, Ireland, Japan, Denmark, Lithuania, Germany, Norway, Slovakia, Latvia, Finland, South Korea, Estonia, New Zealand, Singapore, and Taiwan.

The Spread of Covid-19: Trump States vs. Clinton States – An Update

An earlier post on this blog compared the spread of Covid-19 in the states that Trump had won in 2016 to that in the states won by Clinton, with data through June 24.  This post will update those figures to July 16.  The trends have become even clearer.

As seen in the chart above, new cases in the states won by Trump have continued to shoot upwards, at an alarming pace.  They had reached 22,000 new cases per day as of June 24 (based on a seven-day rolling average ending on that date), but have now (as of July 16, just three weeks later) more than doubled to 48,500.  The decisions to rapidly reopen by the governors of such Trump-won states as Florida, Georgia, Texas, and Arizona, as well as others, have clearly been a disaster.  The virus is now spreading rapidly in those states, and some of these governors are now putting back in place (albeit only partially) the social distancing measures that had earlier worked.

Daily new cases are also now clearly increasing in the states won by Clinton.  This trend was still too recent to be clear in the data through June 24.  But the pace of spread in the Clinton states is far below that of the Trump states, and the number of new daily cases in the Clinton states (16,500 as of July 16) is only one-third the number in the Trump states.

The trends in the figures for the number of deaths from Covid-19 have also now become clear:

In the previous data through June 24, the daily number of deaths (again based on seven-day rolling averages) had come down from their mid-April peaks to a relatively flat level as of mid-June.  This had marked a sharp decline of over 80% in the daily number of deaths in the Clinton states (where peaks early in the crisis in New York had overwhelmed the hospital system, at a time when still little was known on how best to treat the extremely sick), and by a lesser but still significant decline (about 50%) in the Trump states.

Since mid-June, the daily number of deaths in the Clinton states has been relatively flat (hovering between about 200 and 300).  But there has now been a significant increase in deaths in the Trump states, rising from a trough of about 280 per day to now almost 500, an increase of about 75%.  And the path points to a continued rise, as one would expect given the even sharper rise in daily new cases (as there is a lag – deaths occur several weeks after when a case is first confirmed).

These trends should be worrisome in the extreme.  They are not the consequence of increased testing in the US, as Trump has repeatedly asserted.  While testing was slow to start in the US (the administration had bungled the roll-out in February and into much of March), there has not been a significant change in test availability since mid or late April, and certainly since May.  The increases in cases started in June.  More people are now being tested because more people are getting sick, and seek a test as they come down with the symptoms.  And the increase in the number of people dying from the disease is certainly not a consequence of testing, but rather of more people becoming sick.

More could be done, but sadly this presidential administration isn’t doing it.  And it would not be all that difficult.  As I had noted in my June 25 post, a relatively easy measure would be for everyone to wear masks.  Since that post, Robert Redfield (the head of the CDC) noted in an interview on July 14 that “if we could get everyone to wear a mask right now, I really do think that over the next four-six-eight weeks we could bring this epidemic under control” (see this YouTube video of the interview, starting at about the 4-minute mark).  He noted that this is not difficult – the problem is just that not enough people do it.

For many of those refusing to wear a mask – some adamantly so – the issue is seen as political.  The problem started with Trump, who, at the April 3 press conference announcing CDC guidelines calling on people to wear face masks, simultaneously emphasized that he would not himself abide by those guidelines.  With any other president, this would be unbelievable.  Since then, supporters of Trump have increasingly seen the issue as one of making a political statement rather than as the public health matter that it is.  A recent academic study found that political partisanship is the most important factor in explaining whether or not people will wear masks and follow other social distancing recommendations, and that this partisan difference has grown over time.

This has even become violent.  In early May, for example, a security guard at a Family Dollar store told a customer she would need to wear a face mask to enter, as per the state orders of the time.  She returned about 20 minutes later with members of her family, who shot the guard.  He died.  More recently, a 43-year-old man entering a convenience store without a mask was asked by another customer to put on a mask.  He responded by stabbing the 77-year-old customer.  The man then fled, was later spotted by police, and started to attack the policewoman, who then shot him.  He died.  And there have been, sadly, a number of such incidents.

Those refusing to wear face masks when in public insist that such a requirement infringes on their “freedom”.  Thus, as a matter of principle, they refuse to do it.  If it were indeed the case that the only one suffering harm from not wearing a mask was that individual, I would not be so concerned.  But that is not the case – others exposed may then become infected, and possibly even die.  It is similar to speed limits on highways.  If the only one who might be harmed by speeding were the speeder, I would not be so concerned.  But speeders may harm, and possibly kill, others as well.  Hence we have speed limits and those limits are enforced.

Refusing to wear a face mask under a belief that it is an infringement on freedom, and responding with threats or even violence when asked to do so, is madness.  With true leadership in Washington we would have a president who would act on this.  Not only would that president model responsible behavior by wearing a face mask himself when in public or when meeting with others, he would also call on all his supporters to do so as well.  They might listen to him.  But his refusal to do so speaks volumes itself.

The Failure of the US to Limit the Spread of Covid-19: A Comparison to What Other Countries Have Been Able to Achieve

A.  Introduction

The virus that causes Covid-19 has struck countries around the world, and it is the same virus everywhere.  But countries have responded differently.  Many countries have responded effectively, and some highly effectively.  The US is not among them.  The experience in other countries shows what would have been possible, had the US responded as they did.  Unfortunately, the US, with Trump leading as president, did not.

B.  The US Compared to Italy, Spain, Germany, and the UK

The chart above shows the daily number of new confirmed cases (on a 7-day moving average basis) since the start of the pandemic through to July 6, for the US plus several of the larger countries of Western Europe:  Italy, Spain, Germany, and the UK.  These countries were chosen in part as they were all hit with the virus that causes Covid-19 earlier than most (including earlier than the US).  They thus faced a crisis when much was still not known about the virus, including how quickly it could spread and under what conditions, and when there was uncertainty on what should be done to bring it under control.  The underlying data on Covid-19 case totals, from which the figures for the chart were derived, come from the widely-used data set maintained by Johns Hopkins University.  Population numbers from the UN were used to put the number of cases on comparable terms:  daily new cases per million residents.

Italy was the first major country in Europe to have been hit by the virus, for reasons still not fully known.  Cases rose quickly, reaching a peak at the end of March.  Spain came next, roughly a week later than Italy at first, but then rose especially quickly to a peak in early April of almost double the peak in Italy.  Germany also had a high number of cases early, but was then more successful through aggressive testing and quarantining to keep the peak from rising as high.  Finally, the UK saw a similar peak to that of Germany, but with that peak then lasting for close to a month.

Each of these European countries was then able to bring their daily new case numbers down sharply, to less than 10 new cases a day per million residents by early July (and indeed by early June for all other than the UK).  Each country had its own policies, and I will not go into the nuances of the country-specific differences here, but they succeeded through a combination of social distancing (including lockdowns), wide use of masks, extensive testing, contact tracing, and then isolation or quarantining of those infected or exposed to someone infected.  And with their success in bringing down the number of Covid-19 cases, these countries are now opening up for business, schools, and travel, and are doing so safely.

The US followed a different path.  Cases rose similarly at first as in these European countries, although with a lag (of about two weeks compared to Italy).  One should be cautious with these early numbers as testing, particularly in the US, was not as complete as it was later, but the early trends appear to be broadly similar.

But what is important is what happened next.  In contrast to the European countries, who were all able to bring down their case numbers by 90% or more, new daily cases in the US fell much more modestly.  Despite official policies (in much, although not all, of the country) to lock down the economy to limit person-to-person spread of the virus, plus guidelines encouraging (and in some cases mandating, but with lax to no enforcement) the wearing of masks and social distancing, the daily case numbers in the US were reduced only from about 95 per million in early/mid April to a trough in early June that never fell below 60.

US cases then started to shoot up.  This followed the easing of social distancing and other measures to limit the spread of the virus during the month of May.  While there were important differences by state and indeed often by locality, most states started to lift the measures cautiously in early May and much more comprehensively by the end of May (and sometimes completely so by that point).  And as was examined in an earlier post on this blog, the increases in daily cases have been particularly sharp in the states won by Trump in 2016 – states often with governments and a population that have been particularly aggressive in lifting (or increasingly ignoring) those measures.

As a further example of the impact of this politicization of what should be seen as basic public health measures, the number of Covid-19 cases in Tulsa, Oklahoma, has now spiked two weeks after Trump held a large campaign rally in an indoor arena there.  Local health officials have said it is “more than likely” that the two are linked.  Few at the Trump rally wore masks, they were grouped closely together for the cameras, and loud cheering was of course encouraged.  The two-week lag from the rally to the spike in Covid-19 cases is about what health experts say one should expect between exposure to the virus at an event such as this and the rise in confirmed case numbers, as results are obtained for people seeking tests following an onset of symptoms.

C.  The US Compared to Europe, Canada, and Sweden

The chart at the top of this post highlights only a few countries.  But the same results hold for Western Europe as a whole as well as for Canada:

Cases in Western Europe as a whole rose early, reached a peak, and then fell.  Since early June cases have remained below 10 per day per million.  As of July 6, they were at 8.3, or less than 6% of the US rate of 149 per day.  The path for the countries of Eastern Europe (the countries from Estonia on the north to Bulgaria on the south, who are now mostly members of the EU) is interesting as they were able to contain the virus throughout, with a peak of less than 14 in early to mid-April.  But a modest increase in recent weeks (to almost 15 currently) warrants watching.

Canada is also interesting as the economy and the population are broadly similar to those of the US, but with very different politics.  Cases rose in Canada to a peak of about 50 in mid-April.  But they were then brought down, to levels now very similar to those of Western Europe.  Again, this is in sharp contrast to the US.

Sweden is an exception to others in Europe.  It is also the one country of the rich Western democracies that explicitly followed a different policy path.  Instead of mandating a lockdown of the economy, the wearing of masks, social distancing, and other such measures, it only issued general guidance.  And even this guidance was eased later.  Daily cases per million then reached about 60 in late April, fell only modestly to about 50 in late May, before increasing significantly to as much as 120 at points in June (although with erratic numbers that probably reflect reporting practices).  Sweden is now taken as a good example of what not to do.  Furthermore, while “protecting the economy” was presented as a rationale for Sweden’s decision to issue only general guidelines, with no requirement for businesses such as restaurants to close, early evidence indicates that the Swedish economy has suffered similarly to those of its neighbors.  There was no economic gain, but a profound human loss in sickness as well as lives.  As I write this (July 9), the accumulated number of deaths per million of population has come to 545 in Sweden, or roughly ten times the totals of 46 in neighboring Norway and 59 in Finland.

D.  The US Compared to East Asia, Australia, and New Zealand

Europe (with the exception of Sweden), as well as Canada, have therefore been far more successful than the US in limiting the spread of the virus that causes Covid-19.  But the countries that have been by far the most successful in containing the virus have been those of East Asia, as well as Australia and New Zealand:

Drawn on the same scale as the other charts, one can barely distinguish their case levels, other than during a few, and still always low, periods (in early March in South Korea and in late March and early April for most of the others).  And the daily case rates in Taiwan were never over 1 per million of population, so one cannot distinguish its curve from the horizontal axis of the chart.  Yet Taiwan probably has closer contact with China, from business relationships as well as personal travel, than any other country in the world other than Hong Kong.

All of these countries reacted quickly as soon as it became clear that an infectious disease had spread in China.  While travel limits were imposed, these limits were complemented by extensive testing and contact tracing, quarantining of all travelers (whether citizens or not), and wide use of masks and other social distancing measures.  None of this was secret.  Nor did it require special expertise.  Others could have responded similarly, but did not.

E.  Countries with a Similar Result as the US

Which, then, are the country cases that are broadly comparable to that of the US?  The closest are Brazil and South Africa, with similarities also in the cases of Russia and Mexico:

These are not countries that the US would normally compare itself to.  One should certainly be cautious and note that the quality of the case number data may not be all that good in some of these countries (and indeed, it is not all that good in the US itself).  But the patterns are probably broadly accurate.

Brazil is the one major country in the world with more confirmed cases (per million of population) than the US.  Its right-wing president, Jair Bolsonaro, has responded to the virus in many ways similar to Trump.  He has consistently downplayed the virus (like Trump), has refused to wear a mask (like Trump), has encouraged rallies to oppose rules on social distancing that some Brazilian states and localities had issued (also like Trump), and has insisted that the disease is not serious but rather “It’s just a little flu or the sniffles”.  And like Trump, he accuses the media of stoking hysteria.

The result is that the number of cases in Brazil per million of population is now the highest of any large country in the world, and second only to the US in absolute total number.  And on July 7, Bolsonaro himself tested positive for the virus.  Again like Trump (who took the drug when he was possibly exposed to the virus), Bolsonaro is now taking doses of hydroxychloroquine as a treatment, even though there is clear evidence that this drug does not help with Covid-19 and may in fact do harm.

Other countries with rising numbers of new cases include South Africa and Mexico.  The daily cases for South Africa now match the US number, with a path since mid-June broadly similar to the US path.  Russia saw an increase in April to mid-May, after which there has been some decrease.  But the daily numbers in Russia remain high.

F.  Conclusion

There is not much here for the US to be proud of.  While countries in Western Europe, as well as Canada, saw sharp increases in cases in much of March and early April, they were then all (with the notable exception of Sweden) able to bring the rates for new cases down to modest levels.  With that success, they are now reopening their economies, are permitting travel (other than, notably, to and from the US), and will be reopening schools.  They are all still cautious, and maintain aggressive efforts at testing, contact tracing, and then quarantining when warranted, but their success in bringing down the daily case numbers means they can, albeit carefully, resume a degree of normalcy.  It is possible that things will take a turn for the worse in the weeks and months ahead.  Until there is an effective vaccine that is broadly available, there will remain conditions in which the virus could pop up and cause major disruptions again.  But the situation in these countries has remained stable for more than a month now.

Countries in East Asia, as well as Australia and New Zealand, have done far better.  They kept rates low from the start and have thus been able to reopen safely and more quickly.  Indeed, schools in Taiwan never even closed (other than for a two-week extension of the traditional Chinese New Year holiday in February).  But Taiwan then opened schools safely, with students required to wear masks, temperature checks carried out daily of all students, and with plastic shields installed to separate desks from each other.  [Not everyone liked this.  I know from direct personal information that at least a few elementary school age children thought it horribly unfair that they have had to go to school while children around the world were able to stay home.]

So who resembles the US in effectiveness in limiting the spread of the virus that causes Covid-19?:  Among the larger countries of the world, only Brazil and South Africa, and to some extent Mexico and Russia.  In the past, they were not the countries the US would see as comparables.  But they are now.