A Calculation of Covid-19 Case Rates for Those Vaccinated and Those Not Vaccinated

A.  Introduction

The vaccines available for the virus that causes Covid-19 are incredibly effective – far better than the vaccines for many diseases – but can only work if they are used.  Sadly, a substantial share of the US population, particularly those who identify as conservative and Republican, are declining the opportunity to be vaccinated.  While there are also those on the left who have indicated they will not be vaccinated against this terrible virus, a recent (June) Gallup survey found that while close to half (46%) of Republicans said they do not plan to accept the vaccinations, just 6% of Democrats said so.  A Washington Post – ABC survey in early July found almost exactly the same results, with 47% of Republicans saying they are not likely to get the vaccination, while again only 6% of Democrats said so.

It might help convince those who are reluctant to be vaccinated to see in hard numbers how effective the vaccines have turned out to be.  Part of the responsibility of the CDC is to collect and report data on disease incidence in the US as well as on the cause of all deaths in the country.  All hospitals, doctor’s offices, clinics, and other health centers in the US, are required to report these to the CDC, and the CDC in turn then consolidates the information and makes it available to the public and to researchers.  For Covid, the CDC has put together a separate online site providing extensive data on the spread of the disease in the country, calling it COVID Data Tracker, with the multiple individual data series updated daily.

For vaccinations against Covid, the CDC provides not only daily figures on the number of vaccinations given (down to the county level), but also a breakdown of these by various demographic dimensions, including gender (male/female), age (9 different age groups), and ethnicity (7 groups).  Each of these is tracked daily, so one can also determine trends over time.  There is also daily tracking for each state (and perhaps county – I did not check) of the number of doses of each type of vaccine administered (Moderna, Pfizer, and Johnson & Johnson, and whether it was the first or second dose for Moderna and Pfizer), for key age groups (over age 65, over 18, and over 12).  Many of the charts the CDC provides are then picked up in the regular news media, so people can see daily the trends in Covid-19 cases, deaths, vaccinations, and other such important information.

But the CDC is not reporting what would be an extremely useful, and hopefully convincing, daily statistic along with these numbers.  And that would be not only the total number of new confirmed Covid cases that day, new hospitalizations due to Covid, and deaths due to Covid, but also how many of these are among those who have been vaccinated and among those who have not.  The simple figures could be provided, or, more usefully, expressed as a count per 100,000 in the relevant population (of vaccinated and unvaccinated).

The CDC is not doing this, and it is not clear why.  It may feel that the data it has is not of sufficiently high quality, but if so, one would think that a high priority would be to take whatever measures are necessary to upgrade that quality.  Indeed, it is difficult to think of anything that would be a higher priority than this.

This post will discuss what a chart with such a breakdown between vaccinated and unvaccinated may look like.  This chart (at the top of this post) is not based on directly reported data on Covid cases among the vaccinated and unvaccinated, as the CDC has not made whatever it has on this publicly available (at least among what I have been able to find – specialists in the field may have access to more of what the CDC has).  Rather, the chart presents what the rates would be, given the observed number of daily new Covid cases in the US, and assumptions on how effective the vaccines have been.

Given the high degree of effectiveness of these vaccines in preventing Covid cases, and even more so in preventing Covid deaths, why are so many people refusing to be vaccinated?  Sadly, identity politics has intervened, with many supporters of former president Trump appearing to take vaccination as a sign of disloyalty to a cause they believe in.  As we will see, there is a very strong negative correlation by state between the share of the population who have been vaccinated and the share who voted for Trump.

B.  Covid Case Rates Among Those Vaccinated and Those Unvaccinated 

Despite its extensive reports on Covid cases and vaccinations, one key dimension that the CDC does not report is the breakdown of the daily number of new Covid-19 cases, hospitalizations, and deaths, between those who have been vaccinated and those who have not.  Given the high degree of effectiveness of the vaccines (as observed in the clinical trials and in numerous studies since), one should expect huge disparities in such rates between these two groups.  And seeing such disparities daily in the news might convince at least a few, and hopefully many, of those hesitant to be vaccinated to accept that they should indeed be vaccinated.  It truly is a matter of survival.

So why doesn’t the CDC report this?  One would think that if the CDC does not obtain such reports already, its highest priority would be to set up the system to ensure such numbers are gathered.  Anyone being tested for Covid-19 will certainly be asked if they have been vaccinated.  And the first question that will likely be asked of anyone entering a hospital for suspected Covid-19 (even before they are asked their name and medical insurance number) is whether they have been vaccinated.  Such information would then be recorded in (or certainly could be recorded in) the reports sent to the CDC on the number testing positive for Covid-19, and would certainly be available for anyone who has been hospitalized and then dies from Covid-19.

The CDC does appear to obtain at least partial information on the number of Covid-19 cases, hospitalizations, and deaths where vaccination status is known.  But they have evidently not sought to organize a more complete and reliable system.  In late May they issued a brief, non-technical, summary of data obtained on cases of Covid-19 among individuals who had been vaccinated (there were a total of 10,262, in reports from 46 states), covering the period between January 1 and April 30, 2021.  But they did not put this in any context, and while they noted that 101 million Americans had been fully vaccinated as of April 30, there were far fewer (essentially zero) as of January 1.  By itself, this report was basically useless.  And then the CDC announced that as of May 1, it would no longer even seek to collect comprehensive data on the vaccination status of confirmed cases of Covid-19, but rather seek this only for those hospitalized due to Covid-19 or who had died from it.

It is difficult to understand why the CDC would scale back the reporting of Covid-19 cases to it, rather than upgrade the quality and completeness of what is reported.  Furthermore, even with incomplete data on vaccination status of confirmed cases of Covid-19, as well as of hospitalizations and deaths from Covid-19, the CDC could still report the figures for those known to be vaccinated, those known not to be vaccinated, and those where vaccination status was not known.  Such a breakdown would still show the (likely huge) disparities in the rates between those known to be vaccinated and those not.  And gathering such data is critically important not only in determining the continued effectiveness (or not) of the vaccines overall against Covid-19 as mutations of the virus develop, but also the continued effectiveness (or not) of the several individual vaccines that have been approved (i.e. Moderna, Pfizer/BioNTech, and Johnson & Johnson, so far in the US).

In the absence of such real world data, the chart at the top of this post presents what the Covid-19 case rates would be under specific assumptions (based on the clinical trial results and more recent studies of efficacy among different groups) of what the effectiveness rates are.  Specifically, based on the clinical trial results as well as numerous studies undertaken for various special groups since, plus being conservative (given the new mutations that have developed and spread over this time period) and rounding down, I assumed effectiveness rates of:

a)  90% for those fully vaccinated (both shots) with either the Pfizer or Moderna vaccines (and with a lag of 14 days from the second shot).

b)  70% for those partially vaccinated (one-shot) with either the Pfizer or Moderna vaccines (and with a lag of 14 days from that shot).

c)  70% for those vaccinated with the one-shot Johnson & Johnson vaccine (again with a 14-day lag).

[Side note:  One will often see the terms “efficacy” and “effectiveness” used interchangeably in describing how well vaccines work.  Technically, efficacy refers to how well (how effective) the vaccines perform in clinical trials, while effectiveness is the term used for how well the vaccines perform in the real world.  For non-specialists, the distinction is not important.]

The chart at the top of this post shows what the respective daily rates of newly confirmed Covid-19 cases would be for those fully vaccinated with the Pfizer or Moderna vaccines and for those not vaccinated with any vaccine (per 100,000 of population in each group) from December 27, 2020 (14 days after vaccinations began for the general public on December 13) to July 12, 2021.  To reduce clutter in the chart, I did not show the respective curves for those partially vaccinated or those vaccinated with the Johnson & Johnson vaccine, but those numbers were part of the calculations as the fact that some were vaccinated in this way will affect the position of the curves.  One has to solve a small algebra problem, as the data one has to work with are only the daily number of new cases, the number fully or partially vaccinated with one of the vaccines, and the assumed effectiveness rates (where effectiveness is defined relative to those not vaccinated).  The curves for the groups who had been partially vaccinated (as of a particular date) or who had had the Johnson & Johnson vaccine would be in between the two curves shown (given the assumed 70% effectiveness rates for each).
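The small algebra problem can be sketched as follows.  This is a minimal illustration, not the spreadsheet actually used for the chart:  the function name and the input figures in the usage example are made up for illustration, while the default effectiveness rates are the assumed ones listed above (90%, 70%, and 70%).  The idea is that if the unvaccinated become cases at some rate r per person, then each vaccinated group becomes cases at r times (1 minus its effectiveness), and the cases in all the groups must add up to the observed number of new cases.

```python
def case_rates_per_100k(new_cases, population, fully, partially, jj,
                        eff_full=0.90, eff_partial=0.70, eff_jj=0.70):
    """Split observed daily new cases into implied rates per 100,000.

    fully, partially, jj are head counts in each vaccinated group
    (assumed to be already lagged 14 days); everyone else is treated
    as unvaccinated.
    """
    unvaccinated = population - fully - partially - jj
    # Each vaccinated person counts as (1 - effectiveness) of an
    # unvaccinated person in terms of the chance of becoming a case.
    risk_weighted_pop = (unvaccinated
                         + (1 - eff_full) * fully
                         + (1 - eff_partial) * partially
                         + (1 - eff_jj) * jj)
    rate_unvax = new_cases / risk_weighted_pop
    return {
        "unvaccinated": rate_unvax * 100_000,
        "fully_vaccinated": rate_unvax * (1 - eff_full) * 100_000,
        "partially_vaccinated": rate_unvax * (1 - eff_partial) * 100_000,
        "johnson_and_johnson": rate_unvax * (1 - eff_jj) * 100_000,
    }


# Illustrative round numbers only, not actual CDC figures.
rates = case_rates_per_100k(new_cases=20_000, population=330_000_000,
                            fully=150_000_000, partially=25_000_000,
                            jj=10_000_000)
```

Repeating this for each day's case count and vaccination totals would trace out one curve per group.  By construction, the fully vaccinated rate is always one-tenth of the unvaccinated rate on the same day.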

The chart presents what the curves would be for daily new cases of Covid-19.  Also of interest are the number of those being hospitalized (indicating severe cases) and those dying from this disease.  In principle, one could prepare similar charts.  But here there is not as much data to go on to underlie the assumptions to be made on vaccine effectiveness.  It is however clear that, given how they function, the vaccines will likely be a good deal more effective in preventing serious cases of Covid-19 (those that require hospitalization) and of deaths than their effectiveness in preventing any type of case.  The reason is that exposures to the virus will be similar for those who are vaccinated and for those who are not:  The virus is floating in the air (due to a contagious person nearby) and it passes up the nose of some of those passing by.  But the difference then is that as the virus starts to replicate in the person’s lungs and body, the immune system of a vaccinated person will be primed and prepared to respond quickly, thus (in most cases) stopping the virus before it has replicated to levels that lead to a detectable illness.  But with 90% effectiveness, a detectable illness will still occur in 10% of such cases.

Most of these illnesses will then be mild, as the immune system in a vaccinated person is already acting to drive down the virus.  But some share of these will not be, possibly due to how the individual’s body had responded to the vaccine.  Still, since it will be some share (well less than 100%) of the 10%, the number will be small.  And it will likely be an even smaller share for those cases that turn out to be so severe that the patient dies.

With such small numbers, the effects will not be easily picked up in clinical trials.  For the clinical trial used to assess the efficacy of the Pfizer/BioNTech vaccine in the US, for example, one had 43,000 volunteers enrolled, of whom half received the vaccine and half received a similar-looking shot that had just saline (salt water) in it.  Neither the patient nor the doctor overseeing the shots knew which it was – there was simply an identifying number to be revealed later.  The volunteers would then go about their lives as they had before.  Over time, some would then come down with Covid-19 (as Covid-19 was present in the country and spreading).  Those who did were treated as any other Covid-19 patient would be.  Once a certain number of cases arose (determined based on statistics, with 170 the trigger in the Pfizer/BioNTech trial), the identifying numbers were then, and only then, revealed.  When they were, they found in this trial that 162 of those cases were of individuals who had received just saline (i.e. no vaccine) while 8 had received the vaccine.  This thus showed 95% efficacy (as 8 is 5% of 162, the case burden among those not vaccinated).  Note this is for the effectiveness against getting a case of Covid-19, whether mild or severe.
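The efficacy arithmetic can be written out in a few lines.  This is only a sketch of the standard calculation (one minus the ratio of the case rate in the vaccinated arm to the case rate in the placebo arm), assuming the two arms were of equal size as described above.  The function name is an illustrative choice.

```python
def vaccine_efficacy(cases_vaccinated, cases_placebo,
                     n_vaccinated=1, n_placebo=1):
    """Efficacy = 1 - (case rate in vaccine arm / case rate in placebo arm)."""
    rate_vaccinated = cases_vaccinated / n_vaccinated
    rate_placebo = cases_placebo / n_placebo
    return 1 - rate_vaccinated / rate_placebo


# With equal-sized arms the arm sizes cancel, so they can be left at 1.
efficacy = vaccine_efficacy(cases_vaccinated=8, cases_placebo=162)
print(f"{efficacy:.1%}")
```

With 8 cases against 162, this comes to just over 95%.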

For hospitalizations, the numbers will be far smaller.  In the Pfizer/BioNTech trial, only 10 of the 170 cases were “severe”, and of these 9 occurred in those who had the shot of saline while just one was in the vaccinated group.  This is far too small a sample to come up with a figure for how effective the vaccine is against severe cases requiring hospitalization, although it is clear that it helped.  And with no deaths at all among the 170, one can say even less.

There should, however, now be data on this in CDC files as close to half of the US population has been fully vaccinated.  The CDC has not reported on this.  But the Associated Press, working with experts, was able to find relevant data in the CDC files (possibly in files accessible to researchers with special software – I could not find them).  The AP reported that for the month of May, only 1,200 (1.1%) of the 107,000 patients who had been hospitalized for Covid had been vaccinated.  And only 150 (0.8%) of the 18,000 who had died from Covid in the month had been vaccinated.  Since one-third (34%) of the US population had been fully vaccinated as of May 1 (and 43% as of May 31), these shares are not small simply because few people had been vaccinated.  Rather, the numbers give an indication that the vaccines are highly effective against severe cases of Covid developing, and even more effective against patients dying from the disease.  The numbers may well be imperfect, as the CDC has warned, but the impact of vaccination is still clear.

The figures are also consistent with public statements that have been made.  In late June, CDC Director Dr. Rochelle Walensky said that the vaccine is so effective that “nearly every death, especially among adults, due to Covid-19, is, at this point, entirely preventable.”  On July 1, Dr. Walensky said at a White House press briefing that “preliminary data” from certain states over the last six months suggested that “99.5 percent of deaths from COVID-19 in these states have occurred in unvaccinated people”.  On July 8, the White House coordinator on the coronavirus response, Jeff Zients, said “Virtually all Covid-19 hospitalizations and deaths in the United States are now occurring among unvaccinated individuals”.  And on July 12, Dr. Anthony Fauci said “99.5% of people who die of Covid are unvaccinated.  Only 0.5% of those who die are vaccinated” (with his source for these figures probably the same as Dr. Walensky’s).

I am sure all these statements are true.  But many would find them far more convincing if they would show us the actual numbers.  They could be partial, as noted above, with figures for those known to be vaccinated, known not to be vaccinated, and not known.  But the CDC should make public what it has.  And it would be even more convincing to show the numbers updated daily, to drive home the point, repeatedly, that vaccinations are highly effective in protecting us from suffering and possibly dying from this terrible disease.

C.  The Simple Dynamics of Pandemic Spread

As the chart indicates, the likelihood of becoming a victim of Covid-19 is far less for those vaccinated.  I would stress, however, that one should not jump to the conclusion that if by some miracle all of the US had been vaccinated as of December 27, the path followed would have been the one shown in the chart for those fully vaccinated.  That would not be the case.  Rather, what the curves show is what the case rates would be for each such group where, as has in fact been the case, a substantial share of the population had not been vaccinated, and hence the virus continued to spread (primarily among the unvaccinated).  With a substantial number of people still infecting others with the virus, a certain number of vaccinated people will still catch the virus as the vaccines, while excellent at an effectiveness of 90% or even higher, are still not 100% effective.

The dynamics would be very different, and far better, if everyone (or even just most Americans) were fully vaccinated.  Indeed, under such a scenario the pandemic would soon end completely, and the line depicting cases of Covid-19 would not simply be low but at zero.

This is due to the mathematics of exponential spread of a virus in a pandemic.  The virus that causes Covid-19 cannot live on its own, but only in a living person.  When a person is exposed to the virus, with viral particles floating into their nose, the virus will take about a week to incubate.  The person is then infectious to others for about another week.  If an infected person spreads the virus on average to two others, then the number of cases will double every reproduction period (one and a half weeks on average for Covid-19).  This reproduction rate is called R0 (or R-naught, or R-zero) by epidemiologists, and refers to the reproduction rate in a setting where no measures are being taken to contain the spread of the virus (no vaccines, no masks, no social distancing).  For the original (pre-mutation) virus that causes Covid-19, the R0 was estimated to be between 2 and 3.  With the more recent – more easily spread – delta variant of the virus, it is believed the R0 is above 3.

With a virus that spreads from one person to three every reproduction period (every week and a half on average for Covid), 100 cases to start (week 0) would, if left unchecked, grow to 8,100 cases by week 6 and to over 650,000 cases by week 12, as the number is tripling every week and a half.  However, if everyone has had a vaccine that is 90% effective, then the reproduction rate would be reduced from 3 to 0.3.  That means 100 cases to start would lead to only 30 cases in the next reproduction period and to less than one case by week 6, by which point it will have died out.
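These figures follow from simple compounding, and can be checked with a short sketch.  The function name is an illustrative choice, and the model is the same simplistic one used in the text:  cases are multiplied by the reproduction rate once every reproduction period of a week and a half, with nothing else changing.

```python
def projected_cases(initial_cases, reproduction_rate, weeks,
                    period_weeks=1.5):
    """Cases after `weeks`, multiplying by the reproduction rate once
    per reproduction period (assumed here to be 1.5 weeks)."""
    periods = weeks / period_weeks
    return initial_cases * reproduction_rate ** periods


for rate in (3, 0.3):
    for week in (6, 12):
        cases = projected_cases(100, rate, week)
        print(f"rate {rate}, week {week}: {cases:,.2f} cases")
```

Week 6 is four reproduction periods and week 12 is eight, so a rate of 3 gives 100 times 3 to the fourth power (8,100) and then 100 times 3 to the eighth power (656,100), while a rate of 0.3 gives less than one case by week 6.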

However, not everyone is vaccinated.  As of the day I am writing this (July 20), the share of the population fully vaccinated is 49%.  Using, for simplicity, a figure of 50%, and assuming the R0 for the mutations currently circulating in the US is 3, then the reproduction rate would fall not to 0.3 but only to 1.65 (the weighted average of half of the population at 0.3 and half still at 3).  With any reproduction rate above 1.0, the spread of the virus will grow, not diminish.  At a rate of 1.65, one will find by simple arithmetic that 100 cases initially (in week 0) would grow (if nothing else is done) to over 700 cases by week 6 and over 5,000 cases by week 12.  While far less than with a reproduction rate of 3, it is still growing.  And that is indeed what has been happening.
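The weighted average and the resulting growth can be sketched in the same way.  The function name is illustrative, and the calculation assumes, as the text does, that the vaccinated and unvaccinated mix evenly and that nothing else changes over the period.

```python
def blended_reproduction_rate(r0, share_vaccinated, effectiveness):
    """Weighted average of the rate among the vaccinated,
    r0 * (1 - effectiveness), and the rate among the unvaccinated, r0."""
    r_vaccinated = r0 * (1 - effectiveness)
    return share_vaccinated * r_vaccinated + (1 - share_vaccinated) * r0


rate = blended_reproduction_rate(r0=3, share_vaccinated=0.5, effectiveness=0.9)

periods_week_6 = 6 / 1.5    # four reproduction periods of 1.5 weeks each
periods_week_12 = 12 / 1.5  # eight reproduction periods
cases_week_6 = 100 * rate ** periods_week_6
cases_week_12 = 100 * rate ** periods_week_12
print(f"rate {rate:.2f}: {cases_week_6:,.0f} cases by week 6, "
      f"{cases_week_12:,.0f} by week 12")
```

With half the population vaccinated, the blended rate is 0.5 times 0.3 plus 0.5 times 3, or 1.65, which compounds to roughly 740 cases by week 6 and roughly 5,500 by week 12.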

These numbers should be taken as illustrative as the modeling is simplistic.  True modeling would take much more into account.  Simple averages are assumed here, as well as no changes in other factors that affect the trends (in particular no change in the use of masks or social distancing).  Also, simple averages may not work that well for Covid-19, as it appears that some people will be far more contagious than others, plus it will depend on how such individuals behave.  A contagious person might spread the disease to dozens or even hundreds of others if they are in an enclosed hall, with a crowd of others who might be chanting or singing, such as at a political rally or a church service.

But the point here is that if 100% of the population were vaccinated, the curve in a chart showing the case rate among those vaccinated would not follow the curve in the chart at the top of this post.  Rather, it would very quickly drop to zero.  The reason why there are still cases among those vaccinated in the US is that, with only about half of the population vaccinated, the virus continues to spread among the unvaccinated.  It will then spread to a certain share of the vaccinated (about 10% of those who are exposed, for vaccines that are 90% effective).

D.  Why Are Some Not Accepting Vaccination?

Despite the high degree of safety and effectiveness of the vaccines, a significant share of Americans are still refusing to be vaccinated.  As noted at the top of this post, recent Gallup and Washington Post – ABC polls found almost identical results:  close to half (46 or 47%) of Republicans say they are unlikely to (or definitely will not) accept vaccination, while 6% of Democrats (in both polls) say that.

Why this opposition to vaccinations, particularly among Republicans?  It is always hard to discern motives and often the individuals themselves may not really know why they are opposed – they just are.  Surveys may provide an indication, but are limited as the survey questionnaire will typically provide a list of possible reasons and ask the person to check all that might be a factor for them.

For example, a survey conducted by Echelon Insights, a Republican firm, in mid-April asked those surveyed who had said they will not accept a vaccination, or were not sure, to choose from a list of possible reasons for why.  The top responses were (with the percentage saying yes, where one could choose multiple reasons):

a)  The vaccine was developed too quickly:  48%

b)  The vaccine was rushed for political reasons:  39%

c)  I don’t have enough information about side effects:  37%

d)  I don’t trust the information being published about the vaccine:  34%

e)  I’m taking enough measures to avoid Covid-19 without the vaccine:  30%

f)  I wouldn’t trust the vaccine until it’s been in use for at least a few years:  28%

g)  I don’t trust any vaccines:  26%

h)  I wouldn’t trust the vaccine until it’s been in use for at least several months:  20%

i)  I believe I’m personally unlikely to suffer serious long term effects if I contract the coronavirus:  17%

But it is not possible to say the extent to which these were in fact the primary reasons for their hesitancy (or direct opposition) to being vaccinated, and to what extent these were just convenient responses to provide to the person conducting the survey.

There are also more bizarre reasons given by some.  For example, a very recent Economist/YouGov poll (conducted between July 10 and 13) found that among those who say they will not be vaccinated, fully half (51%) said that the vaccinations are being used by the government to inject microchips into the population.  A common variant of this conspiracy theory is that Bill Gates will be using the microchips to monitor and/or control us.  It is hard to believe that half of those refusing to be vaccinated really believe this.  Rather, it appears they have decided they do not want to be vaccinated, and then they come up with various rationalizations.  Consistent with this, the Economist/YouGov poll also found that 85% of those who say they will not be vaccinated believe that the threat of the coronavirus has been exaggerated for political reasons (this despite over 600,000 Americans already having died from Covid).

This opposition has also been fed by such media groups as Fox News, with repeated segments that denigrate vaccination against this disease.  As one example, in early May, Tucker Carlson, the most watched political commentator on Fox News, told his audience that as of April 23, a CDC system had recorded that 3,362 people had died following their vaccination.  His report implied the vaccines caused those deaths, when this was not at all the case.  Numerous fact-checkers and commentators in the media almost immediately investigated and concluded that the Carlson allegations were false.  But the damage had been done.

Tucker Carlson took the figures from the Vaccine Adverse Event Reporting System (VAERS), an on-line system set up by the CDC where anyone who had been vaccinated (and indeed anyone else) could make a report if they encountered some adverse event following their vaccination.  It is a voluntary system, open to anyone, and you may have noticed a description of how to use it in the papers you received when you were vaccinated (at least I received it as part of the instructions when I was vaccinated in Washington, DC).  Carlson’s report was that the VAERS showed 3,362 deaths between late December and April 23 (which he then extrapolated to 3,722 as of April 30).

But as the fact-checkers and commentators in the media immediately noted, just because a death was recorded by someone in the VAERS does not mean that the death was caused by the vaccine.  There are a certain number of deaths every year in the US, particularly among the elderly, and one should have taken that into account before jumping to a conclusion that a 3,362 figure (or 3,722 as of end-April) was abnormal and a consequence of the vaccinations they received.

It is straightforward to calculate what the expected number of deaths would be in a normal year for the number of individuals who were vaccinated between late 2020 and the April 30 date that Carlson focussed on.  About 2.9 million Americans died in 2019 (i.e. before any Covid cases or vaccinations), and simply assuming an average mortality rate of those vaccinated, the number that would be expected to die in this period in a normal year would be more than 100,000.  And since those being vaccinated over this period were disproportionately the elderly (as the elderly were prioritized in these early months), the far higher mortality rates of the elderly (compared to the entire population) would lead to a number several times higher.  Some of these deaths were then recorded by someone in the VAERS, but that does not mean the vaccinations caused them.  Indeed, there is no evidence so far that the vaccines have caused any deaths at all (although a very small number are being investigated).

[Technical note for those interested in the details of how this calculation was done:  I used the CDC numbers of those who had been fully vaccinated (as of end-December, 2020, and then daily through to April 30, 2021), the US population (332 million, from the Census Bureau), and the number of deaths in the US in 2019 (i.e. before Covid) of 2.9 million (from the CDC).  From this, one can easily calculate on a spreadsheet the number of person-days (through to April 30) of those vaccinated (i.e. starting at 120 days for those vaccinated as of December 31, and then counting down to zero for those vaccinated on April 30), and take the sum of this.  Dividing this total by the number of person-days in a year for the full US population (i.e. 332 million times 365) yields 3.7%.  Applying this share to the 2.9 million number of deaths in a pre-Covid year means that one would have expected 108,000 of those who had been vaccinated during this period to have died during this period for reasons that had nothing to do with Covid or the vaccines.  And the number dying of normal causes would likely be far higher than this 108,000, as that number is calculated assuming those being vaccinated during this period would have had the average mortality rate of the US as a whole.  But a disproportionate share of those vaccinated during this period were the elderly, as they were given priority, and the elderly will of course have naturally higher mortality rates than the population as a whole.  If one adjusted for the ages of those being vaccinated and then used age-specific mortality rates for these groups, the true number to expect would not be 108,000 but something far higher, and likely several times higher.]
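The person-days calculation described in the technical note can be sketched as below.  This is an illustration of the method rather than the spreadsheet itself:  the function name is made up, it takes the number newly fully vaccinated each day (rather than the cumulative totals the CDC reports), and the three-day series in the usage example is a toy with invented numbers, included only to show the counting-down logic.

```python
def expected_baseline_deaths(newly_vaccinated_by_day, population,
                             annual_deaths, days_in_year=365):
    """Expected deaths from ordinary causes among the vaccinated,
    assuming they have the average mortality rate of the population.

    newly_vaccinated_by_day[i] is the number of people newly fully
    vaccinated on day i.  The last entry is the end date (April 30 in
    the note above), so that day's cohort contributes zero person-days.
    """
    last_day = len(newly_vaccinated_by_day) - 1
    person_days = sum(count * (last_day - day)
                      for day, count in enumerate(newly_vaccinated_by_day))
    share_of_year = person_days / (population * days_in_year)
    return share_of_year * annual_deaths, share_of_year


# Toy three-day series with made-up numbers, not CDC data.
deaths, share = expected_baseline_deaths([1_000, 2_000, 3_000],
                                         population=1_000_000,
                                         annual_deaths=10_000)

# Headline check using the rounded figures quoted in the note.
headline = 0.037 * 2_900_000
```

Applying the rounded 3.7% share to 2.9 million deaths gives about 107,000, close to the 108,000 in the note (the small gap presumably reflects rounding of the 3.7%).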

It is not just the media, however.  A number of Republican politicians are saying the same.  And it is not only Republican politicians on the more extreme end of their party (such as Representative Marjorie Taylor Greene of Georgia, whose account Twitter has just suspended for 12 hours due to the misleading information she has posted on Covid-19 and the vaccines for it).  One also sees this among Republican office-holders who have been perceived as coming from the party’s establishment.  A prominent example is Senator Ron Johnson of Wisconsin.  In early May, for example, Senator Johnson also cited the VAERS as indicating “over 3,000” had died following their being vaccinated – implying causation.  And despite being called out on this by fact-checkers, Senator Johnson has continued to make these claims.  The fact-checkers at the Washington Post have recently given Senator Johnson their “highest” rating of four Pinocchios for his ongoing campaign of vaccine misinformation.

It is not clear, however, the extent to which vaccine hesitancy and/or outright opposition originated in the reports and statements of media figures such as Tucker Carlson or political figures such as Senator Ron Johnson, or whether the media and political figures found it advantageous to build on such perceptions and then spur along the concerns.  Which came first is not clear.

What is clear is that the issue of vaccination has become politicized, with vaccination being taken as a sign of political loyalties.  One saw the same politicization with the wearing of masks.  Comparing the share of state populations that have been vaccinated (fully or partially) to the share of the 2020 presidential vote in that state for Trump, one finds:

The correlation is incredibly high, with an R-squared of 0.77.  On average, the regression line indicates that for every additional percentage point in the vote share of Trump, the share of the population in the state that was fully or partially vaccinated (as of July 13) was 0.8 percentage points lower.  Furthermore, almost all of the states that voted for Biden have a higher share vaccinated than all of the states that voted for Trump (with the only significant exception being Georgia, and with a few states where the election was close – Arizona, Wisconsin, Michigan, and Nevada – having similar vaccination shares as some of the higher-end Trump states).
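For readers who want to reproduce this kind of calculation, the slope and R-squared of a simple regression can be computed as in the sketch below.  The function name is an illustrative choice, and the four data points are placeholders for imaginary states, not the actual state figures behind the chart.  One would substitute the 2020 vote shares and the CDC vaccination shares by state.

```python
def simple_regression(x, y):
    """Ordinary least squares of y on x.

    Returns (slope, intercept, r_squared).
    """
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    syy = sum((yi - mean_y) ** 2 for yi in y)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # With a single explanatory variable, R-squared is the squared
    # correlation between x and y.
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


# Placeholder values for four imaginary states, not actual data.
trump_share = [30.0, 45.0, 55.0, 65.0]
vaccinated_share = [70.0, 60.0, 50.0, 42.0]
slope, intercept, r_squared = simple_regression(trump_share, vaccinated_share)
```

The slope is the change in the vaccinated share (in percentage points) for each additional percentage point of vote share, which is how the figure of 0.8 percentage points lower should be read.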

E.  Conclusion

The political division is stark, and it is not clear what might change this.  But with the vaccines so highly effective – against cases of Covid-19, more so against severe cases requiring hospitalization, and even more so against death – releasing hard numbers on what the rates have been among the vaccinated versus the unvaccinated may help.  The numbers currently available might be imperfect, and hence require releases with three categories (vaccinated, unvaccinated, and not known), but this would still tell the story.  If everyone saw each day that 995 out of 1,000 deaths had been among the unvaccinated, with only 5 among those who had been vaccinated (as Dr. Walensky’s 99.5% figure implies), self-interest in one’s own health might eventually win out.

This is obviously urgent.  The chart at the top of this post was based on case data downloaded on July 13 (for Covid-19 cases as of July 12).  As one can see in that chart, the daily number of new cases of Covid-19 has been rising over the last month.  It reached a trough around June 20 (using a 7-day moving average of the daily cases).  By July 12, the number of cases had more than doubled from this trough.  As I write this on July 20, it has increased by more than a further 50% since July 12, so it is now more than triple what it was on June 20.  The virus is spreading especially rapidly in the states with a relatively low rate of vaccination.

This increase has been due, in part, to new mutations (in particular the delta variant) that spread more rapidly than the original form of the virus.  This is what one would expect from standard evolutionary theory – mutations develop and those that spread more easily will soon dominate.  Adding to this is that social distancing and mask mandates have been sharply eased over the last month, leading many to act as if the virus is no longer a threat.  While that would be true if everyone were vaccinated, and while the threat is greatly reduced for those who have been vaccinated, the disease will continue to spread and the threat will remain real as long as half the population is not vaccinated.

The Ridership Forecasts for the Baltimore-Washington SCMAGLEV Are Far Too High

The United States desperately needs better public transit.  While the lockdowns made necessary by the spread of the virus that causes Covid-19 led to sharp declines in transit use in 2020, with (so far) only a partial recovery, there will remain a need for transit to provide decent basic service in our metropolitan regions.  Lower-income workers are especially dependent on public transit, and many of them are, as we now see, the “essential workers” that society needs to function.  The Washington-Baltimore region is no exception.

Yet rather than focus on the basic nuts and bolts of ensuring quality services on our subways, buses, and trains, the State of Maryland is once again enamored with using the scarce resources available for public transit to build rail lines through our public parkland in order to serve a small elite.  The Purple Line light rail line was such a case.  Its dual rail lines will serve a narrow 16-mile corridor, passing through some of the richest zip codes in the nation, but destroying precious urban parkland.  As was discussed in an earlier post on this blog, with what will be spent on the Purple Line one could instead stop charging fares on the county-run bus services in the entirety of the two counties the Purple Line will pass through (Montgomery and Prince George’s), and at the same time double those bus services (i.e. double the lines, or double the service frequency, or some combination).

The administration of Governor Hogan of Maryland nonetheless pushed the Purple Line through, although construction has now been halted for close to a year due to cost overruns leading the primary construction contractor to withdraw.  Hogan’s administration is now promoting the building of a superconducting, magnetically-levitating, train (SCMAGLEV) between downtown Baltimore and downtown Washington, DC, with a stop at BWI Airport.  Over $35 million has already been spent, with a massive Draft Environmental Impact Statement (DEIS) produced.  As required by federal law, the DEIS has been made available for public comment, with comments due by May 24.

It is inevitable that such a project will lead to major, and permanent, environmental damage.  The SCMAGLEV would travel partially in tunnels underground, but also on elevated pylons parallel to the Baltimore-Washington Parkway (administered by the National Park Service).  The photos at the top of this post show what it would look like at one section of the parkway.  The question that needs to be addressed is whether any benefits will outweigh the costs (both environmental and other costs), and ridership is central to this.  If ridership is likely to be well less than that forecast, the whole case for the project collapses.  It will not cover its operating and maintenance costs, much less pay back even a portion of what will be spent to build it (up to $17 billion according to the DEIS, but likely to be far more based on experience with similar projects).  Nor would the purported economic benefits then follow.

I have copied below comments I submitted on the DEIS forecasts.  Readers may find them of interest as this project illustrates once again that despite millions of dollars being spent, the consulting firms producing such analyses can get some very basic things wrong.  The issue I focus on for the proposed SCMAGLEV is the ridership forecasts.  The SCMAGLEV project sponsors forecast that the SCMAGLEV will carry 24.9 million riders (one-way trips) in 2045.  The SCMAGLEV will require just 15 minutes to travel between downtown Baltimore and downtown Washington (with a stop at BWI), and is expected to charge a fare of $120 (roundtrip) on average and up to $160 at peak hours.  As one can already see from the fares, at best it would serve a narrow elite.

But there is already a high-speed train providing premier-level service between Baltimore and Washington – the Acela service of Amtrak.  It takes somewhat longer – 30 minutes currently – but its fare is also somewhat lower at $104 for a roundtrip, plus it operates from more convenient stations in Baltimore and Washington.  Importantly, it operates now, and we thus have a sound basis for forecasts of what its ridership might be in the future.

One can thus compare the forecast ridership on the proposed SCMAGLEV to the forecast for Acela ridership (also in the DEIS) in a scenario of no SCMAGLEV.  One would expect the forecasts to be broadly comparable.  One could allow that perhaps it might be somewhat higher on the SCMAGLEV, but probably less than twice as high and certainly less than three times as high.  But one can calculate from figures in the DEIS that the forecast SCMAGLEV ridership in 2045 would be 133 times higher than what they forecast Acela ridership would be in that year (in a scenario of no SCMAGLEV).  For those going just between downtown Baltimore and downtown Washington (i.e. excluding BWI travelers), the forecast SCMAGLEV ridership would be 154 times higher than what it would be on the comparable Acela.  This is absurd.

And it gets worse.  For reasons that are not clear, the base year figures for Acela ridership in the Baltimore-Washington market are more than eight times higher in the DEIS than figures that Amtrak itself has produced.  It is possible that the SCMAGLEV analysts included Acela riders who have boarded north of Baltimore (such as in Philadelphia or New York) and then traveled through to DC (or from DC would pass through Baltimore to ultimate destinations further north).  But such travelers should not be included, as the relevant travelers who might take the SCMAGLEV would only be those whose trips begin in either Baltimore or in Washington and end in the other metropolitan area.  The project sponsors have made no secret that they hope eventually to build a SCMAGLEV line the full distance between Washington and New York, but that would at a minimum be in the distant future.  It is not a source of riders included in their forecasts for a Baltimore to Washington SCMAGLEV.

The Amtrak forecasts of what it expects its Acela ridership would be, by market (including between Baltimore and Washington) and under various investment scenarios, come from its recent NEC FUTURE (for Northeast Corridor Future) study, for which it produced a Final Environmental Impact Statement.  Using Amtrak’s forecasts of what its Acela ridership would be in a scenario where major investments allowed the Acela to take just 20 minutes to go between Baltimore and Washington, the SCMAGLEV ridership forecasts were 727 times as high (in 2040).  That is complete nonsense.

My comment submitted on the DEIS, copied below, goes further into these results and discusses as well how the SCMAGLEV sponsors could have gotten their forecasts so absurdly wrong.  But the lesson here is that the consultants producing such forecasts are paid by project sponsors who wish to see the project built.  Thus they have little interest in even asking the question of why they have come up with an estimate that 24.9 million would take a SCMAGLEV in 2045 (requiring 15 minutes on the train itself to go between Baltimore and DC) while ridership on the Acela in that year (in a scenario where the Acela would require 5 minutes more, i.e. 20 minutes, and there is no SCMAGLEV) would be only about 34,000.

One saw similar issues with the Purple Line.  An examination of the ridership forecasts made for it found that in about half of the transit analysis zone pairs, the predicted ridership on all forms of public transit (buses, trains, and the Purple Line as well) was less than what they forecast it would be on the Purple Line only.  This is mathematically impossible.  And the fact that half were higher and half were lower suggests that the results they obtained were basically just random.  They also forecast that close to 20,000 would travel by the Purple Line into Bethesda each day but only about 10,000 would leave (which would lead to Bethesda’s population exploding, if true).  The source of this error was clear (they mixed up two formats for the trips – what is called the production/attraction format with origin/destination), but it mattered.  They concluded that the Purple Line had to be a rail line rather than a bus service in order to handle their predicted 20,000 riders each day on the segment to Bethesda.

It may not be surprising that private promoters of such projects would overlook such issues.  They may stand to gain (i.e. from the construction contracts, or from an increase in land values next to station sites), even though society as a whole loses.  Someone else (government) is paying.  But public officials in agencies such as the Maryland Department of Transportation should be looking at what is the best way to ensure quality and affordable transit services for the general public.  Problems develop once the officials see their role as promoters of some specific project.  They then seek to come up with a rationale to justify the project, and see their role as surmounting all the hurdles encountered along the way.  They are not asking whether this is the best use of scarce public resources to address our very real transit needs.

A high-speed magnetically-levitating train (with superconducting magnets, no less) may look attractive.  But officials should not assume such a shiny new toy will address our transit issues.


May 22, 2021

Comment Submitted on the DEIS for SCMAGLEV

The Ridership Forecasts Are Far Too High

A.  Introduction

I am opposed to the construction of the proposed SCMAGLEV project between Baltimore and Washington, DC.  A key issue for any such system is whether ridership will be high enough to compensate for the environmental damage that is inevitable with such a project.  But the ridership forecasts presented in the DEIS are hugely flawed.  They are far too high and simply do not meet basic conditions of plausibility.  At more plausible ridership levels, the case for such a project collapses.  It will not cover its operating costs, much less pay back any of the investment (of up to $17 billion according to the DEIS, but based on experience likely to be far higher).  Nor will the purported positive economic benefits then follow.  But the damage to the environment will be permanent.

Specifically, there is rail service now between Baltimore and Washington, at three levels of service (the high-speed Acela service of Amtrak, the regular Amtrak Regional service, and MARC).  Ridership on the Acela service, as it is now and with what is expected with upgrades in future years, provides a benchmark that can be used.  While it could be argued that ridership on the proposed SCMAGLEV would be higher than ridership on the Acela trains, the question is how much higher.  I will discuss below in more detail the factors to take into account in making such a comparison, but briefly, the Acela service takes 30 minutes today to go between Baltimore and Washington, while the SCMAGLEV would take 15 minutes.  But given that it also takes time to get to the station and on the train, and then to the ultimate destination at the other end, the time savings would be well less than 50%.  The fare would also be higher on the SCMAGLEV (at an average, according to the DEIS, of $120 for a round-trip ticket but up to $160 at peak hours, versus an average of $104 on the Acela).  In addition, the stations the SCMAGLEV would use for travel between downtown Baltimore and downtown Washington are less conveniently located (with poorer connections to local transit) than those the Acela uses.

Thus while it could be argued that the SCMAGLEV would attract more riders than the Acela, even this is not clear.  But being generous, one could allow that it might attract somewhat more riders.  The question is how many.  And this is where it becomes completely implausible.  Based on the ridership forecasts in the DEIS, for both the SCMAGLEV and for the Acela (in a scenario where the SCMAGLEV is not built), the SCMAGLEV in 2045 would carry 133 times what ridership would be on the Acela.  Excluding the BWI ridership on both, it would be 154 times higher.  There is no way to describe this other than that it is just nonsense.  And with other, likely more accurate, forecasts of what Acela ridership would be in the future (discussed below) the ratios become higher still.

Similarly, if the SCMAGLEV will be as attractive to MARC riders as the project sponsors forecast it will be, then most of those MARC riders would now be on the modestly less attractive Acela.  But they aren’t.  The Acela is 30 minutes faster than MARC (the SCMAGLEV would be 45 minutes faster), yet 28 times as many riders choose MARC over Acela between Baltimore and Washington.  I suspect the fare difference ($16 per day on MARC, vs. $104 on the Acela) plays an important role.  The model used could have been tested by using it to calculate what Acela ridership would be under current conditions, and then comparing this to the actual figures.  Evidently this was not done.  Had it been, the predicted Acela ridership would likely have been a high multiple of the actual and it would have been clear that the modeling framework has problems.

Why are the forecasts off by orders of magnitude?  Unfortunately, given what has been made available in the DEIS and with the accompanying papers on ridership, one cannot say for sure.  But from what has been made available, there are indications of where the modeling approach taken had issues.  I will discuss these below.

In the rest of this comment I will first discuss the use of Acela service and its ridership (both the actual now and as projected) as a basis for comparison to the ridership forecasts made for the SCMAGLEV.  They would be basically similar services, where a modest time saving on the SCMAGLEV (15 minutes now, but only 5 minutes in the future if further investments are made in the Acela service that would cut its Baltimore to DC time to just 20 minutes) is offset by a higher fare and less convenient station locations.  I will then discuss some reasons that might explain why the SCMAGLEV ridership forecasts are so hugely out-of-line with what plausible numbers might be.

B.  A Comparison of SCMAGLEV Ridership Forecasts to Those for Acela  

The DEIS provides ridership forecasts for the SCMAGLEV for both 2030 (several years after the DEIS says it would be opened, so ridership would then be stable after an initial ramping up) and for a horizon year of 2045.  I will focus here on the 2045 forecasts, and specifically on the alternative where the destination station in Baltimore is Camden Yards.  The DEIS also has forecasts for ridership in an alternative where the SCMAGLEV line would end in the less convenient Cherry Hill neighborhood of Baltimore, which is significantly further from downtown and with poorer connections to local transit options.  The Camden Yards station is more comparable to Penn Station – Baltimore, which the Acela (and Amtrak Regional trains and one of the MARC lines) use.  Penn Station – Baltimore has better local transit connections and would be more convenient for many potential riders, but this will of course depend on the particular circumstances of the rider – where he or she will be starting from and where their particular destination will be.  It will, in particular, be more convenient for riders coming from North and Northeast of Baltimore than Camden Yards would be.  And those from South and Southwest of Baltimore would be more likely to drive directly to the DC region than try to reach Camden Yards, or they would alight at BWI.

The DEIS also provides forecasts of what ridership would be on the existing train services between Baltimore and Washington:  the Acela services (operated by Amtrak), the regular Amtrak Regional trains, and the MARC commuter service operated by the State of Maryland.  Note also that the 2045 forecasts for the train services are for both a scenario where the SCMAGLEV is not built and then what they forecast the reduced ridership would be with a SCMAGLEV option.  For the purposes here, what is of interest is the scenario with no SCMAGLEV.

The SCMAGLEV would provide a premium service, requiring 15 minutes to go between downtown Baltimore and downtown Washington, DC.  Acela also provides a premium service and currently takes 30 minutes, while the regular Amtrak Regional trains take 40 to 45 minutes and MARC service takes 60 minutes.  But the fares differ substantially.  Using the DEIS figures (with all prices and fares expressed in base year 2018 dollars), the SCMAGLEV would charge an average fare of $120 for a round-trip (Baltimore-Washington), and up to $160 for a roundtrip at peak times.  The Acela also charges a high fare for its premium service, although not as high as the SCMAGLEV would, at an average of $104 for a roundtrip (using the DEIS figures).  But Amtrak Regional trains charge only $34 for a similar roundtrip, and MARC only $16.

Acela service thus provides a reasonable basis for comparison to what SCMAGLEV would provide, with the great advantage that we know now what Acela ridership has actually been.  This provides a firm base for a forecast of what Acela ridership would be in a future year in a scenario where the SCMAGLEV is not built.  And while the ridership on the two would not be exactly the same, one should expect them to be in the same ballpark.

But they are far from that:

  DEIS Forecasts of SCMAGLEV vs. Acela Ridership, Annual Trips in 2045

Market                 SCMAGLEV Trips relative to Acela Trips
Baltimore – DC only    154 times as much
All, including BWI     133 times as much

Sources:  DEIS, Main Report Table 4.2-3; and Table D-4-48 of Appendix D.4 of the DEIS

Using estimates just from the DEIS, the project sponsor is forecasting that annual (one-way) trips on the SCMAGLEV in 2045 would be 133 times what they would be in that year on the Acela (in a scenario where the SCMAGLEV is not built).  And it would be 154 times as much for the Baltimore – Washington riders only.  This is nonsense.  One could have a reasonable debate if the SCMAGLEV figures were twice as high, and maybe even if they were three times as high.  But it is absurd that they would be 133 or 154 times as high.

And it gets worse.  The figures above are all taken from the DEIS.  But the base year Acela ridership figures in the DEIS (Appendix D.4, Table D.4-45) differ substantially from figures Amtrak itself has produced in its recent NEC FUTURE study.  This review of future investment options in Northeast Corridor (Washington to Boston) Amtrak service was concluded in July 2017.  As part of this it provided forecasts of what future Acela ridership would be under various alternatives, including one (its Alternative 3) where Acela trains would be substantially upgraded and require just 20 minutes for the trip between downtown Baltimore and downtown Washington, DC.  This would be quite similar to what SCMAGLEV service would be.

But for reasons that are not clear, the base year figures for Acela ridership between Baltimore and Washington differ substantially between what the SCMAGLEV DEIS has and what NEC FUTURE has.  The figure in the NEC FUTURE study (for a base year of 2013) puts the number of riders (one-way) between Baltimore and Washington (and not counting those who boarded north of Baltimore, at Philadelphia or New York for example, and then rode through to Washington, and similarly for those going from Washington to Baltimore) at just 17,595.  The DEIS for the SCMAGLEV put the similar Acela ridership (for a base year of 2017) at 147,831 (calculated from Table D.4-45, of Appendix D.4).  While the base years differ (2013 vs. 2017), the disparity cannot be explained by that.  It is far too large.  My guess would be that the DEIS counted all Acela travelers taking up seats between Baltimore and Washington, including those who boarded north of Baltimore (or whose destination from Washington was north of Baltimore), and not just those travelers traveling solely between Washington and Baltimore.  But the SCMAGLEV will be serving only the Baltimore-Washington market, with no interconnections with the train routes coming from north of Baltimore.

What was the source of the Acela ridership figure in the DEIS of 147,831 in 2017?  That is not clear.  Table D.4-45 of Appendix D.4 says that its source is Table 3-10 of the “SCMAGLEV Final Ridership Report”, dated November 8, 2018.  But that report, which is available along with the other DEIS reports (with a direct link at https://bwmaglev.info/index.php/component/jdownloads/?task=download.send&id=71&catid=6&m=0&Itemid=101), does not have a Table 3-10.  Significant portions of that report were redacted, but in its Table of Contents no reference is shown to a Table 3-10 (even though other redacted tables, such as Tables 5-2 and 6-3, are still referenced in the Table of Contents, but labeled as redacted).

One can only speculate on why there is no Table 3-10 in the Final Ridership Report.  Perhaps it was deleted when someone discovered that the figures reported there, which were then later used as part of the database for the ridership forecast models, were grossly out of line with the Amtrak figures.  The Amtrak figure for Acela ridership for Baltimore-Washington passengers of 17,595 (in 2013) is less than one-eighth of the figure on Acela ridership shown in the DEIS of 147,831 (in 2017).
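The size of this gap can be checked directly from the two base-year figures quoted above.  What follows is only a minimal sketch in Python: the variable names are my own, and the only inputs are the two ridership numbers given in the text.

```python
# Base-year Acela ridership (one-way trips, Baltimore-Washington), as quoted in the text.
deis_acela_2017 = 147_831       # SCMAGLEV DEIS, calculated from Table D.4-45 of Appendix D.4
nec_future_acela_2013 = 17_595  # Amtrak NEC FUTURE study

# How many times larger the DEIS figure is than the Amtrak figure.
ratio = deis_acela_2017 / nec_future_acela_2013

print(f"DEIS figure is {ratio:.1f} times the Amtrak figure")
print(f"Amtrak figure is {1 / ratio:.3f} of the DEIS figure (one-eighth is {1 / 8:.3f})")
```

The ratio comes out at about 8.4, consistent with the Amtrak figure being less than one-eighth of the DEIS figure.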

It can be difficult for an outsider to know how many of those riding on the Acela between Washington and Baltimore are passengers going just between those two cities (as well as BWI).  Most of the passengers riding on that segment will be going on to (or coming from) cities further north.  One would need access to ticket sales data.  But it is reasonable to assume that Amtrak itself would know this, and therefore that the figures in the NEC FUTURE study would likely be accurate.  Furthermore, in the forecast horizon years, where Amtrak is trying to show what Acela (and other rail) ridership would grow to with alternative investment programs, it is reasonable to assume that Amtrak would provide relatively optimistic (i.e. higher) estimates, as higher estimates are more likely to convince Congress to provide the funding that would be required for such investments.

The Amtrak figures would in any case provide a suitable comparison to what SCMAGLEV’s future ridership might be.  The Amtrak forecasts are for 2040, so for the SCMAGLEV forecasts I interpolated to produce an estimate for 2040 assuming a constant rate of growth between the forecast SCMAGLEV ridership in 2030 and that for 2045.  Both the NEC FUTURE and SCMAGLEV figures include the stop at BWI.

    Forecasts of SCMAGLEV (DEIS) vs. Acela (NEC FUTURE) Ridership between Baltimore and Washington, Annual Trips in 2040

NEC FUTURE scenario    SCMAGLEV Trips relative to Acela Trips
No Action              870 times as much
Alternative 1          850 times as much
Alternative 2          780 times as much
Alternative 3          727 times as much

Sources:  SCMAGLEV trips interpolated from figures on forecast ridership in 2030 and 2045 (Camden Yards) in Table 4.2-3 of DEIS.  Acela trips from NEC FUTURE Final EIS, Volume 2, Appendix B.08.
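The interpolation behind the SCMAGLEV figures for 2040 assumes a constant rate of growth between the 2030 and 2045 forecasts.  A minimal sketch of that calculation in Python follows.  The function name is my own, and the inputs in the example are purely illustrative placeholders, not the DEIS ridership figures.

```python
def interpolate_constant_growth(year_a, value_a, year_b, value_b, target_year):
    """Interpolate between two forecasts assuming a constant annual growth rate."""
    if value_a <= 0 or value_b <= 0:
        raise ValueError("values must be positive for constant-growth interpolation")
    if year_a == year_b:
        raise ValueError("the two forecast years must differ")
    # Annual growth factor implied by the two endpoint forecasts.
    annual_growth = (value_b / value_a) ** (1.0 / (year_b - year_a))
    # Compound that factor from the first forecast year to the target year.
    return value_a * annual_growth ** (target_year - year_a)


# Illustrative inputs only: these are NOT the DEIS ridership figures.
example_2040 = interpolate_constant_growth(2030, 1.0, 2045, 8.0, 2040)
print(round(example_2040, 6))
```

With these placeholder inputs (an eight-fold increase over 15 years), the value ten years in is 8 raised to the power 2/3, or 4.  A straight-line interpolation would instead give about 5.67, so the constant-growth assumption yields the lower 2040 estimate whenever ridership is forecast to rise.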

The Acela ridership figures are those estimated under various investment scenarios in the rail service in the Northeast Corridor.  NEC FUTURE examined a “No Action” scenario with just minimal investments, and then various alternative investment levels to produce increasingly capable services.  Alternative 3 (of which there were four sub-variants, but all addressing alternative investments between New York and Boston and thus not affecting directly the Washington-Baltimore route) would upgrade Acela service to the extent that it would go between Baltimore and Washington in just 20 minutes.  This would be very close to the 15 minutes for the SCMAGLEV.  Yet even with such a comparable service, the SCMAGLEV DEIS is forecasting that its service would carry 727 times as many riders as what Amtrak has forecast for its Acela service (in a scenario where there is no SCMAGLEV).  This is complete nonsense.

To be clear, I would stress again that the forecast future Acela ridership figures are scenarios under various possible investment programs by Amtrak.  The investment program in Alternative 3 would upgrade Acela service to a degree where the Baltimore – Washington trip (with a stop at BWI) would take just 20 minutes.  The NEC FUTURE study forecasts that in such a scenario the Baltimore-Washington ridership on Acela would total a bit over 31,000 trips in the year 2040.  In contrast, the DEIS for the SCMAGLEV forecasts that there would in that year be close to 23 million trips taken on the similar SCMAGLEV service, requiring 15 minutes to make such a trip.  Such a disparity makes no sense.

C.  How Could the Forecasts be so Wrong?

A well-known consulting firm, Louis Berger, prepared the ridership forecasts, and their “Final Ridership Report” dated November 8, 2018, referenced above, provides an overview of the approach they took.  Unfortunately, while I appreciate that the project sponsor provided a link to this report along with the rest of the DEIS (I had asked for this, having seen references to it in the DEIS), the report that was posted had significant sections redacted.  Due to those redactions, and possibly also limitations in what the full report itself might have included (such as summaries of the underlying data), it is impossible to say for sure why the forecasts of SCMAGLEV ridership were close to three orders of magnitude greater than what ridership has been and is expected to be on comparable Acela service.
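To make "close to three orders of magnitude" concrete, one can take the base-10 logarithm of the ratios reported earlier in this comment.  This is just a quick check in Python: the labels are my own shorthand, and the ratios are the ones already stated above.

```python
import math

# Ratios of forecast SCMAGLEV ridership to forecast Acela ridership, as stated in the text.
ratios = {
    "DEIS 2045, all including BWI": 133,
    "DEIS 2045, Baltimore-DC only": 154,
    "NEC FUTURE 2040, Alternative 3": 727,
    "NEC FUTURE 2040, No Action": 870,
}

# An order of magnitude is a factor of ten, so log10 of the ratio counts them.
orders = {label: math.log10(r) for label, r in ratios.items()}
for label, value in orders.items():
    print(f"{label}: {value:.2f} orders of magnitude")
```

The comparisons against Amtrak's own forecasts come out at just under three orders of magnitude, and those using the DEIS figures for Acela at a bit over two.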

Thus I can only speculate.  But there are several indications of what may have led the SCMAGLEV estimates to be so out of line with ridership on a service that is at least broadly comparable.  Specifically:

1)  As noted above, there were apparent problems in assembling existing data on rail ridership for the Baltimore-Washington market, in particular for the Acela.  The ridership numbers for the Acela in the DEIS were more than eight times higher in their base year (2017) than what Amtrak had in an only slightly earlier base year (2013).  The ridership numbers on Amtrak Regional trains (for Baltimore-Washington riders) were closer but still substantially different:  409,671 in Table D.4-45 of the DEIS (for 2017), vs. 172,151 in NEC FUTURE (for 2013).

Table D.4-45 states that its source for this data on rail ridership is a Table 3-10 in the Final Ridership Report of November 8, 2018.  But as noted previously, such a table is not there – it was either never there or it was redacted.  Thus it is impossible to determine why their figures differ so much from those of Amtrak.  But the differences for the Acela figures (more than a factor of eight) are huge, i.e. close to an order of magnitude by themselves.  While it is impossible to say for sure, my guess (as noted above) is that the Acela ridership numbers in the DEIS included travelers whose trip began, or would end, in destinations north of Baltimore, who then traveled through Baltimore on their way to, or from, Washington, DC.  But such travelers are not part of the market the SCMAGLEV would serve.

2)  In modeling the choice those traveling between Baltimore and Washington would have between SCMAGLEV and alternatives, the analysts collapsed all the train options (Acela, Amtrak Regional, and MARC) into one.  See page 61 of the Ridership Report.  They create a weighted average for a single “train” alternative, and they note that since (in their figures) MARC ridership makes up almost 90% of the rail market, the weighted averages for travel time and the fare will be essentially that of MARC.

Thus they never looked at Acela as an alternative, with a service level not far from that of SCMAGLEV.  Nor do they even consider the question of why so many MARC riders (67.5% of MARC riders in 2045 if the Camden Yards option is chosen – see page D-56 of Appendix D.4 of the DEIS) are forecast to divert to the SCMAGLEV, but are not doing so now (nor in the future) to Acela.  According to Table D.4-45 of Appendix D.4 of the DEIS, in their data for their 2017 base year, there are 28 times as many MARC riders as on Acela between downtown Baltimore and downtown Washington, and 20 times as many with those going to and from the BWI stop included.  Evidently, MARC riders do not find the Acela option attractive.  Why should they then find the SCMAGLEV train attractive?

3)  The answer as to why MARC riders have not chosen to ride on the Acela almost certainly has something to do with the difference in the fares.  A round-trip on MARC costs $16 a day.  A round trip on Acela costs, according to the DEIS, an average of $104 a day.  That is not a small difference.  For someone commuting 5 days a week and 50 weeks a year (or 250 days a year), the annual cost on MARC would be $4,000 but $26,000 a year on the Acela.  And it would be an even higher $30,000 a year on the SCMAGLEV (based on an average fare of $120 for a round trip), and $40,000 a year ($160 a day) at peak hours (which would cover the times commuters would normally use).  Even for those moderately well off, $40,000 a year for commuting would be a significant expense, and not an attractive alternative to MARC with its cost of just one-tenth of this.
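The annual figures above follow from multiplying each daily round-trip fare by 250 commuting days.  A minimal sketch of the arithmetic in Python, using only the fares quoted from the DEIS (the variable names are my own):

```python
COMMUTING_DAYS_PER_YEAR = 5 * 50  # 5 days a week, 50 weeks a year

# Round-trip fares per day in dollars, as quoted from the DEIS in the text.
round_trip_fares = {
    "MARC": 16,
    "Acela": 104,
    "SCMAGLEV (average)": 120,
    "SCMAGLEV (peak)": 160,
}

# Annual commuting cost for each service.
annual_costs = {
    service: fare * COMMUTING_DAYS_PER_YEAR
    for service, fare in round_trip_fares.items()
}
for service, cost in annual_costs.items():
    print(f"{service}: ${cost:,} a year")

# MARC's annual cost as a share of the peak-hour SCMAGLEV cost.
marc_share_of_peak = annual_costs["MARC"] / annual_costs["SCMAGLEV (peak)"]
print(f"MARC is {marc_share_of_peak:.0%} of the peak SCMAGLEV cost")
```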

If such costs were properly taken into account in the forecasting model, why did it nonetheless predict that most MARC riders would switch to the SCMAGLEV?  This is not fully clear as the model details were not presented in the redacted report, but note that the modelers assigned high dollar amounts for the value of time ($31.00 to $46.50 per hour for commuters and other non-business travel, and $50.60 to $75.80 per hour for business travel – see page 53 of the Ridership Report).  However, even at such high values, the numbers do not appear to be consistent.  Taking a SCMAGLEV (15 minute trip) rather than MARC (60 minutes) would save 45 minutes each way or 1 1/2 hours a day.  Only at the very high end value of time for business travelers (of $75.80 per hour, or $113.70 for 1 1/2 hours) would this value of time offset the fare difference of $104 (using the average SCMAGLEV fare of $120 minus the MARC fare of $16).  And even that would not suffice for travelers at peak hours (with its SCMAGLEV fare of $160).
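This comparison can be laid out for all four of the value-of-time figures.  The sketch below, in Python, takes the model's premise at face value, i.e. that every minute saved is worth the full hourly rate, purely to test its internal consistency.  The labels and variable names are my own; the dollar figures are those quoted above.

```python
HOURS_SAVED_PER_DAY = 2 * 45 / 60  # 45 minutes each way relative to MARC

# Extra daily cost of the SCMAGLEV over the $16 MARC round trip.
extra_fare = {"average": 120 - 16, "peak": 160 - 16}

# Hourly values of time from page 53 of the Ridership Report, as quoted in the text.
values_of_time = {
    "non-business, low": 31.00,
    "non-business, high": 46.50,
    "business, low": 50.60,
    "business, high": 75.80,
}

# Dollar value of the daily time saving at each hourly rate.
time_savings_value = {k: v * HOURS_SAVED_PER_DAY for k, v in values_of_time.items()}
for label, saved in time_savings_value.items():
    covers_average = saved >= extra_fare["average"]
    covers_peak = saved >= extra_fare["peak"]
    print(f"{label}: ${saved:.2f} a day; covers average gap: {covers_average}; peak gap: {covers_peak}")

# Hourly value of time at which the time saving would just offset the extra fare.
break_even = {k: gap / HOURS_SAVED_PER_DAY for k, gap in extra_fare.items()}
for label, rate in break_even.items():
    print(f"break-even value of time, {label} fare: ${rate:.2f} per hour")
```

On these numbers the break-even value of time is about $69 an hour at the average fare and $96 an hour at the peak fare, so only the top business rate clears the former and none of the four clears the latter.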

But there is also a more basic problem.  It is wrong to assume that travelers on MARC treat their 60 minutes on the train as all wasted time.  They can read, do some work, check their emails, get some sleep, or plan their day.  The presumption that they would pay amounts similar to what some might on average earn in an hour based on their annual salaries is simply incorrect.  And as noted above, if it were correct, then one would see many more riders on the Acela than one does (and similarly riders on the Amtrak Regional trains, that require about 40 minutes for the Washington to Baltimore trip, with an average fare of $34 for a round trip).

There is a similar issue for those who drive.  Those who drive do not place a value on the time spent in their cars equal to what they would earn in an hourly equivalent of their regular salary.  They may well want to avoid traffic jams, which are stressful and frustrating for other reasons, but numerous studies have found that a simple value-of-time calculation based on annual salaries does not explain why so many commuters choose to drive.

4)  Data for the forecasting model also came in part from two personal surveys.  One was an in-person survey of travelers encountered on MARC, at either the MARC BWI Station or onboard Penn Line trains, or at BWI airport.  The other was an online internet survey, where they unfortunately redacted out how they chose possible respondents.

But such surveys are unreliable, with answers that depend critically on how the questions are phrased.  The Final Ridership report does not include the questionnaire itself (most such reports would), so one cannot know what bias there might have been in how the questions were worded.  As an example (and admittedly an exaggerated example, to make the point), were the MARC riders simply asked whether they would prefer a much faster, 15 minute, trip?  Or were they asked whether they would pay an extra $104 per day ($144 at peak hours) to ride a service that would save them 45 minutes each way on the train?

But even such willingness to pay questions are notoriously unreliable.  An appropriate follow-up question to a MARC rider saying they would be willing to pay up to an extra $144 a day to ride a SCMAGLEV, would be why are they evidently not now riding the Acela (at an extra $88 a day) for a ride just 15 minutes longer than what it would be on the SCMAGLEV.

One therefore has to be careful in interpreting and using the results from such a survey in forecasting how travelers would behave.  If current choices (e.g. using the MARC rather than the Acela) do not reflect the responses provided, one should be concerned.

5)  Finally, the particular mathematical form used to model the choices the future travelers would make can make a big difference to the findings.  The Final Ridership Report briefly explains (page 53) that it used a multinomial logit model as the basis for its modeling.  Logit functions assign a continuous probability (starting from 0 and rising to 100%) of some event occurring.  In this model, the event is that a traveler going from one travel zone to another will choose to travel via the SCMAGLEV, or not.  The likelihood of choosing to travel via the SCMAGLEV will be depicted as an S-shaped function, starting at zero and then smoothly rising (following the S-shape) until it reaches 100%, depending on, among other factors, what the travel time savings might be.

The results that such a model will predict will depend critically, of course, on the particular parameters chosen.  But the heavily redacted Final Ridership Report does not show what those parameters were nor how they were chosen or possibly estimated, nor even the complete set of variables used in that function.  The report says little (in what remains after the redactions) beyond that they used that functional form.

A feature of such logit models is that while the choices are discrete (one either will ride the SCMAGLEV or will not), they allow for “fuzziness” around the turning points.  This recognizes that even when individuals confront a similar combination of variables (a combination of cost, travel time, and other measured attributes), some will simply prefer to drive while some will prefer to take the train.  That is how people are.  But then, while a higher share might prefer to take a train (or the SCMAGLEV) when travel times fall (by close to 45 minutes with the SCMAGLEV when compared to their single “train” option that is 90% MARC, and by variable amounts for those who drive depending on the travel zone pairs), how much higher that share will be will depend on the parameters they selected for their logit.

With certain parameters, the responses can be sensitive to even small reductions in travel times, and the predicted resulting shifts then large.  But are those parameters reasonable?  As noted previously, a test would have been whether the model, with the parameters chosen, would have predicted accurately the number of riders actually observed on the Acela trains in the base year.  But it does not appear such a test was done.  At least no such results were reported to test whether the model was validated or not.
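
To illustrate how much the choice of parameters can matter, here is a minimal sketch of a two-option (binary) logit.  The report's actual model was multinomial and its parameters were redacted, so every coefficient below is a hypothetical value I have made up purely for illustration; only the 90 minutes saved per day and the $104 extra fare come from the discussion above.

```python
import math

def logit_share(minutes_saved, extra_fare, beta_time, beta_cost, constant=0.0):
    """Binary logit: predicted share choosing the faster but costlier option."""
    utility_gap = constant + beta_time * minutes_saved - beta_cost * extra_fare
    return 1.0 / (1.0 + math.exp(-utility_gap))

MINUTES_SAVED_PER_DAY = 90   # 45 minutes each way
EXTRA_FARE_PER_DAY = 104     # average SCMAGLEV fare minus MARC fare

# Hypothetical coefficients (NOT from the Ridership Report).
BETA_COST = 0.05             # disutility per dollar
BETA_TIME_LOW = 0.02         # implies a value of time of 0.02 / 0.05 * 60 = $24 per hour
BETA_TIME_HIGH = 0.07        # implies a value of time of 0.07 / 0.05 * 60 = $84 per hour

share_low = logit_share(MINUTES_SAVED_PER_DAY, EXTRA_FARE_PER_DAY, BETA_TIME_LOW, BETA_COST)
share_high = logit_share(MINUTES_SAVED_PER_DAY, EXTRA_FARE_PER_DAY, BETA_TIME_HIGH, BETA_COST)

print(f"Predicted share switching, low time coefficient:  {share_low:.1%}")
print(f"Predicted share switching, high time coefficient: {share_high:.1%}")
```

With the same fares and the same time savings, the predicted share that switches goes from a few percent to a large majority depending solely on the assumed time coefficient.  That is why validating the chosen parameters against observed Acela ridership would have been such a useful test.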

Thus there are a number of possible reasons why the forecast ridership on the SCMAGLEV differs so much from what one currently observes for ridership on the Acela, and from what one might reasonably expect Acela ridership to be in the future.  It is not possible to say whether these are indeed the reasons why the SCMAGLEV forecasts are so incredibly out of line with what one observes for the Acela.  There may be, and indeed likely are, other reasons as well.  But due to issues such as those outlined here, one can understand the possible factors behind SCMAGLEV ridership forecasts that deviate so markedly from plausibility.

D.  Conclusion

The ridership forecasts for the SCMAGLEV are vastly over-estimated.  Predicted ridership on the SCMAGLEV is a minimum of two, and up to three, orders of magnitude higher than what has been observed on, and can reasonably be forecast for, the Acela.  One should not be getting predicted ridership that is more than 100 times what one observes on a comparable, existing (and thus knowable), service.

With ridership on the proposed system far less than what the project sponsors have forecast, the case for building the SCMAGLEV collapses.  Operational and maintenance costs would not be covered, much less any possibility of paying back a portion of the billions of dollars spent to build it, nor will the purported economic benefits follow.

However, the harm to the environment will have been done.  Even if the system is then shut down (due to the forecast ridership never materializing), it will not be possible to reverse much of that environmental damage.

The US very much needs to improve its public transit.  It is far too difficult, with resulting harm both to the economy and to the population, to move around in the Baltimore-Washington region.  But fixing this will require a focus on the basic nuts and bolts of operating, maintaining, and investing in the transit systems we have, including the trains and buses.  This might not look as attractive as a magnetically levitating train, but will be of benefit.  And it will be of benefit to the general public – in particular to those who rely on public transit – and not just to a narrow elite that can afford $120 fares.  Money for public transit is scarce.  It should not be wasted on shiny new toys.

Jobs Due to Biden’s Infrastructure Plan: What is Being Discussed is Not What You Think

A.  Introduction

Politicians have always been eager to announce that a program they have proposed will “create jobs”.  The Biden administration is no exception.  Indeed, President Biden has titled his $2.2 trillion proposal to rebuild America’s infrastructure the “American Jobs Plan”.  And all this is understandable, given the politics.  You would be forgiven, however, for assuming that what is being discussed on the additional jobs that would follow from Biden’s infrastructure proposals has something to do with jobs such as those depicted in the picture above.  They don’t.  The numbers on “new jobs created” that are being bandied about are on something else entirely.

There has also been some confusion on how many jobs that might be.  In remarks made on April 2, soon after his initial announcement of the proposed $2.2 trillion infrastructure initiative, Biden said:  “Independent analysis shows that if we pass this plan, the economy will create 19 million jobs — good jobs, blue-collar jobs, jobs that pay well.”  The estimate is from an analysis made by Mark Zandi, Chief Economist of Moody’s Analytics (a subsidiary of Moody’s, the bond credit rating agency).  Zandi is a well-respected economist, who was an economic advisor to John McCain during his 2008 campaign for the presidency and who has advised both Democrats and Republicans.

The 19 million jobs figure is an estimate made by Zandi and his team at Moody’s Analytics of how many more jobs there would be in the US (or, more precisely, non-farm employees) in 2030 as compared to the average number in 2020, in a scenario where Biden’s infrastructure plan is approved as proposed and then implemented.  But it is important to note that this is an estimate of the total number of jobs that “the economy will create” over the decade if the plan is passed (which is what Biden specifically said), and not an estimate of the extra number of jobs that can be attributed to the American Jobs Plan itself.  But it would be easy to miss this distinction.  The Moody’s Analytics estimates are that the number of jobs in the economy would rise between 2020 and 2030 by 19.0 million if the plan is passed as proposed, but by 16.3 million if only the covid-relief plan (Biden’s $1.9 trillion American Rescue Plan) is passed (as it has been), and by 15.7 million in a scenario where neither plan was passed.  Thus in the Moody’s Analytics forecasts, the number of jobs in 2030 would be 2.7 million higher than otherwise if the infrastructure plan is now passed (on top of the extra 0.6 million if only the covid-relief plan were passed).

But it is easy to misstate these distinctions, and some of the administration appointees discussing the proposal with the press at first did so.  In particular, Pete Buttigieg, the Transportation Secretary, and Brian Deese, the head of the National Economic Council in the White House, at first used wording that implied that the full 19 million additional jobs would be due to the infrastructure plan itself.  They later clarified that they had misspoken, and that the Moody’s Analytics estimates were of 2.7 million additional jobs due to the infrastructure plan.  However, this did not keep various news media fact-checkers (including at CNN and at the Washington Post) from taking them to task on it (and for the Washington Post to award Biden “two Pinocchios” in their fact-checking scoring system for being, in their view, misleading).

One can question whether this is quibbling over language that was not fully clear.  But what is of far greater importance is that it misses the fundamental question of what any of these employment forecasts (whether of 19 million, or 2.7 million, or 0.6 million from the $1.9 trillion covid-relief plan) actually mean.  Keep in mind that they are all estimates of how many more people will be employed in 2030 compared to the number employed in 2020, or in a comparison of one scenario for 2030 compared to another.  They are specifically not estimates of the number of jobs of primarily construction workers who would be employed as a direct result of the new infrastructure investments being built.  Yet the wording of Biden, stating that these would be well-paying blue-collar jobs, would appear to indicate that that is what he had in mind when citing the figures.

Furthermore, if the job figures were intended to refer to the blue-collar construction workers who would be hired to build these projects, it does not make much sense to base a comparison on 2030.  By that point the infrastructure plan would be essentially over, with just a small residual amount still to be spent as the program is tailing off (of the $2.2 trillion total, just $81 billion in 2030 and a final $35 billion in 2031 would remain to be spent in the Moody’s estimates).  Few construction workers would still be employed on those projects by that point.  Rather, what may be of interest is not some relatively small change in the overall number of people employed at some end-point, but rather the number of person-years of employment of such workers during the full period of the infrastructure plan.  But the Moody’s estimates are specifically not that.

This then brings up the question of what is Moody’s in fact estimating?  That will be the focus of this blog post.  It is not the number of jobs in construction that will be created as a result of the new work on infrastructure, as these will be down to a fairly minor level by 2030.  As we will see, it is rather an estimate resulting from some secondary aspects of the Moody’s model, and it is not even clear whether the differences were intended to be meaningful.

To start, this post will review how estimates of future employment are traditionally made – for example by the Bureau of Labor Statistics (BLS).  In brief, they are based on population estimates and on forecasts of what share of different population groups will seek to be part of the labor force (the labor force participation rates), with then the assumption that the economy will be at full employment at that future date.  The full employment assumption is made not because the forecaster is confident the economy will in fact be at full employment in that forecast year.  Rather, they do not really know what the short-term conditions will be in that future year, and assuming full employment is just for setting a benchmark.  Unemployment depends on how successful monetary and fiscal policies would have been in that future year to bring the economy to full employment.  Such policies are short-term, depend on the immediate situation, and we have no way of knowing now (in 2021) what shocks or surprises the economy will be facing in 2030.

With this the case, why is Moody’s forecasting any difference at all in the 2030 employment numbers?  The differences are in fact not large when compared to what overall employment will be in that year.  But there are some, and we will discuss why that is.

The post will then look at what one might say on jobs in the intervening years.  While Moody’s has produced year-by-year estimates, its approach for those years (after the next couple of years, as they forecast the economy moves to full employment) is fundamentally similar to what they assume for 2030.  What Moody’s specifically did not do in its analysis was try to estimate the direct number of jobs (or more precisely, person-years of employment) of those employed on the infrastructure projects in Biden’s plan.  Someone will likely do that at some point, but it was not done here.  The question I will then look at is whether this should be seen as “job creation”.  I will argue that it would be more appropriate to look at it as job shifting rather than job creation, as the total number of jobs in the economy (the number employed) will likely not be all that much different.  And there is nothing wrong with that.  The primary objective, after all, is to build and maintain our badly needed infrastructure.  And on the employment that would follow, providing more attractive jobs that workers will seek to shift into is a good thing.  But the total number employed may not change, and if that is the metric one tries to use, one will likely be disappointed.  Many, including politicians, are often confused about this.

None of this should be taken to imply that the infrastructure plan is not warranted.  It desperately is, as will be discussed in the penultimate section of this post.  The US has underinvested in public infrastructure for decades, and what we have is an embarrassment compared to what is seen in Europe or East Asia.  And it has direct implications for productivity.  Truck drivers are not productive when they are sitting in traffic jams due to our poor highways.  But it is wrong to assess the value of an infrastructure investment program by some estimate of the number of jobs created.  Yes, there will be workers employed on the projects, in likely well-paid jobs.  But that should not be the objective – better public infrastructure should be the objective, achieved as efficiently as possible.  A focus on “jobs created” is instead likely to lead to confusion, as it has with the Moody’s numbers.

We will then end with a short summary and conclusions section.

Finally, note that the version of Biden’s infrastructure plan examined by Zandi and his team was estimated to cost $2.2 trillion over ten years.  However, one will see references to Biden’s plan as costing $2.0 trillion, or $2.3 trillion, or some other amount.  The final amount will depend, of course, on whatever Congress approves, but for consistency I will focus here on the plan as assessed by Zandi, at an estimated cost of $2.2 trillion.

B.  Forecasting Future Employment Levels

Yogi Berra purportedly said:  “It’s tough to make predictions, especially about the future”.  Whether he actually said that is not so clear, but it is certainly true.  And this is especially true of predictions of future employment.  But some things are more predictable than others, and the trick is to make use of factors that change only slowly over time.

In particular, population forecasts for periods of a decade or so are relatively reliable.  Those in a particular age bracket now will be ten years older a decade from now, and all one needs then to adjust for are mortality rates (which are known and change only slowly over time) and net migration rates (which are relatively small in magnitude).  Thus the Census Bureau can produce fairly reliable population forecasts for periods of a decade, and can provide these for groups broken down by age bracket as well as sex, race, and ethnicity.

The Bureau of Labor Statistics starts from such Census Bureau forecasts to produce its projections of the labor force and employment.  The BLS does this annually, with the most recent such projections from September 2020 covering the period 2019 to 2029.  The BLS takes the Census Bureau forecasts for the adult population (age 16 and above), with these broken up into age groups (mostly 10-year groups, i.e. aged 25 to 34, 35 to 44, etc.) and by sex, with overriding checks based on race (white, black, other) and ethnic (Hispanic and non-Hispanic) classifications.  For each of these groups, it estimates, based on a statistical analysis of historical trends, what its labor force participation rate can be expected to be in the projection year.  The labor force participation rate is the share of the population within each group who choose to be part of the labor force (i.e. either employed or, if unemployed, seeking a job).  Labor force participation rates change only slowly over time (as was discussed in this earlier post on this blog), so this is a reasonable approach for estimating what the labor force might be in a decade’s time.

Employment will then be the labor force minus the number who are unemployed.  But there is no way to know beyond the next few years what the unemployment rate might then be.  It will depend on what shocks or surprises there might have been to the economy at that time, and these are by definition not predictable.  If they were, they would not be surprises.  While active monetary and fiscal policy would then seek to bring unemployment down to just frictional levels, how long this will take depends on many factors, including political ones.  And the problem is one that can only be addressed in the near term, as it depends on when the shock came. Thus the Fed’s Board of Governors meets as a group every six weeks throughout the year to monitor the situation, and to decide based on what they know at the time whether to tweak monetary policy through some instrument (normally short-term interest rates, which they may adjust up or, when they can, down, to affect growth).

There is thus no way to know now, in 2021, what the rate of unemployment will be in 2030.  For this reason, to set a benchmark to which comparisons under different scenarios can be made, the BLS and others following this approach assume the economy will be operating at full employment in that projection year.  That is, the benchmark sets unemployment at some specific, low, rate to reflect just frictional unemployment.  While there has been debate on what that specific rate might be (different analysts generally peg it at between 4 and 5% currently), a specific rate would be chosen for the comparisons.  Employment will then be equal to the labor force in that forecast year minus the number unemployed at this assumed rate of unemployment.

[Minor technical note:  The employment figure arrived at in this way will be employment as measured at the individual level, and will include the self-employed as well as on-farm employment.  It will also count as one person employed even if the individual holds multiple jobs.  The employment figures normally cited (and used by Moody’s) are of non-farm payroll employment, which comes from surveys of establishments, excludes the self-employed and on-farm employment, and counts each job even if one person might hold more than one job (as the establishment will only know who they employ, and will not know if some of their employees might hold second jobs).  But the differences due to these factors are small, and adjustments can be made.]

Thus, for any given set of forecast population figures (by age group, etc.), employment will follow from the labor force participation rate and the assumed rate of frictional unemployment (i.e. unemployment when the economy is assumed to be operating at full employment).  Forecast employment in any future year under different scenarios will therefore only differ if either the labor force participation rate, or the unemployment rate (or both), differ for some reason.
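
The relationship just described can be written out in a few lines.  This is a sketch of the accounting identity only, collapsed to a single population group (the BLS works group by group), and the population and rates below are hypothetical round numbers chosen for illustration, not BLS or Moody's figures:

```python
def projected_employment(adult_population, participation_rate, unemployment_rate):
    """Employment = labor force minus the unemployed at the assumed benchmark rate."""
    labor_force = adult_population * participation_rate
    unemployed = labor_force * unemployment_rate
    return labor_force - unemployed

# Hypothetical inputs, in millions and as fractions (illustration only).
ADULT_POPULATION_MILLIONS = 280.0

baseline = projected_employment(ADULT_POPULATION_MILLIONS, 0.610, 0.045)
higher_participation = projected_employment(ADULT_POPULATION_MILLIONS, 0.615, 0.045)
lower_unemployment = projected_employment(ADULT_POPULATION_MILLIONS, 0.610, 0.040)

print(f"Baseline:             {baseline:6.1f} million")
print(f"Higher participation: {higher_participation:6.1f} million")
print(f"Lower unemployment:   {lower_unemployment:6.1f} million")
```

With the population fixed, the result moves only when one of the two rates is changed, which is the point made above.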

C.  The Moody’s Employment Scenarios for 2030

Moody’s Analytics examined three scenarios for 2030 (and the path to it):  A base case where neither the infrastructure plan of Biden nor the covid-relief plan of Biden existed, a scenario where only the covid-relief plan was in place, and a scenario where both are in place.  In the first (base case) scenario it forecasts that employment in the US would rise to 157.9 million in 2030 from an average of 142.2 million in 2020, or an increase of 15.7 million.  In the scenario with only the covid-relief plan, Moody’s forecasts that employment in 2030 would then total 158.5 million, or 0.6 million more than in the base case.  And in the scenario where the infrastructure plan is also passed and implemented, Moody’s forecasts that employment in 2030 would total 161.2 million, or 2.7 million more than in the scenario with only the covid-relief plan passed and 19.0 million more than average total employment in 2020.
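
The differences between the scenarios can be laid out explicitly.  The employment levels are the Moody's figures quoted above (in millions); the scenario labels and variable names are mine:

```python
# Moody's Analytics employment figures quoted in the text, in millions.
EMPLOYMENT_2020 = 142.2
employment_2030 = {
    "base": 157.9,
    "relief only": 158.5,
    "relief + infrastructure": 161.2,
}

# Total change from 2020 to 2030 under each scenario.
gain_since_2020 = {
    scenario: round(level - EMPLOYMENT_2020, 1)
    for scenario, level in employment_2030.items()
}

# Increment attributable to each plan, relative to the scenario without it.
relief_increment = round(employment_2030["relief only"] - employment_2030["base"], 1)
infrastructure_increment = round(
    employment_2030["relief + infrastructure"] - employment_2030["relief only"], 1
)

for scenario, gain in gain_since_2020.items():
    print(f"{scenario:25s} +{gain:.1f} million since 2020")
print(f"Increment from covid-relief plan:   {relief_increment:.1f} million")
print(f"Increment from infrastructure plan: {infrastructure_increment:.1f} million")
```

The 19.0 million figure is the total change over the decade in the third scenario, while the increment attributable to the infrastructure plan itself is the 2.7 million.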

But why would employment levels in 2030 differ at all between these scenarios?  As discussed above, they can only differ if labor force participation rates differ or the assumed unemployment rates in that forecast year differ.  (The basic population numbers for that year should certainly not differ.)  In the Moody’s numbers they both do, but it is not clear why.

It is in particular difficult to understand why Moody’s allowed the assumed unemployment rates in 2030 to differ across their scenarios.  The scenario with just the covid-relief plan, which will be over by 2023 at the latest, should in particular not have an impact on the unemployment rate in 2030.  But in the Moody’s figures it does, albeit by only a minor amount (with unemployment at 4.5% in 2030 in the base scenario, and 4.4% in the scenario with the covid-relief plan).

The difference is larger in the scenario with both the covid-relief plan and the infrastructure plan.  Moody’s forecasts that unemployment in 2030 would then be just 3.8%, or well below the 4.5% rate in the base scenario.  Why would that be?  While there would still be a small amount of spending under the infrastructure plan in 2030 (Moody’s uses a figure of $81 billion in its scenario), the impact of such spending in that year would be small (just 0.2% of forecast GDP in that year) and would in any case have been diminishing over time as the infrastructure plan was being phased down.  That is, the reductions in spending under the infrastructure plan in the outer years, relative to what they would have been a few years before, would (if not offset by other actions) be deflationary at that point, not expansionary.  But regardless of whether Biden’s infrastructure plan had been passed in 2021 or not, one would assume that fiscal and monetary policy would have sought in that future year (2030) to bring the economy to full employment, at whatever the rate of (frictional) unemployment is then assumed to be.  There is no rationale for assuming the rate of unemployment in 2030 will differ across the scenarios.

The other difference in the Moody’s forecasts for 2030 under the different scenarios is in the labor force participation rates.  One can work out from the numbers Moody’s provided in its document (coupled with the BLS numbers for the adult population) that the labor force participation rate would be 58.5% in the base scenario, 58.7% in the scenario where only the Biden covid-relief package was passed, and 59.3% if the Biden infrastructure plan is also passed.  (More precisely, these are the Moody’s figures for non-farm payroll employment as a share of the population, not the overall labor force, with the small differences noted above between those two concepts).  Compared to the scenario of the covid-relief plan only, two-thirds (66%) of the extra 2.7 million in employment in 2030 is due to the higher labor force participation rates Moody’s forecasts for that year, and one-third (34%) is due to its forecast of a lower unemployment rate in that year.

Why should the labor force participation rate be higher in 2030 if Biden’s infrastructure plan is passed?  One could postulate a connection, but it would be tenuous and it is not clear if this was in fact intended by Moody’s or was just an outcome following from other relationships in its model.  I do not know enough about the structure of its model to say.  But one can speculate that the model may have linked the labor force participation rate in a forecast year to real wages in that year, with a higher real wage leading to a higher labor force participation rate.  Furthermore, the model might link greater infrastructure investment (or greater investment generally) to higher productivity, and higher productivity to higher wages.  In that case, the higher investment might lead, by such a route, to a higher labor force participation rate.  But this would require estimation of the responses in a series of steps, each of which might be tenuous.  It is difficult to forecast how much economy-wide productivity might rise as a result of such investment; difficult to forecast how much real wages would rise if productivity rises (real wages have been flat since around 1980, even though overall productivity rose by almost 80%); and difficult to forecast how much a rise in real wages might then raise the labor force participation rate.

But this is conceivable.  Whether it was an intended relationship in the Moody’s model is not so clear.  Such models are large and complicated, with a focus on particular issues.  Certain results might then follow, but those constructing the model might not have paid much attention to such outcomes when constructing the model, as the focus was on something else.

In any case, one has to be careful in interpreting the results as implying there would be 2.7 million additional jobs “created” in 2030 as a consequence of the Biden infrastructure plan.  There would, in the model, be 2.7 million more people employed, but this would mostly be due to a higher proportion of the population seeking employment in that year (a higher labor force participation rate).  And assuming an economy at full employment in that year, the additional number seeking employment would translate into that additional number being employed.  But it would be a stretch to interpret this as the infrastructure plan “creating” those additional jobs.  Rather, a higher share of the population are looking for work (a higher labor force participation rate), and are assumed to be able to find it.

D.  The Jobs Directly Created by the Infrastructure Plan

The Biden infrastructure plan would certainly create a huge number of jobs while the infrastructure is being built.  There would be jobs such as depicted in the photo at the top of this post, and with $2.2 trillion being spent there would be a large number of them (even with a share of the $2.2 trillion being spent in high priority areas outside of what is traditionally considered “hard” infrastructure, such as for labor training and health infrastructure).

These would, however, be jobs for a fixed period.  Once the particular projects are finished, those jobs would end.  Thus one should think of these as being so many person-years of employment (employment of one person for one year).  These are not permanent jobs being “created”, but rather workers being employed for a period of time to build a project or to complete a specific maintenance or repair task (e.g. repaving a road).

While not permanent jobs, it would still be important to have good estimates of how many there would be.  Moody’s did not do that, nor was it their intention, but one needs to be clear about that.  It will be important, however, that there be a serious effort at some point to work out such estimates, and I would guess that someone in government is working on this now.  They are needed precisely because there will be a large number who will be employed on these infrastructure projects, and workers with the necessary skills for such work are limited, in part because the US has so woefully underinvested in its infrastructure in recent decades (as will be discussed in the next section below).  It will thus be important to pay attention to the phasing of the individual projects, both over time and geographically, to ensure there will be sufficient capacity (both in terms of the workers needed and the firms that manage such projects) to build the projects at a given place and at a particular time.  It does not help much that there might be workers with the requisite skill in New York, say, when the need is for a project in California.

This will therefore need to be worked out, and I suspect it will be.  This will also guide what workforce development and training will be needed, and the BLS routinely provides such estimates (at least at a broad, economy-wide, level).  But while it is correct to term jobs (or more precisely person-years of jobs) as being “created” under such an infrastructure plan, this does not necessarily mean that the total number of jobs in the economy will be higher.  If the economy is at full employment (and the labor force participation rate otherwise unchanged), the total number employed in the economy will be unchanged.  It is just that some share of those employed will be working on these infrastructure projects.  And that means fewer will be working in other jobs.

That is not a bad thing.  While the overall number employed will be the same, there will be jobs in the infrastructure projects which will have been attractive enough (either due to higher wages that they pay or for some other reason) to draw workers to those jobs.  Those who shift to those new jobs will then be better off, which is good.  Furthermore, the workers shifting to those new jobs would then have left positions that others may find attractive enough to move into (due to a higher wage, or whatever).  Thus there would be shifts across the economy.  Some less attractive jobs would cease to be filled, with employers forced to learn how to make do with less, but that is how competition works.

It is thus not correct to assert the total number employed in the economy will be higher as a consequence of the infrastructure investment plan (aside from during an initial few years as the economy moves to full employment – and Moody’s forecasts that this will be complete by 2022 with the covid-recovery and infrastructure plans enacted and even by 2024 without them).  The total number employed in such forecasts will be largely the same with or without the plans.  But that does not mean the plans are without value to workers.  There will be new jobs to be filled, which will need to be attractive enough to draw workers to them.  And that helps workers.

E.  Public Infrastructure Investment in the US

Public infrastructure in the US is an embarrassment.  And it has a direct impact on productivity.  As was noted before, a truck driver sitting in a traffic jam is not terribly productive.  Similarly, exporters of soybeans who have to wait weeks to ship their product due to inadequate capacity at the ports cannot be terribly competitive in global markets (and will have to accept a price cut in order to sell their product).  And so on.

The major reason public infrastructure in the US is so poor is that the US has simply underinvested in it.  Using a broad definition of all government investment excluding that for the military, as a share of GDP, one has (calculated from BEA NIPA statistics):

Government investment peaked in the mid-1960s (as a share of GDP) and has declined ever since.  In gross terms it has been lower in recent years than at any time since the early 1950s.  Net of depreciation, it has been a good deal lower over the last half-decade (to 2019 – the 2020 figure is not yet available) than it has ever been in the last 70 years at least.  (And note that the blip up in the GDP share in 2020 was not because public investment rose.  The rate of growth of gross government investment in 2020 was in fact less than in 2019 and about the same as in 2018.  Rather it was because GDP collapsed in 2020, in the last year of the Trump administration, which pushed the share higher.)

What is of most interest for the state of public infrastructure is such investment net of depreciation.  That is shown as the curve in red in the chart, and it has fallen from a peak of 3.0% of GDP in 1966 to just 0.7% of GDP in recent years (up to 2019), a fall of 77%.  And at such a pace of adding to the net stock of public capital (infrastructure), the stock of such capital as a share of GDP will be falling.  By simple arithmetic, the ratio will be falling if the stock of that capital as a share of GDP is greater than the net investment share of GDP (0.7% here) divided by the rate of growth of nominal GDP.  Taking a nominal growth rate for GDP of, say, 4% (i.e. a real growth rate of 2% and a growth in prices of 2%), then the stock of public capital as a share of GDP will fall if the current stock of that capital is 17.5% of GDP or more (where 17.5% is equal to 0.7% / 4%).  The stock of public capital will certainly be well more than that in any modern economy, including the US.  And that underinvestment is why our highways are becoming increasingly subject to traffic jams, for example.  Our infrastructure is simply not keeping up.
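
The arithmetic behind that 17.5% threshold can be checked with a short Python sketch.  This is only an illustration: the function names are mine, and the 50% starting stock is a purely hypothetical figure chosen to show the direction of change, not an estimate of the actual US public capital stock.  The one-year update assumes the stock grows by net investment while nominal GDP grows at the assumed rate.

```python
# Illustrative sketch only. Function names and the 50% starting stock
# are assumptions made for this example, not figures from BEA or Moody's.

def threshold_capital_share(net_investment_share, nominal_gdp_growth):
    """Capital stock (as a share of GDP) above which the ratio declines."""
    return net_investment_share / nominal_gdp_growth


def next_capital_share(capital_share, net_investment_share, nominal_gdp_growth):
    """Capital/GDP ratio one year on: (K + I) / (Y * (1 + g)), all as shares of Y."""
    return (capital_share + net_investment_share) / (1.0 + nominal_gdp_growth)


net_share = 0.007   # net public investment, 0.7% of GDP (recent years, from the text)
growth = 0.04       # assumed nominal GDP growth: 2% real plus 2% prices

threshold = threshold_capital_share(net_share, growth)

# Fall in the net investment share from the 1966 peak of 3.0% of GDP.
fall_since_1966 = (0.030 - 0.007) / 0.030

# Hypothetical starting stock of public capital, for illustration only.
assumed_start = 0.50
one_year_later = next_capital_share(assumed_start, net_share, growth)

print(f"threshold stock: {threshold:.1%} of GDP")
print(f"fall in net investment share since 1966: {fall_since_1966:.0%}")
print(f"assumed {assumed_start:.0%} stock one year later: {one_year_later:.2%}")
```

At exactly the threshold the ratio stays constant from one year to the next.  Any starting stock above it, such as the hypothetical 50% used here, declines as a share of GDP (to 48.75% after one year in this example).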

Major public investment will be needed to reverse this, and the Biden infrastructure plan will be a start.  To put things in perspective, I have taken what would be spent annually under the Biden Plan (as estimated by Moody’s), as a share of GDP, and added this to a base amount where I simply assume other government investment in gross terms will remain at the average share it was between 2013 and 2019 (when it was quite steady at about 2.65% of GDP).  The figures for real GDP used for these calculations were those forecast by Moody’s under the scenario that the Biden infrastructure plan goes ahead, with these converted to nominal GDP (for the shares) using the forecast GDP deflators of the Congressional Budget Office.  Spending under the Biden Plan alone would start at 0.5% of GDP in 2023, rise to a peak of 1.3% of GDP in 2025, and then fall to 0.2% of GDP in 2030 and 0.1% in 2031.  Adding these figures to a base level of 2.65%, one would have:

A $2.2 trillion infrastructure investment plan is certainly large.  But the chart puts this in perspective.  Even with such an investment program, public investment would still not rise to as high as it was in the mid-1960s, nor would it last nearly as long.  Public investment had been relatively high (compared to later periods) from the mid-1950s to around 1980 – almost a quarter-century.  The $2.2 trillion Biden plan would raise public investment, but only for about eight years.  A question that will need to be addressed later is what happens after that.  Reverting to the recent low levels of infrastructure investment would eventually lead back to the problems we have now.
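
The addition described above (a base of 2.65% of GDP plus spending under the Biden Plan) can be sketched in a few lines of Python.  Only the four years for which plan shares are quoted in the text are included; I have not filled in the intervening years, and the variable names are simply illustrative.

```python
# Illustrative sketch only. Plan shares are the four values quoted in the text
# (from Moody's estimates); all other years are deliberately left out.

BASE_SHARE = 2.65  # percent of GDP: average gross government investment, 2013-2019

plan_share = {     # Biden Plan spending, percent of GDP
    2023: 0.5,
    2025: 1.3,     # peak year
    2030: 0.2,
    2031: 0.1,
}

# Total gross public investment share = base + plan spending, year by year.
total_share = {year: BASE_SHARE + share for year, share in plan_share.items()}
peak_year = max(total_share, key=total_share.get)

for year, share in sorted(total_share.items()):
    print(f"{year}: {share:.2f}% of GDP")
print(f"peak among the quoted years: {peak_year}")
```

Even at the peak in 2025, the combined figure is a little under 4% of GDP, and by 2031 it is back within a tenth of a percentage point of the base level.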

F.  Summary and Conclusions

Politicians will always tout the jobs that will be “created” if their programs are approved.  If they didn’t, they likely would not hold office for long.  President Biden is no exception.  And the administration has cited independent estimates made by Mark Zandi’s team at Moody’s Analytics to say that Biden’s “American Jobs Plan” would indeed create a large number of jobs.  They cite Moody’s estimates that the number of jobs in 2030 would be 19 million higher than in 2020 if the infrastructure plan and the covid-relief plan are both approved, and 2.7 million higher in 2030 if that infrastructure plan is approved as compared to a scenario where it is not.

These are, indeed, the Moody’s numbers.  But one should be careful in the interpretation of what they in fact mean, and Moody’s can be criticized for not being fully clear on this.  These are not jobs, generally in construction, that would follow directly from the infrastructure investment program (which should be counted as person-years of employment in any case, as such jobs are not permanent).  Rather, what Moody’s has done has been to use its model of the US economy to examine what overall employment levels would be in 2030 under the various scenarios.  It found that the number employed would be 2.7 million higher in 2030 (1.7% of forecast employment in that year) in the scenario with the infrastructure plan as compared to a scenario without it.  One can calculate that roughly two-thirds of this would be due to a higher labor force participation rate, and one-third due to a lower unemployment rate in that year.

It is not clear, however, why forecasts of either of those two variables – participation rates and the unemployment rate – should differ at all across the scenarios.  I would not be surprised if these were simply unintended consequences in a complex model.  In any case the differences in employment in that forecast year of 2030 are small, as one would expect.  Furthermore, by 2030 the infrastructure plan would be winding down, with only small residual amounts remaining to be spent.

During the course of the 2020s, however, a very significant number of people will be employed on these infrastructure investments.  They will be employed for limited periods until the projects are completed (and hence should be counted in person-years of employment), but this would still be significant.  It will be important to estimate not just how many will be employed and for what periods, but also what skills will be required and where and when they will be required.  This is probably now being done somewhere in government.  But Moody’s did not attempt to do that.

And while such jobs, mostly in construction, can be correctly termed as “created” under the infrastructure investment plan, this does not necessarily mean the overall number of people employed in the economy will be higher.  Unless labor force participation rates would then be higher for some reason (and it is difficult to see why that would be the case) or the unemployment rate is lower (which it cannot be if the economy is already at full employment), the overall number employed in the economy will be unchanged.  What would happen, rather, would be shifts in the job structure, not in the number of jobs overall.  Some workers would shift into the construction jobs needed to build the infrastructure, and others would shift into the jobs these workers had occupied before.  That is all good – the new jobs will need to be more attractive in terms of pay and/or for other reasons for workers to shift to them – but the total number employed (the total number of “jobs”) would largely be the same.

The public infrastructure is certainly needed.  The US has been underinvesting in its public infrastructure for decades, and when account is taken of depreciation it is clear that the net stock of public capital has not kept up with the overall growth of the economy.  That is why roads, for example, are now so often jammed.  The Biden Plan would bring public investment up to levels not seen for decades, although still not matching (even at $2.2 trillion) the public investment levels of the 1960s as a share of GDP.  It is also a time-limited program, which would phase down in the second half of the 2020s.  At some point, this will need to be addressed.  Bringing public investment levels back down to the far from adequate levels of recent decades will lead to the same problems again.  But that will likely be an issue that will not be seriously considered until the next presidential term.