The Economics of Rocket and Spacecraft Development: What Followed From Obama’s Push for Competition

A.  Introduction

The public letter was scathing, and deliberately so.  Made available to the news media in April 2010 just as President Obama was preparing to deliver a major speech on his administration’s strategy to put the US space program back on track, the letter bluntly asserted that the new approach would be “devastating”.  Signed by former astronauts Neil Armstrong (Commander of Apollo 11, and the first man to walk on the moon), Jim Lovell (Commander of the ill-fated Apollo 13 mission), and Eugene Cernan (Commander of Apollo 17, and up to now the last man to walk on the moon), the letter said that reliance on commercially contracted entities to carry astronauts to orbit “destines our nation to become one of second or even third rate stature”.  The three concluded that under such a strategy, “the USA is far too likely to be on a long downhill slide to mediocrity”.

What was the cause of this dramatic concern?  Upon taking office in January 2009, the Obama administration concluded that a thorough review was needed of NASA’s human spaceflight program.  Years earlier, following the breakup of the Columbia Space Shuttle as it tried to return from orbit – with the death of all on board – the Bush administration had decided that the Space Shuttle was not only expensive but also fundamentally unsafe to fly.  And because of the cost of flying the Shuttle, NASA did not have the funds to develop alternatives while it remained in service.  The Bush administration therefore decided to retire the then remaining Space Shuttles by 2010.  The Obama administration later added two more Space Shuttle flights to allow the completion of the International Space Station (ISS), but the final Space Shuttle flight was in 2011.  The Bush administration plan was that the funds saved by ending the Space Shuttle flights would be used to develop what they named the Constellation program.  Under Constellation, two new space boosters would be developed – Ares I to launch astronauts to the ISS in low earth orbit and Ares V to launch astronauts to the moon and possibly beyond.  A new spacecraft, named Orion, to carry astronauts on these missions would also be developed.

To fund Constellation, the Bush administration plan was also to decommission the ISS in 2015, just five years after it would be completed.  Work on the ISS had begun in 1985 (when Reagan was president), the first flight to start its assembly was in 1998, and assembly was then expected to be completed in 2010 (in the end it was in 2011).  The total cost (as of 2010) had come to $150 billion.  But in order to fund Constellation, the Bush administration plan was to shut down the ISS just five years later, and then de-orbit it for safety reasons to burn it up in the atmosphere.

The Obama administration convened a high-level panel to review these plans.  Chaired by Norman Augustine, the former CEO of Lockheed Martin (and commonly referred to as the Augustine Commission), the committee issued its report in October 2009.  They concluded that the Constellation program was simply not viable.  Their opening line in the Executive Summary read “The U.S. human spaceflight program appears to be on an unsustainable trajectory.”  Mission plans (including the time frames) were simply unachievable given the available and foreseeable budgets.  There would instead be billions of dollars spent but with the intended goals not achieved for decades, if ever.  A particularly glaring example of the internal inconsistencies and indeed absurdities was that the Ares I rocket, being developed to ferry crew to the ISS, would not see its first flight before 2016 at the earliest.  Yet the ISS would have been decommissioned and de-orbited by then.

The Augustine Commission recommended instead to shift to contracting with private entities to ferry astronauts to orbit.  Such a program for the ferrying of cargo supplies to the ISS had begun during the Bush administration.  By 2009 this program was already well underway, and the first such flight, by SpaceX using its Falcon 9 rocket, was successfully completed in May 2012.  The commission also recommended that work be done to develop the technologies that could be used to determine how a new heavy-lift launch vehicle should best be designed.  For example, would it be possible to refuel vehicles in orbit?  If so, the overall size of the booster could be quite different, as there would no longer be a need to lift both the spacecraft and the fuel to send it on to the Moon or to Mars or to wherever, all on one launch.  And the commission then laid out a series of options for exploration that could be done with a new heavy-lift rocket (whether a new version of the Ares V or something else), including to the Moon, to Mars, to asteroids, and other possibilities.  It also recommended that the life of the ISS be extended at least to 2020.

And it was not just the Augustine Commission expressing these concerns.  Earlier, in a report issued in August 2009, the GAO stated that “NASA is still struggling to develop a solid business case … needed to justify moving the Constellation program forward into the implementation stage”.  It also noted that NASA itself, in an internal review in December 2008 (i.e. before Obama was inaugurated) had “determined that the current Constellation program was high risk and unachievable within the current budget and schedule”.  The GAO also noted that Ares I was facing important technical challenges as well (including from excessive vibration and from its long narrow design, where there was concern this might cause it to drift into the launch tower when taking off).  While it might well be possible to resolve these and other such technical challenges given sufficient extra time and sufficient extra money, it would require that extra time and extra money.

President Obama’s strategy, as he laid out in a speech at the Kennedy Space Center on April 15, 2010 (but which was already reflected in his FY2011 budget proposals that had been released in February), was built on the recommendations of the Augustine Commission.  First, and the proposal that received the most attention, was to end the Ares I program and to contract instead with competing commercial providers to ferry crews to the ISS.  Second, rather than continue on the Ares V launch vehicle (on which only $95 million had been spent by that point, in contrast to $4.6 billion on Ares I), the proposal was to spend significant funds first (more than $3 billion over five years) to develop and test relevant new technologies (such as in-orbit refueling) to confirm feasibility before designing a new heavy-lift launch vehicle.  That design would then be finalized no later than 2015.  Third, work would continue on the Orion spacecraft, but with a focus on its role to carry astronauts beyond Earth orbit, as well as to serve as a rescue vehicle should one be needed in an emergency for the ISS.  Fourth, the life of the ISS itself would be extended to at least 2020 from the Bush plan to close and destroy it in 2015.  And fifth, Obama proposed that the overall NASA budget be increased by $6 billion over five years over what had earlier been set.

While the proposal was well received by some, there were also those who were vociferously opposed – Armstrong, Lovell, and Cernan, for example, in the letter quoted at the top of this post.  But perhaps the strongest, and most relevant, opposition came from certain members of Congress.  Congress would need to approve the new strategy and then back it with funding.  Yet several key members of Congress, with positions on the committees that would need to approve the new plans and budgets, were strongly opposed.  Indeed, this opposition was already being articulated in late 2009 and early 2010 as the direction the Obama administration was taking (following the issuance of the Augustine Commission report) was becoming clear.

Perhaps most prominent in opposition was Senator Richard Shelby of Alabama, who repeatedly spoke disparagingly of the commercial competitors (meaning SpaceX primarily) who would be contracted to ferry astronauts to the ISS.  In a January 29, 2010, statement, for example (released just before the FY2011 budget proposals of the Obama administration were to be issued), Shelby asserted “China, India, and Russia will be putting humans in space while we wait on commercial hobbyists to actually back up their grand promises”.  Shelby called it “a welfare program for amateur rocket companies with little or nothing to show for the taxpayer dollars they have already squandered”.

Shelby was not alone.  Other senators and congressmen were also critical.  Most, although not all, were Republicans, and one might question why those who on other occasions would articulate a strong free-market position, would on this issue argue for what was in essence a socialist approach.  The answer is that under the traditional NASA process, much of the taxpayer funds that would be spent (many billions of dollars) would be spent on federal facilities and on contractors in their states or congressional districts.  The Marshall Space Flight Center in Huntsville, Alabama, was the lead NASA facility for the development of the Ares I and Ares V rockets, and Senator Shelby of Alabama was proud of the NASA money he had directed to be spent there.  Senators and congressmen from other states with the main NASA centers involved or with the major contractors (Texas, Florida, Mississippi, Louisiana, Utah) were also highly critical of the Obama initiative to introduce private competition.

The outcome, as reflected in the NASA Authorization Act of 2010 (passed in October 2010) and then in the FY2011 budget passed in December, was a compromise.  The administration was directed basically to do both.  The legislation required that a new heavy-lift rocket be designed immediately, with the key elements similar to and taken from the Ares V design (and hence employ the same contractors as for the Ares V).  It was eventually named the Space Launch System (or SLS – or as wags sometimes called it “the Senate Launch System” since the key design specifics were spelled out and mandated in the legislation drafted in the Senate).  It also directed that the cost should be no more than $11.5 billion and that it would be in operation no later than the end of 2016.  (As we will discuss below, the SLS has yet to fly and is unlikely to before 2022 at the earliest, and over $32 billion will have been spent on it before it is operational.)

The compromise also allowed the administration to proceed with the development of commercial contracts to ferry astronauts to the ISS, but with just $307 million allocated in FY2011 rather than the $500 million requested.  For FY2011 to FY2015, only $2,725 million of funding was eventually approved by Congress, or well less than half of the $5,800 million originally requested by the Obama administration in 2010 for the program.  As a result, the commercial crew program, as it was called, was delayed by several years.  The first substantial contracts (aside from smaller amounts awarded earlier to various contractors to develop some of the technologies that would be used) were signed only in August 2012.  At that time, $440 million was awarded to SpaceX, $460 million to Boeing, and $212.5 million to Sierra Nevada Corporation, to develop the specifics of their competing proposals to ferry astronauts to the ISS.

The primary contracts were then awarded to SpaceX and to Boeing in September 2014.  NASA agreed to pay SpaceX a fixed total of $2,600 million, and Boeing what was supposed to be a fixed total of $4,200 million (but with an additional $287.2 million added later, when Boeing said they needed more money).  Each of these contracts would cover the full costs of developing a new spacecraft (Crew Dragon for SpaceX and Starliner for Boeing) and then of flying them on the rockets of their choice (Falcon 9 for SpaceX and the Atlas V for Boeing) to the ISS for six operational missions (with an expected crew of four on each, although the capsules could hold up to seven).  The contracts would cover not only the cost of the rockets used, but also the costs of an unmanned test flight to the ISS and then a manned test flight to the ISS with a crew of two or more.  If successful, the six operational missions would then follow.
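To put the two contract totals side by side, here is a minimal sketch of the arithmetic in Python.  It uses only the dollar figures quoted above; the variable and function names are my own, and spreading each total evenly over the six operational missions is a simplifying assumption, since the contracts also cover development and the two test flights.

```python
# Contract totals quoted in the text, in millions of US dollars.
spacex_total = 2600.0
boeing_base = 4200.0
boeing_addition = 287.2  # added later, when Boeing said it needed more money

boeing_total = boeing_base + boeing_addition

# Each contract covers development, two test flights, and six operational missions.
operational_missions = 6


def cost_per_mission(total_millions: float, missions: int) -> float:
    """Spread a contract total evenly over the operational missions (a simplification)."""
    return total_millions / missions


ratio = boeing_total / spacex_total

print(f"Boeing total: ${boeing_total:,.1f} million")
print(f"Boeing relative to SpaceX: {ratio:.2f}x")
print(f"SpaceX per operational mission: ${cost_per_mission(spacex_total, operational_missions):,.0f} million")
print(f"Boeing per operational mission: ${cost_per_mission(boeing_total, operational_missions):,.0f} million")
```

On these figures the Boeing contract comes to roughly 1.7 times the SpaceX contract for the same number of missions.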

We now know what has transpired in terms of missions launched.  While the SLS is still to fly on even a first test mission (the current schedule is for no earlier than November 2021, but many expect it will be later), SpaceX successfully carried out an unmanned test of its spacecraft (Crew Dragon) in a launch and docking with the ISS in March 2019, a successful test launch with a crew of two to the ISS in May 2020, an operational launch with a NASA crew of four to the ISS in November 2020, and a second operational launch with a NASA crew of four to the ISS in April 2021 (where I have included under “NASA” crews from space agencies of other nations working with NASA).  The April 2021 mission also reused both a Crew Dragon capsule from an earlier mission (the one used on the two-man test flight in May 2020) and a previously used first stage booster for the Falcon 9 rocket.  Previously, out of caution, NASA would only allow a new Falcon 9 booster to be used on these manned flights – not one that had been flown before.  They have now determined that the reused Falcon 9 boosters are just as safe.

As I write this, the plans are for a third operational flight of the Crew Dragon, again carrying four NASA astronauts to the ISS, in late October or November 2021.  And as I am writing this, SpaceX has just launched (on September 15) a private, all-civilian, crew of four for a three-day flight in earth orbit.  They are scheduled to return on September 18.  And there may be a second such completely commercial flight later in 2021.

Boeing, in contrast, has not yet been successful.  While Boeing was seen as the safe, traditional, contractor (in contrast to the “amateur hobbyists” of SpaceX), and received substantially higher funding than SpaceX did for the same number of missions, its first, unmanned, test launch in December 2019 failed.  The upper stage of the rocket burned for too long due to a software issue, and the spacecraft ended up in the wrong orbit.  While they were still able to bring the spacecraft back to earth, later investigations found that there were a number of additional, possibly catastrophic, software problems.  After a full investigation, NASA called for 61 corrective actions, a number of them serious, to be taken before the spacecraft is flown again.

As I write this, there have been further delays with the Boeing Starliner.  After several earlier delays, a re-run of the unmanned test mission of the capsule was scheduled to fly on July 30, 2021.  However, on July 29, a newly arrived Russian module attached to the ISS began to fire its thrusters due to a software error, causing the ISS to start to spin.  While it was soon brought under control, the decision was made to postpone the flight test of the Boeing Starliner by a few days, to August 3, to allow time for checks to the ISS to make sure there was no serious damage from the Russian module mishap.  But then, in the countdown on August 3, problems were discovered in the Starliner’s control thrusters.  Many of the valves were stuck.  On August 13, the decision was made to take down the capsule from the booster rocket, return it to a nearby facility, confirm the cause of the problem (it appears that Teflon seals failed), and fix it.  There will now be a delay of at least two months, and possibly into 2022.

Thus the unmanned test flight of the Boeing Starliner will only be flown at least two and a half years (and possibly three years, or more) after the successful unmanned test flight of the SpaceX Crew Dragon capsule in March 2019.  And as noted before, Boeing was supposed to be the safe choice of a traditional defense and space contractor, in contrast to the hobbyists at SpaceX.

While flight success is, in the end, the most important and most easily observed metric, also important is how much these alternative approaches cost.  That will be the focus of this post.  The cost differences are huge.  While not always easy to measure (this will be discussed below), the differences in the costs between the traditional NASA contracting and the more commercial contracts that paid for the services delivered are so large that any uncertainty in the cost figures is swamped by the magnitude of the estimated differences.

We will first look at the costs of developing and flying the principal heavy-lift rockets now operational in the US.  While they have different capabilities, which I fully acknowledge, the differences in the costs cannot be attributed just to that.  We will then look at the costs of developing and flying the three capsule spacecraft we now have (or will soon have) in the US:  the SpaceX Crew Dragon, the Boeing Starliner (more properly, the Boeing CST-100 Starliner), and the Orion being built under contract to NASA by Lockheed Martin.  The differences in capabilities here are also significant, but one cannot attribute the huge cost differences just to that.

This blog post is relatively long, with a good deal of discussion on the underlying basis of the estimates for the various figures as well as on the capabilities (and comparability) of the various rockets and spacecraft reviewed.  For those not terribly interested in such aspects of the US space program, the basic message of the post can be seen simply by focussing on the charts.  They are easy to find.  And the message is that NASA contracting on the commercial basis that the Obama administration proposed for the carrying of crew to the ISS (and which the Bush administration had previously initiated for the carrying of cargo to the ISS) has been a tremendous success.  SpaceX is now routinely delivering both cargo and crews to orbit, and at a cost that is a small fraction of what is found with the traditional NASA approach.  One sees this in both the development and operational costs, and the differences are so large that one cannot attribute this simply to differing missions and capabilities.

B.  The Rockets Reviewed and Their History

The chart at the top of this post shows the cost per kilogram to launch a payload to low earth orbit by the primary heavy-lift launch vehicles currently being used (or soon to be used) in the US.  This is only for the cost of an additional rocket launch – what economists call the marginal cost.  The cost to develop the rocket itself is not included here, as that cost is fixed and largely the same whether there is only one launch of the vehicle or many.  We will look at those development costs separately in the discussion below.

To get to the cost per kilogram, one must start with what each rocket is capable of carrying to low earth orbit and then couple this with the (marginal) cost of an additional launch.  We will review all that below.  But first a note on the data underlying these figures.
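The calculation just described is a simple division, sketched below in Python.  The function name is my own, and the two example inputs are made-up round numbers chosen only to show the arithmetic; they are not estimates for any actual rocket.

```python
def cost_per_kg(marginal_launch_cost_usd: float, payload_kg: float) -> float:
    """Marginal cost of one additional launch divided by the payload lifted to low earth orbit."""
    if payload_kg <= 0:
        raise ValueError("payload_kg must be positive")
    return marginal_launch_cost_usd / payload_kg


# Illustrative placeholder values only, not figures for any actual rocket.
example_launch_cost_usd = 50_000_000
example_payload_kg = 20_000

print(cost_per_kg(example_launch_cost_usd, example_payload_kg))  # 2500.0
```

Note that development costs do not enter this calculation at all, which is why they are treated separately.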

For a number of reasons, comparable data on the costs and even the maximum lift capacities of these various rockets are not readily available.  One has to use a wide range of sources.  The primary ones I used (for both the payload capacities and the costs of the rockets discussed in this and in the following sections, as well as for the costs for the spacecraft discussed further below) are the roughly two dozen sources linked at this point in the post.

There will, however, be issues with the precision of any such estimates, in particular for the costs.  For a number of reasons, such comparisons (again especially of the costs) are difficult to make.  Several of those reasons are discussed in an annex at the end of this post.  Due to the difficulties in making such comparisons, differences in costs per kilogram of payload lifted to orbit of 10 or 20% certainly, but also even of 40 or 50%, should not be viewed as necessarily significant.  However, we will see that the differences in costs between developing and launching rockets and spacecraft with the traditional NASA approach and the approach based on competition that Obama introduced to manned space flight are far greater than this.  Indeed, we will see that the costs are several times higher, and often even an order of magnitude or more higher.  Differences of such magnitude are certainly significant.

To start, rockets differ in capabilities, and one must adjust for that.  The most important measure is lifting capacity – how many kilograms of payload can be carried into orbit:

The rockets to be examined here are limited to US vehicles (hence none of those from China, Russia, Europe, and elsewhere) and to heavier boosters sizeable enough to carry manned vehicles.  Ares I is included even though it flew on only one test flight (and only a partial one at that) before its development was ended, in order to show how its capacity would have compared to other launchers.  It would be similar in size to the alternatives.  But its (incomplete) development costs were already more than an order of magnitude higher than those of the Falcon 9, as we will discuss below.

The other boosters to be examined here are the Falcon 9 (of which there are two versions – with the first stage booster either expended or recovered), the Atlas V and Delta IV Heavy (both made by the United Launch Alliance – a 50/50 joint venture of Boeing and Lockheed Martin that, when formed in 2006, had a monopoly on heavy launch vehicles in the US), the Space Launch System (still to be tested in its first launch), and the Falcon Heavy (of which there are also two versions, with the first stage boosters either expendable or recoverable).

As noted, the Falcon 9 can be flown in two versions – with the first stage booster either expended (allowed to fall into the ocean) or recovered.  Since the first stage of a rocket will normally be the most expensive part of a rocket, Elon Musk sought to develop a booster where the first stage could be recovered.  And he did.  (He also, for a time, sought to recover similarly the second stage of the Falcon 9, but ultimately abandoned this.  The cost of a second stage is less, so there is less benefit in recovering it, while the difficulty, and hence the cost, is greater.  He eventually concluded it was not worth it.)

The Falcon 9 first stage is recovered by flying it back either to the launch site or to a floating platform in the ocean, where it slows down and lands by re-igniting its engines.  The videos can be spectacular.  But this requires that a portion of the fuel be saved for the landing, and hence the maximum payload that can be carried is less.  However, the cost savings (discussed below) are such that the cost per kilogram to orbit will be lower.

I have not been able to find, however, a precise figure for what the payload penalty will be when the first stage is recovered.  SpaceX may be keeping this confidential.  SpaceX does provide, at its corporate website, a figure for the maximum payload on a Falcon 9.  It is 22,800 kilograms, but this has been interpreted to be what the payload will be when the first stage uses 100% of its fuel to launch the payload into orbit, with that booster then allowed to fall into the ocean.  But a figure for what the maximum payload can be when the booster is recovered is not provided.  The Wikipedia entry for the Falcon 9, for example, only provides a figure for what was a heavy load on an actual launch (with the booster recovered).  This does not mean this would be the maximum possible load.

For the calculations here, I therefore used the payload capacity figures for the Falcon Heavy, taking the ratio between the payload that can be carried in the fully recoverable version to the payload in the expended version.  Elon Musk has indicated that this payload penalty on the Falcon Heavy is about 10%.  Applying this ratio to the Falcon 9 full capacity figure of 22,800 kg, and rounding down to 20,000 kg, should be a reasonable estimate of the maximum payload on the recoverable version of the rocket, and close enough for the purposes here.  The ratios for the Falcon Heavy and the Falcon 9 should be similar, as the first stage of the Falcon Heavy is essentially three first stages of the Falcon 9 strapped together (with the second stage the same in each), and the fuel that would be needed to be saved to allow for the recoveries and landings of the first stage boosters should be similar.
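The estimate in the paragraph above can be reproduced in a few lines of Python.  This is only a sketch: the 22,800 kg and 10% figures come from the text, while the function name and the choice to round down in steps of 1,000 kg are my own assumptions about how to arrive at the rounded figure.

```python
import math

FALCON_9_EXPENDED_PAYLOAD_KG = 22_800  # SpaceX figure quoted in the text
RECOVERY_PENALTY = 0.10                # about 10%, from Musk's Falcon Heavy remark


def recoverable_payload_kg(expended_kg: float, penalty: float) -> float:
    """Payload remaining after setting aside fuel for first-stage recovery."""
    return expended_kg * (1.0 - penalty)


raw_estimate = recoverable_payload_kg(FALCON_9_EXPENDED_PAYLOAD_KG, RECOVERY_PENALTY)

# Round down (conservatively) to a whole number of thousands of kilograms.
# The 1,000 kg step is an assumption made for this sketch.
rounded_estimate = math.floor(raw_estimate / 1000) * 1000

print(f"Before rounding: {raw_estimate:,.0f} kg")
print(f"Rounded down:    {rounded_estimate:,} kg")
```

The unrounded figure is about 20,520 kg, so rounding down to 20,000 kg errs slightly on the side of understating what the recoverable version can carry.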

With this configuration where the Falcon Heavy is essentially three Falcon 9 first stage boosters strapped together, SpaceX was able to build an extremely large booster.  It is currently by far the largest operational such vehicle in the US stable, and indeed is currently the largest in the world.  The SLS will be larger if it becomes operational, but it is not at that point yet.  By building on the Falcon 9, the development costs of the Falcon Heavy were relatively modest, although Elon Musk has noted it turned out to be more complicated than they had at first thought it would be.  And with the three boosters that make up the first stage of Falcon Heavy similarly recoverable, one has even more spectacular videos of pairs of the boosters landing together back at the launch site (the third is recovered on a barge in mid-ocean). There is some penalty in the maximum payload weight that can be carried (about 10% as noted above), but the cost savings far exceed this (discussed below), leading to a cost per kilogram of payload that is almost a third less than when these first-stage boosters are not recovered.
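The two figures in the paragraph above (a payload penalty of about 10% and a cost per kilogram almost a third lower) together imply how much cheaper a launch with recovered boosters must be.  A minimal sketch of that inference in Python follows; the variable names are mine, and treating "almost a third" as exactly one third is an approximation.

```python
# Figures taken from the text (both approximate).
payload_penalty = 0.10         # recovered boosters lift about 10% less
cost_per_kg_reduction = 1 / 3  # "almost a third less" per kilogram, taken as exactly 1/3

# Since cost per kg = launch cost / payload:
# (recovered launch cost / expended launch cost)
#   = (recovered cost-per-kg ratio) * (recovered payload ratio)
implied_launch_cost_ratio = (1 - cost_per_kg_reduction) * (1 - payload_penalty)
implied_launch_cost_saving = 1 - implied_launch_cost_ratio

print(f"Implied launch cost with recovery: {implied_launch_cost_ratio:.0%} of the expended cost")
print(f"Implied saving per launch: {implied_launch_cost_saving:.0%}")
```

Because the per-kilogram reduction is a little under a third, the implied saving per launch is somewhat under 40%.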

The Atlas V and the Delta IV Heavy are both produced by the United Launch Alliance, the 50/50 joint venture of Boeing and Lockheed.  Its creation in 2006 by bringing together into one company the sole two providers in the US at that time of large launch vehicles was questioned by many.  The first launch of the Falcon 9 came only later, in 2010.  But the primary customer was and is the US Department of Defense, and they approved it (and may indeed have encouraged it) as their primary concern was to preserve an assured ability to fly their payloads into orbit rather than the cost of doing so.

The Atlas series of vehicles were brought to the ULA from Lockheed (via a string of corporate mergers – the Atlas rocket was first developed by General Dynamics), and have a long history stretching back to the initial Mercury orbital launches (of Glenn in 1962) and indeed even before that.  The models have of course changed completely over the years, and importantly since the year 2000 with the first launch of the Atlas III model which used Russian-made RD-180 engines.  The RD-180 engines are now being phased out for national security reasons, but the planned follow-on rocket (named the Vulcan, or more properly the Vulcan Centaur) has yet to fly.  The Vulcan will use engines made by Blue Origin, and there have been delays in getting those engines delivered for the initial test flights.

There are ten different models of the Atlas V that have flown, and several more were available if a customer was interested.  For the charts here, version #551 of the Atlas V has been used, as it has the heaviest lift capacity of the various versions and has flown at least ten times (as I write this in September 2021).

The Delta rocket also has a long history, with variants dating back to 1960.  It was originally built by Douglas Aircraft, which after a merger became McDonnell-Douglas, which was later acquired by Boeing.  Boeing then brought the Delta to the United Launch Alliance.  There are also several variants of Delta rockets that have been available, but the Delta IV Heavy version will be used in the charts here as it can carry the heaviest payload among them.  Until the first launch of the Falcon Heavy in 2018, the Delta IV Heavy had the greatest lift capacity of any rocket in the US stable.  But as one can see in the chart, the payload capacity of the Falcon Heavy is double that of the Delta IV Heavy.

The Space Launch System (SLS) dates to 2011, when the basic design was announced by NASA.  Key design requirements had been set, however, in congressional legislation drafted originally in the Senate and incorporated into the NASA Authorization Act of 2010 that was signed into law in October 2010.  As was discussed above, NASA was instructed by Congress to develop the SLS and in doing so that it should use the rocket technology that had been developed for the Space Shuttle and which would have been used for the Ares V.  The Space Shuttle technology dates from the 1970s with a first flight of the Shuttle in 1981.  Its main rocket engines (three at the rear of the Orbiter) were the RS-25, which burns liquid hydrogen and oxygen.  The Shuttle also had two solid rocket boosters attached, each of four segments.

The SLS design, following the mandates of Congress, uses four RS-25 engines in its core first stage.  Two solid rocket boosters, of the same type as used for the Shuttle, are attached on the sides (although with five segments each rather than the four for the Shuttle).  The second stage of the SLS will use the RL-10 liquid-fueled engine – a design that dates to the 1950s and first flew in 1959.  Indeed, for the initial (Block 1) model of the SLS, the second stage is in essence the second stage that has been used for some time on the Delta III and Delta IV boosters.

The SLS design shares many elements with the Ares V booster that would have been part of the Constellation program begun by the Bush administration.  The first stage booster of the Ares V would have had five RS-25 rockets in its core (versus four of the RS-25s in the SLS) and also with two of the strap-on solid rocket boosters from the Shuttle (but with 5.5 segments each instead of the five on the SLS).  While work on the Ares V never progressed far beyond its design, with NASA spending only $95 million on it before it was canceled, the SLS is very much based on the design of the Ares V and with similar ties to the Space Shuttle.

The engine technologies have of course evolved substantially over time, with upgrades and refinements as more was learned.  And using existing designs should certainly have saved both time and money.  But neither happened.  Congress directed that the SLS should be operational by 2016, and early NASA plans were for it to be flying by 2017, but as of this writing it has yet to have even a test flight.  As noted previously, the first test launch is currently scheduled for November 2021, but many expect this will be further delayed.  And as we will see below, despite the use of previously developed technology for most of the key components (in particular the rocket engines), the costs have been quite literally astronomical.

Finally, the Ares I booster is included here for comparison purposes.  Its first stage would have been the same solid rocket booster used for the Space Shuttle (but just one booster rather than two, and of five segments), while the second stage would have used a version of the J-2 liquid-fueled engine.  This engine was originally developed in the early 1960s for use in the upper stages of the Saturn IB and Saturn V rockets then being developed for the Apollo program.  There have been numerous upgrades since, of course, and some would say the J-2 version developed for the Ares I (named the J-2X) was close to a new design.

There was only one, partial, test flight of the Ares I before the program was canceled.  That flight, in October 2009, was of the first stage only (the solid rocket booster derived from the Space Shuttle), with just dummies of the second stage and payload to simulate the flight dynamics of that profile.  It reached a height (as planned) of less than 30 miles.  While deemed a “success” by NASA, the launch caused substantial unanticipated damage to the launch pad, plus the parachutes designed to return the first stage partially failed.

As will be discussed below, while the Ares I never became operational, the amount spent on its (partial) development had already far exceeded that of comparable rockets.  It was also facing substantial technical issues that could be catastrophic unless solved (including from excessive vibration and a concern that with its tall, thin, design it might drift into the launch tower on lift-off).  Finally, as noted before, the rocket’s mission would be to ferry astronauts to the ISS, yet under the Bush administration’s plan to abandon and de-orbit the ISS by 2015 (in order to free up NASA funds for the Constellation program), the first operational flight (as forecast in 2009) would not be until 2016.  Nonetheless, Obama’s decision to cancel the program was severely criticized.

C.  The Cost of Developing the Rockets

In considering the costs of any vehicle, including rockets and spacecraft, one should distinguish between the cost of developing the technology and the cost of using it.  Development costs are upfront and fixed, regardless of whether one then uses the rocket for one launch or many.  Operational costs per launch are then a measure of what it would cost for an additional launch – what economists call the marginal cost.

While the concepts are clear, the distinction can be difficult to estimate.  The costs may often be mixed, and one must then try to separate out what the costs of just the launches were from the cost of developing the system.  But reasonable estimates are in general possible.

To start with development costs:

First, in the case of Ares I all the costs incurred were development costs as there were no operational launches.  Figures on this are provided in the NASA budget documents for each fiscal year.  A total of $4.6 billion was spent on the program between fiscal years 2005 (when the program was launched) and 2010 (when it was canceled).

But at that point the program was far from operational.  The first operational flight was not going to be before 2016 at the earliest, and very likely later.  To make the comparison similar to the costs of other rocket programs (which have reached operational status), one should add an estimate of what the additional costs would have been to reach that same point.  But there is only a partial estimate of what those additional development costs would have been.  As is standard, the FY2010 NASA budget had five-year cost forecasts (i.e. for the next four years following the request for the upcoming fiscal year) for each of the budget line items, and at that point the forecast was that the Ares I program would cost an additional $8.1 billion in fiscal years 2011 through 2014.  Furthermore, this expected expense of over $2 billion per year would not be declining over time but in fact rising a bit, and would likely continue for several years more at a similar rate or higher until Ares I was operational.

Even leaving out what the additional development costs would have been beyond FY2014 (probably an additional $2 billion per year for several more years), the expected costs through FY2014 would have already been huge, at $12.7 billion.  This is incredibly high for what should have been a relatively simple rocket (based on components that were already well used), although we will see similarly high costs in the development of the SLS.  Why they are so high is difficult to understand, particularly as the Ares I is a booster whose first stage is simply one of the solid rocket boosters from the Shuttle program (and indeed initially physically taken from the excess stock of such boosters left over at the end of the Shuttle program, although modified with an extra segment added in the middle to bring it to five segments from four).  And the second stage was to be built around an upgraded model of the old J-2 engine.
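
The arithmetic behind the $12.7 billion figure can be laid out in a short Python sketch.  The function and variable names are illustrative only.  The dollar figures are those cited above from the NASA budget documents, while the $2 billion per year beyond FY2014 is just the rough guess made in the text, not a NASA forecast.

```python
# Ares I development cost, in billions of US dollars (figures from the text).
spent_fy2005_fy2010 = 4.6      # actual spending before cancellation
forecast_fy2011_fy2014 = 8.1   # FY2010 budget forecast for the following four years


def ares_i_cost(extra_years_after_fy2014=0, assumed_annual_rate=2.0):
    """Total expected development cost.

    extra_years_after_fy2014 and assumed_annual_rate are hypothetical
    inputs: the text only guesses at roughly $2 billion per year.
    """
    return (spent_fy2005_fy2010 + forecast_fy2011_fy2014
            + extra_years_after_fy2014 * assumed_annual_rate)


print(round(ares_i_cost(), 1))    # total through FY2014
print(round(ares_i_cost(2), 1))   # if development had run two more years
```

Each additional year of development at the assumed rate simply adds another $2 billion to the total, which is why the $12.7 billion should be read as a floor.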

In sharp contrast to these costs for the Ares I, the development costs of the similarly sized Falcon 9 rocket as well as the far larger Falcon Heavy are tiny, at just $300 million and $500 million respectively.  Are those figures plausible?  Since SpaceX is not a publicly listed company, its financial statements are not published.  However, it does have funders (both banks and others providing loans, as well as those taking a private equity position) so financial information is made available to them.  While confidential, it often leaks out.  Plus there are public statements that Elon Musk and others have made.  And importantly, as a start-up founded in 2002, it was a small company without access to much in the way of funding in the period.  They could not have spent billions.

One should acknowledge, however, as Elon Musk repeatedly has, that NASA provided financial assistance at a critical point.  SpaceX, Tesla, and Elon Musk personally, were all running low on cash in 2006, were burning through it quickly, and would soon be out of funds.  Then, in late 2006, NASA awarded SpaceX a $278 million contract under its new COTS (Commercial Orbital Transportation Services) program, to be disbursed as identified milestones were reached.  SpaceX was among more than 20 competitors for funds under this program, with SpaceX and one other (Orbital Sciences, with its own launch vehicle and spacecraft) winning NASA support.  The funding to SpaceX was later raised to $396 million (with additional milestones added) and was used to support the development of the Falcon 9 rocket, of the original version of the Dragon capsule to ferry cargo to the ISS, and the cost then to fly three demonstration flights (later collapsed to two) showing that the systems worked.  The second (and final) demonstration mission was a fully successful launch in May 2012 of the Falcon 9 carrying the Dragon capsule with cargo for the ISS, which successfully docked with the ISS and later returned to earth.  Following this, NASA has contracted with SpaceX for a series of cargo resupply missions to the ISS under follow-on contracts (under CRS, for Commercial Resupply Services) where it is paid for each successful mission.  As of this writing, SpaceX is now at the 23rd flight under this program.

NASA funds were important.  But they were only partial and not large, at less than $400 million to support the development of the Falcon 9, the Dragon capsule for cargo, and the initial demonstration flights.  They are consistent with a cost of developing the Falcon 9 alone of about $300 million.

The specific figure of $300 million to develop the Falcon 9 comes from a statement Elon Musk made in May 2011 on SpaceX’s history to that point.  He wrote that total SpaceX expenditures up to that point had been “less than $800 million”, with “just over $300 million” for the development of the Falcon 9.  The rest was for the development of the Dragon spacecraft (used to deliver cargo to the ISS) for $300 million, the cost of developing and testing in five flights SpaceX’s initial rocket the Falcon 1 (which had a single Merlin engine newly developed by SpaceX – the Falcon 9 uses nine Merlin engines), the costs of building launch sites for the Falcon rockets at Cape Canaveral, Vandenberg, and Kwajalein in the Pacific, as well as the cost of building all the corporate manufacturing facilities for the Falcon rockets and the Dragon.  Musk noted that the financial accounts are confirmed by external auditors, as they would be for any sizeable firm.

Separately, in 2017 a Senior Vice President of SpaceX (Tim Hughes), in testimony to Congress, noted that the development cost had been $300 million for the Falcon 9 and $90 million for the earlier Falcon 1 rocket, and that NASA had independently verified these figures (in a NASA report, as later updated).

The $300 million cost estimate looks plausible.  Unlike NASA (as well as firms such as Boeing), as a new start-up SpaceX simply would not and did not have the funding to spend much more.  But even if it were several times this, it would still be far less than what the cost of the similarly sized Ares I had been.

The estimate of $500 million to develop the Falcon Heavy also comes from statements made by Musk.  It is also plausible.  As noted above, the Falcon Heavy is basically a set of three Falcon 9 first-stage boosters strapped together, topped by a second stage (as well as payload fairing) that is the same as that on the Falcon 9.  Musk has noted that it was not as easy to develop the Falcon Heavy as they had initially expected (there are many complications, including the new aerodynamics of such a design), but even at $500 million the cost is a bargain compared to what NASA has spent to develop boosters.

The Space Launch System (SLS) has yet to fly.  As noted before, this will take place no earlier than November 2021, but many expect there will be further delays.  Furthermore, the plan is for only one test flight to be made.  It is not clear what will happen if this test flight is not successful.

One has in NASA budget documents how much has been spent each year for the SLS thus far, and what is anticipated will be required for the next several years.  A total of $26.3 billion will have been spent through FY2021 (i.e. to September 30, 2021).  But the SLS is not yet operational, and the NASA budgets do not provide a breakdown between the cost of developing the SLS and the cost of launching it.  And there is not a clear distinction between the two.  Indeed, even the initial test flight has been labeled the Artemis 1 mission.  It will not be manned, but it will carry the Orion spacecraft (also being tested) on a month-long flight that will take it to the moon, go into lunar orbit, and then leave lunar orbit to return to earth with a splashdown and recovery of the Orion.

If successful, the second launch of the SLS will not be until September 2023 at the earliest.  While this flight would be manned and would loop around the Moon, some, at least, consider it also a test flight – testing all the systems under the conditions of a crew on board.

In part this is semantics, but treating the period until the end of FY2023 as one where the SLS is still in the development phase, the total NASA is expected to have spent developing the SLS will be $32.4 billion.  While its payload capacity is 50% larger than that of the Falcon Heavy, it would have cost 65 times as much to bring it to the point of being operational.  While there are of course important differences, it is difficult to understand why the development of the SLS will have cost 65 times as much as, and possibly more than, the development of the Falcon Heavy.  It is especially difficult to understand as the rocket engines (the main cost for a booster) of the SLS are models used on the Space Shuttle, the strap-on solid rockets are also from the Space Shuttle, and the RL10 engine used on the second stage is derived from that used on earlier US rockets, dating all the way back to the 1950s.
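
The ratio of 65 follows directly from the figures cited.  A minimal Python sketch, using only the numbers in the text (the variable names are mine):

```python
# Development costs in billions of US dollars, as cited in the text.
sls_spent_through_fy2021 = 26.3
sls_expected_through_fy2023 = 32.4
falcon_heavy_development = 0.5

# How many times the Falcon Heavy development cost the SLS will have absorbed.
ratio = sls_expected_through_fy2023 / falcon_heavy_development

# Spending implied for FY2022 and FY2023 together.
implied_fy2022_fy2023 = sls_expected_through_fy2023 - sls_spent_through_fy2021

print(f"SLS / Falcon Heavy development cost: about {ratio:.0f}x")
print(f"Implied SLS spending in FY2022-23: ${implied_fy2022_fy2023:.1f} billion")
```

Even if the true Falcon Heavy figure were double what Musk has stated, the ratio would still be above 30.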

D.  The Cost of Launching the Rockets

Once developed, there is a cost for each launch.  One wants to know the pure marginal cost of an additional launch, excluding all of the development costs, as those costs are in the past and will be the same regardless of what is now done with the newly developed rocket (economists refer to those past costs as sunk costs).

In practice the costs can be difficult to separate.  For private, commercial, vehicles, there may be some public information on what the firm providing the launch services is charging, but the price being charged for any specific flight is often treated as private and confidential, where the agreed upon price was reached through negotiation.  And the price paid will presumably include some margin above the pure marginal costs to help cover (when summed across all the launches that will be done) the original cost of developing the rockets plus some amount for profits.  It is even more difficult to determine for the SLS, as one only has what is published in the NASA budget documents for the amount being spent on the overall SLS program, where that total combines the cost of both developing and then launching the vehicle.  NASA has not provided a breakdown, and deliberately so.  But one does have in the budget numbers a year-by-year breakdown, which one can use as the development costs (for the initial version of the SLS) will largely be incurred before the vehicle becomes operational, and the operational costs after.  This will be used below.

Even with such provisos, reasonable estimates of the costs are so hugely different that the basic message is clear:

SpaceX is most transparent on its costs.  Standard prices are given on its corporate website, of $62 million for a Falcon 9 launch and $90 million for a Falcon Heavy.  The site does not specify whether these are for the expendable or recoverable versions, but based on other information, it appears that the $62 million for the Falcon 9 reflects the cost of an expendable Falcon 9, while the $90 million for the Falcon Heavy is for the recoverable version.  The $62 million for the Falcon 9 is similar to what was charged in the early years for the Falcon 9 before the ability to recover its first stage booster was developed.  And Elon Musk has said that the cost of the fully expendable version of the Falcon Heavy maxes out at $150 million, which implies that the $90 million figure shown on its website is for the version where all three of the first stage boosters are recovered.

The $35 million figure for the cost of the Falcon 9 when its first stage is recovered is then an estimate based on a $62 million cost which is assumed to apply when the first stage cannot be recovered.  In an interview in 2018, Musk said that the cost of the first stage booster is about 60% of the cost of a Falcon 9 launch, with 20% for the second stage, 10% for the payload fairing, and 10% for the operations of the launch itself.  These are clearly rounded numbers, but based on them, 60% of $62 million is $37 million, with the remaining 40% then $25 million.  Assuming, generously, that the cost to refurbish the booster for a new flight, plus some amortization cost (e.g. $3.7 million per flight if it can be reused for 10 flights), would be $10 million, then a cost per flight with recovered first stage boosters would be about $35 million per flight.  This is broadly consistent with a statement made by Christopher Couluris (director of vehicle integration at SpaceX) in 2020 that SpaceX can bring down the cost per flight to “below $30 million per launch”, and that “[The rocket] costs $28 million to launch it, that’s with everything”.  The $35 million figure for the recoverable version of the Falcon 9 might well be on the high side, but as was noted previously, I am deliberately erring on the high side for the cost estimates of SpaceX and on the low side for the NASA vehicles.
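
The steps in this estimate can be written out as a small Python sketch.  The function name and its parameters are hypothetical, introduced only for illustration.  The inputs are the rounded figures quoted above, so the result should be read as a rough estimate and nothing more precise.

```python
def reusable_launch_cost(expendable_cost, booster_share, refurb_and_amortization):
    """Estimated cost of a launch that reflies a recovered first stage.

    Everything except the booster is treated as expended and paid for in
    full; the booster itself is replaced by a flat allowance covering
    refurbishment plus amortization.  Amounts are in millions of US dollars.
    """
    non_booster_cost = expendable_cost * (1 - booster_share)
    return non_booster_cost + refurb_and_amortization


# Inputs from the text: $62M expendable cost, booster about 60% of that,
# and a deliberately generous $10M allowance per reflight.
booster_cost = 62 * 0.60                      # about $37M
amortization_10_flights = booster_cost / 10   # about $3.7M per flight
estimate = reusable_launch_cost(62, 0.60, 10)

print(f"booster: {booster_cost:.1f}")
print(f"amortization over 10 flights: {amortization_10_flights:.1f}")
print(f"estimated cost with recovery: {estimate:.1f}")
```

With the $10 million allowance the estimate comes to just under $35 million.  Cutting the allowance to the $3.7 million amortization alone would bring it to about $28.5 million, close to the figure Couluris cited.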

Thus a figure of $35 million per launch of a Falcon 9 with the first stage booster recovered is a reasonable (and likely high) figure for what the cost is to SpaceX for such a launch.  The $62 million “list price” on the SpaceX website would then include what would be a generous (in relative terms) profit margin for SpaceX, covering the development costs and more.  According to the SpaceX website, as I write this there have been 125 launches of the Falcon 9 since its first flight in 2010; on 85 of these they have recovered the first stage booster, and on 67 flights they have reflown a recovered booster.  The first successful recovery of a first stage booster was in December 2015.

Competition matters, and following the more transparent prices being charged customers by SpaceX for the Falcon 9 and Falcon Heavy (at least transparent in terms of “list prices”), ULA in December 2016 set up a website called “RocketBuilder.com” where anyone can work out which model of the Atlas V they will need.  There are ten models available, carrying payloads to low earth orbit from a low of 9,800 kg for the Atlas V model 401, up to 18,850 kg for the Atlas V model 551.  As noted before, we are examining the model 551 here as its payload is closest to what the Falcon 9 can carry (22,800 kg in the expendable version and 20,000 kg in the recoverable version).  The RocketBuilder.com website was “launched” with substantial publicity on December 1, 2016, accompanied by an announcement of substantial cuts in their prices for the Atlas V.  The CEO of ULA, Tory Bruno, announced that prices for the Atlas V model 401 would start at $109 million – down from $191 million before.  The price of the Atlas V model 551 would be $179 million when combined with a “full spectrum” of additional ULA services.

When set up, the RocketBuilder.com website included, importantly, what the list price would be of the Atlas V rocket model chosen.  Unfortunately, the RocketBuilder.com website as currently posted does not show this.  The reason might be that the CEO of ULA recently announced, on August 26, 2021, that ULA will take no more orders for flights of the Atlas V.  The Atlas V uses the Russian-made RD-180 rocket engine (one for each booster, with two thrust chambers), and for national security reasons ULA has been required to cease purchasing these engines.  It must instead develop a new booster with key components all made in the US.  The RD-180 is an excellent engine technically, and is also both highly reliable and relatively inexpensive.  The decision to purchase it, from Russia, was made in the 1990s, and its first flight (on an Atlas III booster) was made in May 2000.  But political conditions have changed, and the most important client for ULA is the US Defense Department.

ULA has now received its final shipment of six RD-180 engines from Russia, and there will be a further 29 Atlas V flights (of all models, not just the model 551) up to the mid-2020s, using up the stock of RD-180 engines ULA has accumulated.  They have now all been booked.  ULA now hopes to launch next year, in 2022, its first test flight of the rocket it has been developing to replace both the Atlas V series and the Delta IV Heavy, which it has named the Vulcan Centaur.  It will use the new BE-4 engines being developed by Blue Origin.  But that first test flight, originally planned for 2019, has been repeatedly delayed.

However, while the current RocketBuilder.com website of ULA no longer shows the cost to a customer of a launch of an Atlas V, one can find the former prices at an archived version of the RocketBuilder.com website.  While these are prices from a few years ago, they do not appear to have changed (at least as list prices).

The selection is much like that of finding the list price of a new car by going to the manufacturer’s website, selecting the model, adding various options, and then additional services one might want.  For the Atlas V, one can choose various levels of services, from a “Core” option to “Signature”, “Signature pro”, “Full Spectrum”, and “other customization”.  These appear to relate mostly to the division of responsibilities between ULA and the customer on various aspects of integrating the payload with the rocket.  ULA also offers two service packages it calls “Mission Insight” (things such as special access to ULA facilities) and “Rocket Marketing” (pre-launch events, press materials, videos, even “mission apparel”) that provide different levels of services and access.  It is sort of like the higher levels of benefits granted by airlines to their frequent flyers, although here they charge an explicit price for the package.

On the archived website, selecting the payload capacity and orbit that will lead to an Atlas V model 551 being required, the base cost (in 2016) shows as $153 million (as I write this in September 2021).  However, with a “Signature” level of service (which might be the base level required, as the “Core” option is not being allowed for some reason), the cost will be $163 million.  And $173 million for the “Full Spectrum” package.

The website also prominently displays a line for “ULA Added Value” which is then subtracted from that cost.  This does not reflect an actual price reduction by ULA, but rather savings that ULA claims the customer will benefit from if they choose an Atlas V launch by ULA.  The base (default) value of these savings that ULA claims the customer will benefit from is $65 million.  A breakdown shows this is made up of a claimed reduction in insurance costs of $12 million (what it otherwise would have been is not shown – just the “savings”), $23 million because ULA claims they will launch when it is scheduled to be launched and not several months later (which is more than a bit odd – one could produce whatever “savings” one wants by assuming some degree of delay in launch otherwise), and $30 million for what they call “orbit optimization”, which is a claim that the orbit they will place it in will lead to a lifetime for the satellite that is 17 rather than 15 years.

With such “savings” of $65 million (in a base case), ULA claims the actual cost for an Atlas V model 551 launch would not be $163 million, say (in the case considered above), but $65 million less, and thus only $98 million.  While still more than 50% higher, this brings it closer to the Falcon 9 list price of $62 million.  But this all looks like a marketing ploy – indeed rather like a juvenile charade – as that number depends on supposed savings from hypothetical levels.  The amount paid to ULA would still be the $163 million in this example.  And Elon Musk, among others, has questioned the assumptions.
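
To make the ULA arithmetic explicit, here is a short Python sketch using the figures quoted above from the archived RocketBuilder.com page.  The dictionary keys and variable names are my own labels, not ULA's.

```python
# ULA "Added Value" claims for an Atlas V model 551, in millions of US dollars.
claimed_savings = {
    "insurance": 12,
    "on-time launch": 23,
    "orbit optimization": 30,
}
signature_price = 163      # what the customer would actually pay ULA
falcon_9_list_price = 62

total_claimed = sum(claimed_savings.values())
net_of_claims = signature_price - total_claimed

print(f"claimed savings: {total_claimed}")
print(f"price net of claimed savings: {net_of_claims}")
# Ratio to the Falcon 9 list price, with and without the claimed savings.
print(f"net of claims vs Falcon 9: {net_of_claims / falcon_9_list_price:.2f}x")
print(f"price paid vs Falcon 9:    {signature_price / falcon_9_list_price:.2f}x")
```

On the price actually paid, the Atlas V 551 comes to more than two and a half times the Falcon 9 list price.  Only after subtracting all three claimed savings does the gap narrow to a bit under 60%.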

The commonly cited $165 million cost of a launch of an Atlas V is therefore a reasonable estimate.  One should, however, keep in mind that this is both a “list price” subject to negotiation and that depending on the specific options chosen, the price could easily vary by $10 or $20 million around this.  The “savings” figures of ULA should not be taken too seriously, however.  There will be specific factors affecting costs and possible savings with any given payload, for other rockets as well as the Atlas V, and comparisons to some hypothetical will depend on whatever is chosen for that hypothetical.

ULA has provided less public material on the cost of a Delta IV Heavy launch.  This is in part because all of the customers, since the initial test flight in 2004, have been US government entities, and in particular the US Department of Defense.  There have only been 12 launches since that initial test flight, with ten of these classified missions for the Defense Department and two for NASA.  Furthermore, only three more are planned (two in 2022 and one in 2023), with ULA offering the planned Vulcan Centaur rocket (of which there will be a series of models that can carry progressively larger payloads, as with the Atlas V) as a substitute that can carry payloads of a similar size.

Both the Defense Department (especially) and NASA are less than fully transparent on what they have paid for these Delta IV Heavy launches.  The specific costs of the launches can be buried in the broader costs of the overall programs.  But the figures cited for a Delta IV Heavy launch have typically been either $400 million (in a statement by ULA in 2015) or $350 million more recently.  It may well be that, under pressure from the far lower costs of SpaceX, ULA has reduced its price over time.  For the purposes here, and erring on the side of being generous to ULA, I have used for the calculations a price of $350 million for a launch of an additional Delta IV Heavy.

The cost of an additional SLS launch is an estimated $2 billion, but there are conceptual as well as other issues with this figure.  First of all, NASA refused to release to Congress, or to anyone else for that matter, what the cost of an additional launch would be.  Rather, one only had a single line item in the budget for the combined year-by-year cost of developing and testing the SLS, and also for building and then flying it.  That cost reached $3.1 billion in FY2020, was again $3.1 billion in both FY2021 and FY2022, and is then forecast to decline slowly but still be at $2.8 billion in FY2026.  The SLS has not yet flown, and its first (uncrewed and the only planned) test flight is now scheduled for November 2021.  The first operational flight (with a crew of four) would not be until 2023 at the earliest, with the second in 2024 at the earliest.  The NASA plan is that there would then be one flight per year starting in 2026 and continuing on into the indefinite future.

But an estimate of the cost of an additional launch of the SLS leaked out, possibly due to an oversight but possibly not, in a letter sent to Congress in October 2019 from OMB. The letter addressed a range of budget issues for all agencies of the government, and set out the position of OMB and hence the administration on matters then being debated.  One was on use of the SLS.  Senator Richard Shelby of Alabama, who was then chair of the Senate Committee on Appropriations, had included in the language of the draft budget bill a requirement that NASA use the SLS for the launch of the planned NASA Europa Clipper mission (a satellite to Europa, a moon of Jupiter).  In a paragraph on page 7 of the letter, OMB recommended against this, as there is “an estimated cost of over $2 billion per launch for the SLS once development is complete”.  The letter noted that a commercial launch vehicle could be used instead for a far lower cost.

NASA later admitted (or at least would not deny) that this would be a reasonable estimate of the additional cost of such a launch.  And it is consistent with the budget forecasts that the SLS program would continue to require funding of close to $3 billion each year once flights had begun (at a pace of one per year, or less).  While the $3 billion is still greater than a figure of $2 billion per flight, the development costs for the SLS program will not end when the first SLS booster is operational.  The initial SLS, while a sizeable rocket, would still not have the lifting capacity that would be needed (under current NASA plans) for the planned lunar landings following the very first.

Specifically, the initial model of the SLS (scheduled to be tested this November 2021) is labeled the Block 1, and has a lifting capacity of 95,000 kg to low earth orbit.  The figures for the Block 1 are the ones that are being used in the charts in this post.  However, its capacity would only be sufficient for the first three flights (including the test flight), where the third flight would support the first landing of a crew on the moon under the Artemis program.  Following that, a higher capacity model, labeled the SLS Block 1B, would need to be developed, with a lifting capacity to low earth orbit of 105,000 kg.  To achieve that, a new second stage would be developed using four of the RL10 rocket engines (versus a second stage with just a single RL10 engine in the Block 1 version).

Under current NASA plans the Block 1B version of the SLS would then be used only for four flights.  For missions after that an even heavier lift version of the SLS, labeled the Block 2, would be needed, with two more powerful solid rocket boosters strapped on to the first stage (instead of the solid rocket boosters derived from those used on the Space Shuttle).  These would increase the lifting capacity to 130,000 kg to low earth orbit.  Part of the reason for developing the Block 2, with the new solid rocket side boosters, is that NASA will have used up by then its excess inventory of solid rocket booster segments (from the Space Shuttle program) for the planned launches of the Block 1 and Block 1B versions of the SLS (with one set in reserve).  Using up the existing inventory makes sense.  It should save money – although those savings are difficult to see given the expense of this program.  But that inventory is limited and will suffice only for up to eight flights of the SLS.  Hence the need for a replacement following that, which led to the design for the Block 2.

There will be development costs for the new second stage (with four RL10 engines rather than one) for the Block 1B and then for the new, more powerful, strap-on solid rocket boosters for the Block 2.  What share of the approximately $3 billion that would be spent each year would go to the development of these new models of the SLS has not been broken out in the NASA budget – at least not in what has been made public.  But given that only very limited work has been done thus far on the new second stage for the Block 1B and even less on the new solid rocket boosters to be used for the Block 2, continuing development costs of $1 billion per year look plausible.

At $2 billion per flight, the cost of a SLS launch is huge.  And this does not include any amortization to cover the development costs.  As noted above, those costs are expected to reach over $32 billion by FY2023.  The costs per launch for the other rockets shown on the chart, including the Falcon Heavy, will include in the prices charged some margin to cover the original development costs.  Commercial companies must do this to recover the costs of their investments.  That amount would be gigantic if added for the SLS in order to make its cost figure more comparable to that of the alternatives.

The question is how many flights of the SLS there will be before a more cost-effective alternative starts to be used.  Note that the alternative need not be limited to another giant rocket with a similar lift capacity.  The SLS itself will not be large enough to carry in a single launch all that will be required on the Artemis missions to the moon.  Rather, there would be separate launches on a range of boosters to carry what would be required.  Indeed, a NASA plan developed in 2019 for the launches that will be necessary through to 2028 as part of the Artemis missions to the Moon envisaged 37 separate launches, of which only 8 would be of the SLS (including its initial test launch).  One can break up the cargoes in many different ways.

While speculative, and really only for the point of illustration, one might assume that there will be perhaps 10 flights of the SLS before more cost-effective alternatives are pursued.  If so, then to cover the over $32 billion development cost one would need to add over $3 billion per flight to make the figures comparable to the costs of the other, commercial, launchers.  That is, the cost would then be over $5 billion per flight for the SLS rather than $2 billion.  This would, however, now be more of an average cost per flight than a true marginal cost, and speculative as we do not know how many flights of the SLS there will ultimately be (other than that it will not likely be many, given its huge cost).  Hence I have kept to the $2 billion figure, which is already plenty high.
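
The effect of the assumed number of flights on the average cost can be sketched in a few lines of Python.  The function name is hypothetical, and the flight counts are purely illustrative, as they are in the text.  The $32.4 billion and $2 billion figures are those cited above.

```python
def average_cost_per_flight(development_cost, marginal_cost, flights):
    """Average cost per flight when development is spread over all flights.

    Amounts are in billions of US dollars.  The number of flights is a
    purely hypothetical input.
    """
    if flights <= 0:
        raise ValueError("flights must be positive")
    return marginal_cost + development_cost / flights


# $32.4 billion development cost and $2 billion per launch, from the text.
for flights in (5, 10, 20):
    cost = average_cost_per_flight(32.4, 2.0, flights)
    print(f"{flights:>2} flights: ${cost:.2f} billion per flight")
```

The fewer flights there turn out to be, the higher the average.  Even with 20 flights the amortized development cost would still exceed $1.5 billion per flight on top of the $2 billion marginal cost.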

Even at $2 billion per flight for the SLS, the cost is over 13 times the cost of a Falcon Heavy (in the version where the boosters are all thrown into the ocean rather than recovered).  The lift capacity of the SLS is 50% more, but it is difficult to imagine that that extra capacity could only be achieved at a cost (even ignoring the huge development cost) that is more than 13 times as much.

What has happened on the Europa Clipper mission provides a useful lesson.  Following a review and consultations with Congress, the Biden administration on July 23, 2021, announced that the Europa Clipper would be flown on a Falcon Heavy instead of the SLS.  The total contract amount with SpaceX for all the launch services is just $178 million (which will include the special costs of this unique mission).  There were several reasons to make the change, in addition to the savings from a cost of $178 million rather than $2 billion.  One is simply that by using the Falcon Heavy they will be able to launch in October 2024.  No SLS will be available by that time, nor indeed for several years after.  While a more direct route to Jupiter would have been possible with the heavier lift capacity of the SLS, the Europa Clipper would have had to be kept in storage for several years until a SLS rocket became available.  Separately, NASA discovered there would be a severe vibration issue due to the solid rocket boosters on the SLS, which the delicate spacecraft would not have been able to handle.  To modify the Europa Clipper to make it able to handle those vibrations would have cost an additional $1 billion.

Finally, it is clear that the politics has changed.  Senator Shelby of Alabama has been the figure most insistent on requiring use of the SLS to launch the Europa Clipper.  With the NASA Marshall Space Flight Center (located in Huntsville, Alabama) the lead NASA office responsible for the SLS, a significant share of what is being spent on the SLS is being spent in Alabama.  And as Chair of the Senate Committee on Appropriations, Senator Shelby was in a powerful position to determine what the NASA budget would be.  But as a Republican, Senator Shelby lost the chairmanship when the Senate came under Democratic control in January 2021, plus he has announced he will not run for re-election in 2022. His influence now is thus not what it was before, and NASA can now pursue a more rational course on the launch vehicle.

E.  The Cost of Developing and Operating Spacecraft for Crews

NASA has also used its new, more commercial, contracting approach for the development and then use of private spacecraft to carry crew to the ISS.  This was indeed the proposal of President Obama in 2010 that was so harshly criticized, as discussed at the beginning of this post.  We now know how that has worked out:  SpaceX is flying crews to the ISS routinely, while Boeing, a traditional aerospace giant which was supposed to be the safe choice, has had issues.  We also can compare the costs under this program (for both SpaceX and Boeing) to that of developing the Orion spacecraft, where Lockheed Martin is the prime contractor operating under the more traditional NASA contracting approach.

There are, of course, important differences between the Orion and the spacecraft developed by SpaceX (its Crew Dragon, sometimes referred to as the Crew variant of the Dragon 2 as the capsule is a model derived from the original Dragon capsule used for ferrying cargo to the ISS), and by Boeing (which it calls the CST-100 Starliner, or just Starliner for short).  The SpaceX Crew Dragon and the Boeing Starliner will both be used to ferry astronauts to the ISS in low earth orbit, while the Orion is designed to carry astronauts to the Moon and possibly beyond.

But there are important similarities.  They are all capsules, use heat shields for re-entry, and can seat up to six astronauts (Orion) or seven (Crew Dragon and Starliner), even though NASA plans so far have always been for flights of just four astronauts each time.  They are all, in principle, reusable spacecraft. The interior volume (habitable space for the astronauts) is 9 cubic meters on Orion, 9.3 cubic meters on Crew Dragon, and 11 cubic meters on Starliner.

Orion will also be launched with a Service Module attached, which is being built by Airbus under contract to the European Space Agency.  This Service Module will have the fuel and engines required to help send Orion from earth orbit to the moon, and then fully into lunar orbit and back, as well as power (from solar panels) and supplies of certain consumable items required for longer space flight durations.  With this, Orion will be able to undertake missions of up to 21 days.  The self-contained Crew Dragon can carry out missions of up to 10 days, while the Starliner has the capacity of just 2 1/2 days – providing time to reach the ISS and later return, but not much else.

The cost of developing and building the European Service Module for Orion is being covered by the European Space Agency as its contribution to the program.  For better comparability to the Crew Dragon and Starliner spacecraft, the costs of the Service Module for Orion have been excluded from the cost of Orion in the charts below, as it is primarily the Service Module that will give the Orion the capabilities to go beyond earth orbit – capabilities that the Crew Dragon and Starliner do not have.  Had the costs of the Service Module been included (as the Orion is, after all, dependent on it), the disparity in costs between it and the Crew Dragon or Starliner would have been even larger.

The development of Orion began in 2004, as part of the Constellation program of the Bush administration, and has continued ever since.  NASA spent $1.4 billion on it in FY2020, again in FY2021, and the budget proposal is to do so again in FY2022.  Aside from an uncrewed flight in 2014 that was principally to test its heat shield design, the Orion has yet to fly.  Its first real test, still unmanned, will be as part of the first test flight of the SLS, which as noted above is now scheduled for November 2021.

SpaceX and Boeing were awarded the new form of competitive contracts by NASA to build their new spacecraft, demonstrate that they work (with a successful unmanned test flight to the ISS and then a manned flight test), and then fly them on six regular missions carrying NASA astronauts to the ISS.  The designs were by the companies – NASA was only interested in safe and successful flights ferrying crew to the ISS.

Each contractor could use whatever booster they preferred (SpaceX chose the Falcon 9 and Boeing the Atlas V), with the costs of those rocket launches included in the contracts.  The contract awards were announced in September 2014, several years later than the Obama administration had initially proposed due to lack of congressional funding.  The original contracts provided awards of up to $4.2 billion to Boeing and $2.6 billion to SpaceX, a discrepancy that reflected not that SpaceX would provide a lesser service, but rather that SpaceX offered in their contract bid a lower price.  Boeing was later granted an extra $287.2 million by NASA, in a decision that was criticized by the Office of the Inspector General of NASA, as Boeing (as well as SpaceX) had committed to provide the services agreed to under the contracts for the fixed, agreed upon, price.  Any cost overruns should then have been the responsibility of the contractor.  While Boeing argued it was not really a cost overrun under their contract, others (including the NASA Inspector General) disagreed.

Before the main contracts under the program had been approved, Boeing and SpaceX (along with others) had received smaller contracts to develop their proposals as well as to develop certain technologies that would be needed.  Including those earlier contracts (as well as the extra $287.2 million for Boeing), the total NASA would pay (provided milestones are reached) is $5,108.0 million to Boeing and $3,144.6 million to SpaceX.  For this, each contract provided that the new spacecraft would be developed and tested, with this then followed by six crewed flights of each to the ISS.  Thus the contracts include a combination of development and operational costs, which will be separated in the discussion below.

First, the development costs:

The estimates of the development costs for the SpaceX Crew Dragon and Boeing Starliner were made by subtracting, from the overall program costs, estimates made in the November 2019 report of the NASA Inspector General’s Office of the costs of the operational (flight) portions of the contracts.  Included in the development costs are the costs of the earlier contracts with SpaceX and Boeing to develop their proposals, as well as the extra $287.2 million that was later provided to Boeing.  Based on this, the total cost (to NASA) of supporting the development of the SpaceX Crew Dragon has been $1.845 billion, while the cost to NASA of the Boeing Starliner (assuming Boeing is ultimately successful in getting it to work properly) will be a bit over $3.0 billion.

The costs assume that the contractors will carry out their contractual commitments in full.  SpaceX so far has (the Crew Dragon is fully operational, and indeed SpaceX is now in its second operational flight, with a crew of four now at the ISS who are scheduled to return in November).  But Boeing has not.  As noted before, its initial unmanned test flight in December 2019 of the Boeing Starliner failed.  The planned re-try was on the pad in late July of this year and expected to fly within days when problems with stuck valves were discovered.  The Starliner had to be taken down and moved to a facility to identify the cause of the problem and fix it, with the flight not now expected until late this year at the earliest.  The extra costs are being borne by Boeing and, in principle, should be added to the $3,008 million cost figure in the chart above.  But they have been kept confidential, so we do not know what that addition would be.

In contrast to the cost to NASA of $1.845 billion for the SpaceX Crew Dragon and $3.0 billion to Boeing for its Starliner (under the new, competition-based, contractual approach), the amount NASA has spent on the Lockheed Orion spacecraft (under its traditional contractual approach) has been far higher.  More than $19.0 billion has already been spent through FY2021, and Orion is still in development.  Other than the early and partial test in 2014, the Orion has yet to be fully tested in flight.  The first such test is currently scheduled, along with the first test of the SLS, for later this year.  At best, it will not be operational until 2023, although more likely later.  Just adding what is anticipated will be needed to continue the development of Orion through FY2023, the total that NASA will spend on it will have reached $21.8 billion.  But the FY2023 cut-off date is in part arbitrary.  While the Orion capsule should be flying by then, there will still be additional expenditures to finalize its design and for further development.  These would add to the overall cost, but we do not know what those are expected to be.

Including costs just through FY2023, the cost of developing Orion is already close to 12 times what it has cost to develop the SpaceX Crew Dragon, and over 7 times what it has cost NASA to develop the Boeing Starliner.  While there are of course differences between the spacecraft, and it may be argued that the Orion is more capable, it is hard to see that such differences account for a cost that is 12 times that of the SpaceX Crew Dragon, or even 7 times the cost of the Boeing Starliner.  And as noted above, the greater capabilities of the Orion derive primarily from the European Service Module, whose costs are not included in the $21.8 billion figure for Orion.
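
The multiples cited above can be checked directly from the development figures already given in the text.  This is only a sketch of that division; the variable names are mine, and the Starliner figure uses the $3,008 million quoted earlier.

```python
# Development costs to NASA in billions of US dollars, as quoted in the text.
orion_dev = 21.8          # through FY2023, excluding the European Service Module
crew_dragon_dev = 1.845
starliner_dev = 3.008     # the "$3,008 million" figure

ratio_dragon = orion_dev / crew_dragon_dev
ratio_starliner = orion_dev / starliner_dev

print(f"Orion vs Crew Dragon: {ratio_dragon:.1f}x")
print(f"Orion vs Starliner:   {ratio_starliner:.1f}x")
```

The results fall just under 12 and a little over 7, consistent with the "close to 12 times" and "over 7 times" stated above.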

The operational costs of the Orion will also be higher, using for comparability what it would be for a flight to earth orbit.  The most relevant figure is the cost per seat, and the calculations assume four seats will be filled on each flight (as NASA in fact plans, for both the missions to take astronauts to the ISS as well as for the Orion missions):

The costs include not only the cost of using the spacecraft itself, but also, and importantly, the cost of the rocket used to launch the spacecraft into orbit.  The costs of the rockets were included in the NASA contracts with SpaceX and Boeing, as the contracts were for the delivery of crews to the ISS.

The per-seat costs for the SpaceX Crew Dragon and Boeing Starliner contracts were calculated following the approach the NASA Inspector General used in its November 2019 report, using its estimate of the operational portion of the contracts with SpaceX and Boeing.  They come to $54.2 million per seat on the SpaceX Crew Dragon and $87.5 million on the Boeing Starliner (before rounding – in the Inspector General’s report one will see rounded figures of $55 million and $90 million, respectively).
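
As a consistency check on these per-seat figures, one can back out the operational portion of each contract from numbers given earlier in this post: the total NASA would pay under each contract, less the development cost.  This is my own reconstruction, not the Inspector General's calculation, and since the development figures are rounded the results should only be expected to match approximately.

```python
# All dollar figures in millions, taken from the text above.
contracts = {
    # name: (total NASA payments incl. earlier contracts, development cost)
    "SpaceX Crew Dragon": (3144.6, 1845.0),
    "Boeing Starliner": (5108.0, 3008.0),
}

FLIGHTS = 6            # operational missions under each contract
SEATS_PER_FLIGHT = 4   # NASA's planned crew size per flight

per_seat = {}
for name, (total, development) in contracts.items():
    operational = total - development   # what remains for the six flights
    per_seat[name] = operational / (FLIGHTS * SEATS_PER_FLIGHT)
    print(f"{name}: operational ${operational:,.1f}M, "
          f"per seat ${per_seat[name]:.2f}M")
```

The implied per-seat costs come out within a small rounding difference of the $54.2 million and $87.5 million cited above.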

The costs of building a new Orion capsule (which can then be reused to some degree) and flying it can be estimated from the announced NASA contracts with Lockheed for future missions.  In September 2019, NASA announced that it had awarded Lockheed an “Orion Production and Operations Contract”, where NASA would pay Lockheed for the Orion spacecraft for use on planned Artemis missions, but where the Orion spacecraft themselves would be reused to a varying degree that will rise over time.  The contracts for the Orions to be used in the first two flights (Artemis I and II) were signed some time before, and one can view these as part of the development costs (as these will be missions testing the Orion capsules).  The September 2019 announcement was that Lockheed would be paid a total of $2.7 billion for the next three missions (Artemis III, IV, and V), with re-use started to a limited degree.  Some high-value electronics, primarily, from the Orion used on the Artemis II mission would be re-used in the capsule for Artemis V.  Future costs would then fall further with greater re-use, but this should still be seen as speculative at this point.

Based on the $2.7 billion figure for the three Artemis missions following the first two, and with four seats on each of those three flights, the per-seat cost for the Orion alone would be $225 million.  To this one would need to add, for comparability, the cost of the rocket launcher.  The Artemis missions would use the SLS, which as discussed above, will cost $2.0 billion per flight.  This would add $500 million per seat (with the four seats per flight), bringing the total to $725 million per seat.

While that is indeed what the cost would be for the lunar missions, it is not an appropriate comparator to the costs of the Crew Dragon and Starliner capsules as the rockets they need are just for earth orbit.  For this reason, for the figure in the chart I have used the per-seat cost of what a launch on an Atlas V would be.  The Atlas V is the vehicle that will be used for the Boeing Starliner, and it has a comparable weight to the Orion (excluding the Orion European Service Module).  That per-seat cost, for a launch to earth orbit, would be $266.25 million.

Based on these figures, the operational cost per seat of an Orion capsule is almost 5 times what the per-seat cost is for the SpaceX Crew Dragon, and 3 times the cost on a Boeing Starliner.  These are huge differences.
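
The Orion per-seat arithmetic in the preceding paragraphs can be laid out step by step.  In this sketch, the Atlas V launch price is not a sourced figure: it is simply what the $266.25 million total implies once the capsule cost is subtracted, and should be read as such.

```python
SEATS = 4  # seats assumed filled on every flight

# Orion capsule: $2.7 billion for Artemis III, IV and V (figures in $ millions).
capsule_per_seat = 2700.0 / 3 / SEATS

# Lunar case: add one SLS launch at $2.0 billion per flight.
lunar_per_seat = capsule_per_seat + 2000.0 / SEATS

# Earth-orbit case: the text's total, and the Atlas V launch price it implies.
earth_orbit_per_seat = 266.25
implied_atlas_v_launch = (earth_orbit_per_seat - capsule_per_seat) * SEATS

# Per-seat costs of the commercial capsules, as quoted in the text.
crew_dragon_per_seat = 54.2
starliner_per_seat = 87.5

print(f"Orion capsule per seat:      ${capsule_per_seat:.2f}M")
print(f"Orion + SLS per seat:        ${lunar_per_seat:.2f}M")
print(f"Implied Atlas V launch cost: ${implied_atlas_v_launch:.2f}M")
print(f"Orion vs Crew Dragon: {earth_orbit_per_seat / crew_dragon_per_seat:.1f}x")
print(f"Orion vs Starliner:   {earth_orbit_per_seat / starliner_per_seat:.1f}x")
```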

F.  Conclusion

There was vehement opposition to Obama’s proposal to follow a more commercial approach to ferry crew to the ISS.  This came not only from former astronauts – who as pilots and engineers were taking a position on an issue they really did not know much about, but who were comfortable with the traditional approach.  Of more immediate importance, it came from certain politicians – in particular in the Senate.  The politicians opposed to the Obama proposals, led by Senator Shelby of Alabama, were also mostly (although not entirely) conservative Republican politicians who on other issues claimed to be in favor of free-market approaches.  Yet not here.

We now know that SpaceX delivered on the contracts, with now routine delivery of both cargo and crew to the ISS. Indeed, as I complete this post, an all-private crew of four have just returned from a three-day flight to earth orbit on a SpaceX Crew Dragon spacecraft (launched on a Falcon 9).  The flight was a complete success, and showed that flights of people to orbit are no longer restricted to a very small number of large nation-states (specifically Russia, the US, and China).  NASA certainly played an important role in supporting the development of the Falcon 9 and the Crew Dragon, as discussed above, but these flights are now private.  If Senator Shelby and his (mostly) Republican colleagues had gotten their way, this never would have happened.  The hope that this would follow was, however, an explicit part of the plan when the Obama administration proposed that NASA contract with private providers to bring crew to the ISS.  And it has.

Boeing is not yet at the point that SpaceX has reached, with its Starliner capsule still to be proven, but it appears likely that they will have worked through their problems by sometime next year (approximately three years after SpaceX succeeded with its first tests).  Meanwhile, even though work on the Orion spacecraft began in 2005 and work on the SLS began in 2011, both the SLS and Orion are still to be tested.  The SLS was supposed to be operational in 2016, but its first operational flight is now scheduled for 2023 and will almost certainly be later.  The key components of the SLS (the engines and the strap-on solid rocket boosters) were all taken from the Space Shuttle or even earlier designs.  It is not at all clear why this should have taken so long.

We also can now work out reasonable estimates of the costs, and can compare them to the costs under the more commercial approach.  In terms of the development costs (planned through FY2023), the SLS will cost an astonishing 65 times what it cost to develop the Falcon Heavy.  The SLS will be able to carry a heavier load, but only about 50% more than what a Falcon Heavy can carry.  It is difficult to see why this would cost 65 times as much.  And it is not just the SLS.  The cost of developing the Ares I, including what had been planned to be spent through FY2014 (when it still would not have been fully ready) would have been 42 times what the similarly sized Falcon 9 cost to develop.  These are mind-bogglingly high multiples.

The operational costs per launch are also high multiples of what the costs are for commercially developed rockets.  The cost of a launch of the SLS will be 22 times the cost of the recoverable Falcon Heavy.  While it can carry more, the cost per kilogram to low earth orbit will be 13 times higher for the SLS (excluding its development costs) compared to that for the recoverable version of the Falcon Heavy, and 9 times higher when compared to the expendable version.

Similarly, it is expected that development of the Orion capsule (not counting the cost of the Service Module, which the Europeans are developing as their contribution to the program) will by FY2023 have cost almost 12 times the cost of developing the SpaceX Crew Dragon.  And the operational cost per seat will be 5 times higher for the Orion than the cost for the Crew Dragon flights, and 3 times higher than for the Boeing Starliner.

The evidence is clear.  Why, then, are the conservative Republican Senators and Members of Congress (as well as a few Democrats, including, significantly, Representative Eddie Bernice Johnson, D-Texas, who is the current Chair of the House Committee on Science, Space, and Technology) so opposed to NASA entering into commercial contracts with SpaceX and others?  The answer, clearly, is the politics of it.  Spending billions of dollars on such hardware keeps many employed, and many of those jobs are in high-wage engineering and technical positions.  From this perspective, the high costs are not a flaw but a feature.

This is not only a waste.  Since budgets are not unlimited, such waste has also meant long delays in achieving the intended goals.  The space program has traditionally enjoyed much goodwill in the general population.  But such waste, as well as the resulting long delays in achieving the intended aims, could destroy that goodwill.

That would be unfortunate, although not the end of the world.  One does, however, see the same issues with the military budget, where the stakes are higher.  And the costs are also much higher, with major military programs now costing in the hundreds of billions of dollars rather than the tens of billions for the space program.  An example has been the development of the F-35 fighter jet.  The program began in 1992, the first prototypes (of Lockheed and Boeing) flew in 2000, Lockheed won the contract in 2001, the first planes were manufactured in 2011, and the first squadron became operational in 2015.  That is, it took 23 years to go from the initial design and conceptual work to the first operational unit.  Furthermore, it is expected to be the most expensive military program in history, with over $400 billion expected to be spent to acquire the planes and a further $1.1 trillion to keep them operational over a 50-year life cycle for the program.  That is a total cost of $1.5 trillion, and other estimates place the cost at $1.6 or even $1.7 trillion (and no one will know for sure what it will be until this is all history).

The factors driving such high costs as well as multi-decade time frames to go from concept to operations are undoubtedly similar to those that have driven this for major NASA programs such as the Orion and the SLS.  Spending more is politically attractive to those politicians that represent the states and districts where the spending will be done.  But for the military, the stakes (and not simply the dollar amounts) are a good deal higher than they are for the space program.

But it should also be recognized that the cure for this is likely to be more complicated and difficult than what NASA has been able to achieve through changes to its traditional contracting and procurement model.  Industry capabilities will need to be developed, with greater competition introduced.  In major areas there are now often only two or three manufacturers, and sometimes only one, with the capabilities required.

We do, however, now have examples of what can be done.  ULA (United Launch Alliance) had a monopoly on heavy-lift launch vehicles following its creation in 2006 by combining what had been the competing launch divisions of Boeing and Lockheed.  SpaceX entered that market, and we saw above what resulted.  If such progress is possible with something as complex as a heavy-lift rocket, it should be possible in at least some other areas of military procurement as well.

 

=======================================================

Annex:  Why Cost Comparisons of Rockets and Spacecraft are Difficult to Make

One might think that comparisons of costs of rockets as well as spacecraft would be straightforward.  But they are not, for a number of reasons:

a)  First, different sources will often provide different estimates.  There is no single, authoritative, source that one can cite, and one will often see differing estimates in different sources.  Recognizing this, for the purposes here – which are to compare the costs where there is competition (primarily SpaceX) to the costs under NASA procurement from the traditional contractors – I have sought to use estimates that are on the high side of what has been published for the costs of the SpaceX vehicles, and on the low side for the costs of the traditional NASA contractors.  Despite this, the SpaceX costs are still far lower.

b)  An important reason there are these different cost estimates from different sources is that the information on what the costs actually are has often been kept confidential.  SpaceX is the most transparent, but even here what they publish on their website ($62 million for a Falcon 9 launch, and $90 million for a Falcon Heavy) should really be viewed more as a “list price” that will be negotiated.  For NASA, full transparency on the costs can be embarrassing.  For commercial providers, less-than-transparent cost figures may be seen as helpful when they engage in negotiations with those who would purchase their services.

c)  Which brings us to a third factor, which is negotiating power.  Just like when buying a car, the price that will be paid will depend in part on the relative negotiating powers of the parties.  When there is a low-cost competing supplier (such as SpaceX), there will be pressure on higher-cost suppliers to lower their prices.  One has seen this with the prices being charged for launches of the Delta IV Heavy and Atlas V rockets.  Negotiating power will also depend on whether one will be a repeat customer or just a one-time user.  For these reasons there will not be one, unique, price that can be cited as the “cost” of launching a particular rocket.  It will depend on the negotiations.

d)  And this also leads to the distinction between the cost of a rocket launch and the price charged.  Ideally, what one wants as the basis for comparison is the cost.  However, the best information available will often be the price that some customer paid.  But that price may include a substantial profit margin if that customer did not have much negotiating power to bargain down the price.  It might also work the other way.  The cost of developing and launching the Boeing Starliner capsule, which was discussed above, is based on what NASA is paying.  Yet because of the repeated problems with the development of the Starliner, Boeing is certainly losing money on that fixed-price contract.  How much Boeing is losing has not been disclosed, and indeed since there are continued problems they do not yet know themselves how much it will have cost in the end.  Hence, in a comparison of the cost of delivering astronauts to the ISS the true cost of the Boeing Starliner will be something more than what NASA is paying, and it is that higher cost which really should be the basis for comparison with the cost of the SpaceX Crew Dragon alternative.

e)  The common basis for comparison is also inherently problematic.  While the standard measure for a rocket (and the one used here) is how many kilograms of payload can be lifted to low earth orbit, specific situations are more complicated.  Depending on the mission, one will want to place the payload into different types of orbits, including different altitudes (from 100 miles to several hundred miles, and still be considered “low”), different angles to the equator (the higher the angle, the higher the share of the world’s land area that would be covered by the satellite over some period, such as a month), and perhaps different requirements on how circular the orbit needs to be (the difference between the highest point in the orbit and the lowest).  There will be different thrust (hence fuel) requirements for each of these, possibly different payload weights that can be handled, and possibly other differences, all of which would end up being reflected in the negotiated price for the launch.

f)  Different payloads also have different requirements on how they must be handled, how they need to be attached to the rocket, the requirements on the fairings (the nose cone shell surrounding the payload to protect it at launch, which is jettisoned once orbital altitude is reached), and so on.  Military launches are also more expensive (and charged accordingly) due to the secrecy arrangements the Defense Department requires.

g)  Different boosters will also have different capabilities.  For a launch into low earth orbit these capabilities might not all be needed, but they may still be reflected in the costs.  The most obvious is the size of the payload.  If the weight is more than a smaller rocket can handle (and the payload cannot be divided into two or more smaller satellites), then they will have to use a larger booster even if the cost per kilogram is higher.

h)  The calculations of the cost per kilogram of payload are also based on the maximum payload each rocket can handle.  But it would be coincidental that any particular payload will be exactly at this maximum weight.  The cost per kilogram will then be higher for a payload that weighs less than this maximum.  While there may be some savings in total costs in launching a payload that is less than the maximum a rocket can handle (somewhat less fuel will be needed, for example), such savings will be modest.  For this reason, SpaceX and others will typically offer to sell, at a low price, such extra space to those with just small satellites, piggy-backing on larger satellites that do not need to use up the full payload capacity of the rocket.  The entity with the larger satellite might then receive a discount from what the cost otherwise would have been.

i)  Finally, one should recognize that there are normally several variants of each launch vehicle, with somewhat different capabilities and costs.  To the extent possible, all the cost estimates in this post are for a single, recent, variant of the vehicles.  The Falcon 9 launch vehicle, for example, is now at what they have named the “Block 5” variant, and the costs of that version are what have been used here.  Earlier versions of the Falcon 9 were labeled v1.0, v1.1, v1.2 or “Full Thrust” (and sometimes referred to as Block 3), and Block 4.
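
Point (h) above, on payloads that weigh less than the maximum, can be illustrated with a small sketch.  The $62 million is the Falcon 9 list price noted in point (b).  The maximum payload weight used here is an assumption for illustration only, as is the simplification that the launch price does not change at all when the payload is lighter (the text notes that any such savings would be modest).

```python
def cost_per_kg(launch_price_usd: float, payload_kg: float) -> float:
    """Launch price divided by the payload weight actually carried."""
    if payload_kg <= 0:
        raise ValueError("payload_kg must be positive")
    return launch_price_usd / payload_kg


LIST_PRICE_USD = 62_000_000   # Falcon 9 list price, from point (b)
MAX_PAYLOAD_KG = 22_800       # ASSUMED maximum payload to low earth orbit

# Share of the maximum payload capacity actually used (hypothetical cases).
for fill in (1.0, 0.75, 0.5):
    payload = MAX_PAYLOAD_KG * fill
    print(f"{fill:>4.0%} of capacity: "
          f"${cost_per_kg(LIST_PRICE_USD, payload):,.0f} per kg")
```

A rocket flown half full has twice the cost per kilogram of one flown at its maximum, which is why selling the spare capacity to small satellites is attractive to all parties.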

 

There are therefore a number of reasons why one needs to be cautious in judging reported cost differences between various rockets as well as spacecraft.  As noted in the text, cost differences of 10 or 20% certainly, and indeed even 40 or 50%, should not be seen as necessarily significant.  But as the charts show, the cost differences are far higher than this, with the costs of the traditional contractors following the traditional NASA procurement processes many times the costs obtained under the more competitive process the Obama administration introduced to manned space flight (at substantial political cost).

The Ridership Forecasts for the Baltimore-Washington SCMAGLEV Are Far Too High

The United States desperately needs better public transit.  While the lockdowns made necessary by the spread of the virus that causes Covid-19 led to sharp declines in transit use in 2020, with (so far) only a partial recovery, there will remain a need for transit to provide decent basic service in our metropolitan regions.  Lower-income workers are especially dependent on public transit, and many of them are, as we now see, the “essential workers” that society needs to function.  The Washington-Baltimore region is no exception.

Yet rather than focus on the basic nuts and bolts of ensuring quality services on our subways, buses, and trains, the State of Maryland is once again enamored with using the scarce resources available for public transit to build rail lines through our public parkland in order to serve a small elite.  The Purple Line light rail line was such a case.  Its dual rail lines will serve a narrow 16-mile corridor, passing through some of the richest zip codes in the nation, but destroying precious urban parkland.  As was discussed in an earlier post on this blog, with what will be spent on the Purple Line one could instead stop charging fares on the county-run bus services in the entirety of the two counties the Purple Line will pass through (Montgomery and Prince George’s), and at the same time double those bus services (i.e. double the lines, or double the service frequency, or some combination).

The administration of Governor Hogan of Maryland nonetheless pushed the Purple Line through, although construction has now been halted for close to a year due to cost overruns leading the primary construction contractor to withdraw.  Hogan’s administration is now promoting the building of a superconducting, magnetically-levitating, train (SCMAGLEV) between downtown Baltimore and downtown Washington, DC, with a stop at BWI Airport.  Over $35 million has already been spent, with a massive Draft Environmental Impact Statement (DEIS) produced.  As required by federal law, the DEIS has been made available for public comment, with comments due by May 24.

It is inevitable that such a project will lead to major, and permanent, environmental damage.  The SCMAGLEV would travel partially in tunnels underground, but also on elevated pylons parallel to the Baltimore-Washington Parkway (administered by the National Park Service).  The photos at the top of this post show what it would look like at one section of the parkway.  The question that needs to be addressed is whether any benefits will outweigh the costs (both environmental and other costs), and ridership is central to this.  If ridership is likely to be well less than that forecast, the whole case for the project collapses.  It will not cover its operating and maintenance costs, much less pay back even a portion of what will be spent to build it (up to $17 billion according to the DEIS, but likely to be far more based on experience with similar projects).  Nor would the purported economic benefits then follow.

I have copied below comments I submitted on the DEIS forecasts.  Readers may find them of interest as this project illustrates once again that despite millions of dollars being spent, the consulting firms producing such analyses can get some very basic things wrong.  The issue I focus on for the proposed SCMAGLEV is the ridership forecasts.  The SCMAGLEV project sponsors forecast that the SCMAGLEV will carry 24.9 million riders (one-way trips) in 2045.  The SCMAGLEV will require just 15 minutes to travel between downtown Baltimore and downtown Washington (with a stop at BWI), and is expected to charge a fare of $120 (roundtrip) on average and up to $160 at peak hours.  As one can already see from the fares, at best it would serve a narrow elite.

But there is already a high-speed train providing premier-level service between Baltimore and Washington – the Acela service of Amtrak.  It takes somewhat longer – 30 minutes currently – but its fare is also somewhat lower at $104 for a roundtrip, plus it operates from more convenient stations in Baltimore and Washington.  Importantly, it operates now, and we thus have a sound basis for forecasts of what its ridership might be in the future.

One can thus compare the forecast ridership on the proposed SCMAGLEV to the forecast for Acela ridership (also in the DEIS) in a scenario of no SCMAGLEV.  One would expect the forecasts to be broadly comparable.  One could allow that perhaps it might be somewhat higher on the SCMAGLEV, but probably less than twice as high and certainly less than three times as high.  But one can calculate from figures in the DEIS that the forecast SCMAGLEV ridership in 2045 would be 133 times higher than what they forecast Acela ridership would be in that year (in a scenario of no SCMAGLEV).  For those going just between downtown Baltimore and downtown Washington (i.e. excluding BWI travelers), the forecast SCMAGLEV ridership would be 154 times higher than what it would be on the comparable Acela.  This is absurd.

And it gets worse.  For reasons that are not clear, the base year figures for Acela ridership in the Baltimore-Washington market are more than eight times higher in the DEIS than figures that Amtrak itself has produced.  It is possible that the SCMAGLEV analysts included Acela riders who have boarded north of Baltimore (such as in Philadelphia or New York) and then traveled through to DC (or from DC would pass through Baltimore to ultimate destinations further north).  But such travelers should not be included, as the relevant travelers who might take the SCMAGLEV would only be those whose trips begin in either Baltimore or in Washington and end in the other metropolitan area.  The project sponsors have made no secret that they hope eventually to build a SCMAGLEV line the full distance between Washington and New York, but that would at a minimum be in the distant future.  It is not a source of riders included in their forecasts for a Baltimore to Washington SCMAGLEV.

The Amtrak forecasts of what it expects its Acela ridership would be, by market (including between Baltimore and Washington) and under various investment scenarios, come from its recent NEC FUTURE (for Northeast Corridor Future) study, for which it produced a Final Environmental Impact Statement.  Using Amtrak’s forecasts of what its Acela ridership would be in a scenario where major investments allowed the Acela to take just 20 minutes to go between Baltimore and Washington, the SCMAGLEV ridership forecasts were 727 times as high (in 2040).  That is complete nonsense.

My comment submitted on the DEIS, copied below, goes further into these results and discusses as well how the SCMAGLEV sponsors could have gotten their forecasts so absurdly wrong.  But the lesson here is that the consultants producing such forecasts are paid by project sponsors who wish to see the project built.  Thus they have little interest in even asking the question of why they have come up with an estimate that 24.9 million would take a SCMAGLEV in 2045 (requiring 15 minutes on the train itself to go between Baltimore and DC) while ridership on the Acela in that year (in a scenario where the Acela would require 5 minutes more, i.e. 20 minutes, and there is no SCMAGLEV) would be only about 34,000.

One saw similar issues with the Purple Line.  An examination of the ridership forecasts made for it found that in about half of the transit analysis zone pairs, the predicted ridership on all forms of public transit (buses, trains, and the Purple Line as well) was less than what they forecast it would be on the Purple Line only.  This is mathematically impossible.  And the fact that half were higher and half were lower suggests that the results they obtained were basically just random.  They also forecast that close to 20,000 would travel by the Purple Line into Bethesda each day but only about 10,000 would leave (which would lead to Bethesda’s population exploding, if true).  The source of this error was clear (they mixed up two formats for the trips – what is called the production/attraction format with origin/destination), but it mattered.  They concluded that the Purple Line had to be a rail line rather than a bus service in order to handle their predicted 20,000 riders each day on the segment to Bethesda.

It may not be surprising that private promoters of such projects would overlook such issues.  They may stand to gain (i.e. from the construction contracts, or from an increase in land values next to station sites), even though society as a whole loses.  Someone else (government) is paying.  But public officials in agencies such as the Maryland Department of Transportation should be looking at what is the best way to ensure quality and affordable transit services for the general public.  Problems develop once the officials see their role as promoters of some specific project.  They then seek to come up with a rationale to justify the project, and see their role as surmounting all the hurdles encountered along the way.  They are not asking whether this is the best use of scarce public resources to address our very real transit needs.

A high-speed magnetically-levitating train (with superconducting magnets, no less) may look attractive.  But officials should not assume such a shiny new toy will address our transit issues.

—————————————————————————————————

May 22, 2021

Comment Submitted on the DEIS for SCMAGLEV

The Ridership Forecasts Are Far Too High

A.  Introduction

I am opposed to the construction of the proposed SCMAGLEV project between Baltimore and Washington, DC.  A key issue for any such system is whether ridership will be high enough to compensate for the environmental damage that is inevitable with such a project.  But the ridership forecasts presented in the DEIS are hugely flawed.  They are far too high and simply do not meet basic conditions of plausibility.  At more plausible ridership levels, the case for such a project collapses.  It will not cover its operating costs, much less pay back any of the investment (of up to $17 billion according to the DEIS, but based on experience likely to be far higher).  Nor will the purported positive economic benefits then follow.  But the damage to the environment will be permanent.

Specifically, there is rail service now between Baltimore and Washington, at three levels of service (the high-speed Acela service of Amtrak, the regular Amtrak Regional service, and MARC).  Ridership on the Acela service, as it is now and with what is expected with upgrades in future years, provides a benchmark that can be used.  While it could be argued that ridership on the proposed SCMAGLEV would be higher than ridership on the Acela trains, the question is how much higher.  I will discuss below in more detail the factors to take into account in making such a comparison, but briefly, the Acela service takes 30 minutes today to go between Baltimore and Washington, while the SCMAGLEV would take 15 minutes.  But given that it also takes time to get to the station and on the train, and then to the ultimate destination at the other end, the time savings would be well less than 50%.  The fare would also be higher on the SCMAGLEV (at an average, according to the DEIS, of $120 for a round-trip ticket but up to $160 at peak hours, versus an average of $104 on the Acela).  In addition, the stations the SCMAGLEV would use for travel between downtown Baltimore and downtown Washington are less conveniently located (with poorer connections to local transit) than those the Acela uses.

Thus while it could be argued that the SCMAGLEV would attract more riders than the Acela, even this is not clear.  But being generous, one could allow that it might attract somewhat more riders.  The question is how many.  And this is where it becomes completely implausible.  Based on the ridership forecasts in the DEIS, for both the SCMAGLEV and for the Acela (in a scenario where the SCMAGLEV is not built), the SCMAGLEV in 2045 would carry 133 times what ridership would be on the Acela.  Excluding the BWI ridership on both, it would be 154 times higher.  There is no way to describe this other than that it is just nonsense.  And with other, likely more accurate, forecasts of what Acela ridership would be in the future (discussed below) the ratios become higher still.

Similarly, if the SCMAGLEV will be as attractive to MARC riders as the project sponsors forecast it will be, then most of those MARC riders would now be on the modestly less attractive Acela.  But they aren’t.  The Acela is 30 minutes faster than MARC (the SCMAGLEV would be 45 minutes faster), yet 28 times as many riders choose MARC over Acela between Baltimore and Washington.  I suspect the fare difference ($16 per day on MARC, vs. $104 on the Acela) plays an important role.  The model used could have been tested by using it to calculate what Acela ridership would be under current conditions, and then comparing that result to the actual figures.  Evidently this was not done.  Had they done so, their predicted Acela ridership would likely have been a high multiple of the actual and it would have been clear that their modeling framework has problems.

Why are the forecasts off by orders of magnitude?  Unfortunately, given what has been made available in the DEIS and with the accompanying papers on ridership, one cannot say for sure.  But from what has been made available, there are indications of where the modeling approach taken had issues.  I will discuss these below.

In the rest of this comment I will first discuss the use of Acela service and its ridership (both the actual now and as projected) as a basis for comparison to the ridership forecasts made for the SCMAGLEV.  They would be basically similar services, where a modest time saving on the SCMAGLEV (15 minutes now, but only 5 minutes in the future if further investments are made in the Acela service that would cut its Baltimore to DC time to just 20 minutes) is offset by a higher fare and less convenient station locations.  I will then discuss some reasons that might explain why the SCMAGLEV ridership forecasts are so hugely out-of-line with what plausible numbers might be.

B.  A Comparison of SCMAGLEV Ridership Forecasts to Those for Acela  

The DEIS provides ridership forecasts for the SCMAGLEV for both 2030 (several years after the DEIS says it would be opened, so ridership would then be stable after an initial ramping up) and for a horizon year of 2045.  I will focus here on the 2045 forecasts, and specifically on the alternative where the destination station in Baltimore is Camden Yards.  The DEIS also has forecasts for ridership in an alternative where the SCMAGLEV line would end in the less convenient Cherry Hill neighborhood of Baltimore, which is significantly further from downtown and with poorer connections to local transit options.  The Camden Yards station is more comparable to Penn Station – Baltimore, which the Acela (and Amtrak Regional trains and one of the MARC lines) use.  Penn Station – Baltimore has better local transit connections and would be more convenient for many potential riders, but this will of course depend on the particular circumstances of the rider – where he or she will be starting from and where their particular destination will be.  It will, in particular, be more convenient for riders coming from North and Northeast of Baltimore than Camden Yards would be.  And those from South and Southwest of Baltimore would be more likely to drive directly to the DC region than try to reach Camden Yards, or they would board at BWI.

The DEIS also provides forecasts of what ridership would be on the existing train services between Baltimore and Washington:  the Acela services (operated by Amtrak), the regular Amtrak Regional trains, and the MARC commuter service operated by the State of Maryland.  Note also that the 2045 forecasts for the train services are for both a scenario where the SCMAGLEV is not built and then what they forecast the reduced ridership would be with a SCMAGLEV option.  For the purposes here, what is of interest is the scenario with no SCMAGLEV.

The SCMAGLEV would provide a premium service, requiring 15 minutes to go between downtown Baltimore and downtown Washington, DC.  Acela also provides a premium service and currently takes 30 minutes, while the regular Amtrak Regional trains take 40 to 45 minutes and MARC service takes 60 minutes.  But the fares differ substantially.  Using the DEIS figures (with all prices and fares expressed in base year 2018 dollars), the SCMAGLEV would charge an average fare of $120 for a round-trip (Baltimore-Washington), and up to $160 for a roundtrip at peak times.  The Acela also has a high fare for its also premium service, although not as high as SCMAGLEV, charging an average of $104 for a roundtrip (using the DEIS figures).  But Amtrak Regional trains charge only $34 for a similar roundtrip, and MARC only $16.

Acela service thus provides a reasonable basis for comparison to what SCMAGLEV would provide, with the great advantage that we know now what Acela ridership has actually been.  This provides a firm base for a forecast of what Acela ridership would be in a future year in a scenario where the SCMAGLEV is not built.  And while the ridership on the two would not be exactly the same, one should expect them to be in the same ballpark.

But they are far from that:

DEIS Forecasts of SCMAGLEV vs. Acela Ridership, Annual Trips in 2045

Route                   SCMAGLEV Trips    Acela Trips    Ratio
Baltimore – DC only         19,277,578        125,226    154 times as much
All, including BWI          24,938,652        187,887    133 times as much

Sources:  DEIS, Main Report Table 4.2-3; and Table D-4-48 of Appendix D.4 of the DEIS

Using estimates just from the DEIS, the project sponsor is forecasting that annual (one-way) trips on the SCMAGLEV in 2045 would be 133 times what they would be in that year on the Acela (in a scenario where the SCMAGLEV is not built).  And it would be 154 times as much for the Baltimore – Washington riders only.  This is nonsense.  One could have a reasonable debate if the SCMAGLEV figures were twice as high, and maybe even if they were three times as high.  But it is absurd that they would be 133 or 154 times as high.
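The ratios in the table follow directly from the two columns of trip counts.  As a minimal sketch of that arithmetic (the variable and function names are my own, chosen only for illustration; the figures are the DEIS forecasts quoted above):

```python
# Annual one-way trips forecast for 2045, as quoted from the DEIS above.
scmaglev_trips = {"Baltimore-DC only": 19_277_578, "All, including BWI": 24_938_652}
acela_trips = {"Baltimore-DC only": 125_226, "All, including BWI": 187_887}


def ridership_ratio(scmaglev: int, acela: int) -> float:
    """Return how many times larger the SCMAGLEV forecast is than the Acela forecast."""
    return scmaglev / acela


# One ratio per route, keyed by the route name used in the table.
ratios = {
    route: ridership_ratio(scmaglev_trips[route], acela_trips[route])
    for route in scmaglev_trips
}

for route, ratio in ratios.items():
    print(f"{route}: {ratio:.0f} times as much")
```

Rounded to whole numbers, these reproduce the 154 and 133 shown in the table.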

And it gets worse.  The figures above are all taken from the DEIS.  But the base year Acela ridership figures in the DEIS (Appendix D.4, Table D.4-45) differ substantially from figures Amtrak itself has produced in its recent NEC FUTURE study.  This review of future investment options in Northeast Corridor (Washington to Boston) Amtrak service was concluded in July 2017.  As part of this it provided forecasts of what future Acela ridership would be under various alternatives, including one (its Alternative 3) where Acela trains would be substantially upgraded and require just 20 minutes for the trip between downtown Baltimore and downtown Washington, DC.  This would be quite similar to what SCMAGLEV service would be.

But for reasons that are not clear, the base year figures for Acela ridership between Baltimore and Washington differ substantially between what the SCMAGLEV DEIS has and what NEC FUTURE has.  The figure in the NEC FUTURE study (for a base year of 2013) puts the number of riders (one-way) between Baltimore and Washington (and not counting those who boarded north of Baltimore, at Philadelphia or New York for example, and then rode through to Washington, and similarly for those going from Washington to Baltimore) at just 17,595.  The DEIS for the SCMAGLEV put the similar Acela ridership (for a base year of 2017) at 147,831 (calculated from Table D.4-45, of Appendix D.4).  While the base years differ (2013 vs. 2017), the disparity cannot be explained by that.  It is far too large.  My guess would be that the DEIS counted all Acela travelers taking up seats between Baltimore and Washington, including those who boarded north of Baltimore (or whose destination from Washington was north of Baltimore), and not just those travelers traveling solely between Washington and Baltimore.  But the SCMAGLEV will be serving only the Baltimore-Washington market, with no interconnections with the train routes coming from north of Baltimore.

What was the source of the Acela ridership figure in the DEIS of 147,831 in 2017?  That is not clear.  Table D.4-45 of Appendix D.4 says that its source is Table 3-10 of the “SCMAGLEV Final Ridership Report”, dated November 8, 2018.  But that report, which is available along with the other DEIS reports (with a direct link at https://bwmaglev.info/index.php/component/jdownloads/?task=download.send&id=71&catid=6&m=0&Itemid=101), does not have a Table 3-10.  Significant portions of that report were redacted, but in its Table of Contents no reference is shown to a Table 3-10 (even though other redacted tables, such as Tables 5-2 and 6-3, are still referenced in the Table of Contents, but labeled as redacted).

One can only speculate on why there is no Table 3-10 in the Final Ridership Report.  Perhaps it was deleted when someone discovered that the figures reported there, which were then later used as part of the database for the ridership forecast models, were grossly out of line with the Amtrak figures.  The Amtrak figure for Acela ridership for Baltimore-Washington passengers of 17,595 (in 2013) is less than one-eighth of the figure on Acela ridership shown in the DEIS of 147,831 (in 2017).

It can be difficult for an outsider to know how many of those riding on the Acela between Washington and Baltimore are passengers going just between those two cities (as well as BWI).  Most of the passengers riding on that segment will be going on to (or coming from) cities further north.  One would need access to ticket sales data.  But it is reasonable to assume that Amtrak itself would know this, and therefore that the figures in the NEC FUTURE study would likely be accurate.  Furthermore, in the forecast horizon years, where Amtrak is trying to show what Acela (and other rail) ridership would grow to with alternative investment programs, it is reasonable to assume that Amtrak would provide relatively optimistic (i.e. higher) estimates, as higher estimates are more likely to convince Congress to provide the funding that would be required for such investments.

The Amtrak figures would in any case provide a suitable comparison to what SCMAGLEV’s future ridership might be.  The Amtrak forecasts are for 2040, so for the SCMAGLEV forecasts I interpolated to produce an estimate for 2040 assuming a constant rate of growth between the forecast SCMAGLEV ridership in 2030 and that for 2045.  Both the NEC FUTURE and SCMAGLEV figures include the stop at BWI.
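A constant-growth interpolation of this kind can be sketched as follows.  This is only an illustration of the method, not the spreadsheet actually used; the function names are my own.  Because the 2030 forecast is not reproduced in this comment, the example checks only the annual growth rate implied by two figures that do appear here:  the interpolated 2040 estimate shown in the table below and the DEIS forecast for 2045.

```python
def interpolate_constant_growth(value_start, year_start, value_end, year_end, year):
    """Interpolate between two forecasts assuming a constant annual growth rate."""
    annual_factor = (value_end / value_start) ** (1 / (year_end - year_start))
    return value_start * annual_factor ** (year - year_start)


def implied_annual_growth(value_start, year_start, value_end, year_end):
    """Constant annual growth rate that carries value_start to value_end."""
    return (value_end / value_start) ** (1 / (year_end - year_start)) - 1


# Figures taken from this comment: the interpolated 2040 estimate and the DEIS 2045 forecast.
trips_2040 = 22_761_428
trips_2045 = 24_938_652

growth = implied_annual_growth(trips_2040, 2040, trips_2045, 2045)
print(f"Implied annual growth, 2040 to 2045: {growth:.2%}")
```

Calling interpolate_constant_growth with the DEIS forecasts for 2030 and 2045 as the two end points, and 2040 as the target year, gives an interpolated value for 2040 on the same basis.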

Forecasts of SCMAGLEV (DEIS) vs. Acela (NEC FUTURE) Ridership between Baltimore and Washington, Annual Trips in 2040

Alternative       SCMAGLEV Trips    Acela Trips    Ratio
No Action             22,761,428         26,177    870 times as much
Alternative 1         22,761,428         26,779    850 times as much
Alternative 2         22,761,428         29,170    780 times as much
Alternative 3         22,761,428         31,291    727 times as much

Sources:  SCMAGLEV trips interpolated from figures on forecast ridership in 2030 and 2045 (Camden Yards) in Table 4.2-3 of DEIS.  Acela trips from NEC FUTURE Final EIS, Volume 2, Appendix B.08.

The Acela ridership figures are those estimated under various investment scenarios in the rail service in the Northeast Corridor.  NEC FUTURE examined a “No Action” scenario with just minimal investments, and then various alternative investment levels to produce increasingly capable services.  Alternative 3 (of which there were four sub-variants, but all addressing alternative investments between New York and Boston and thus not affecting directly the Washington-Baltimore route) would upgrade Acela service to the extent that it would go between Baltimore and Washington in just 20 minutes.  This would be very close to the 15 minutes for the SCMAGLEV.  Yet even with such a comparable service, the SCMAGLEV DEIS is forecasting that its service would carry 727 times as many riders as what Amtrak has forecast for its Acela service (in a scenario where there is no SCMAGLEV).  This is complete nonsense.

To be clear, I would stress again that the forecast future Acela ridership figures are a scenario under various possible investment programs by Amtrak.  The investment program in Alternative 3 would upgrade Acela service to a degree where the Baltimore – Washington trip (with a stop at BWI) would take just 20 minutes.  The NEC FUTURE study forecasts that in such a scenario the Baltimore-Washington ridership on Acela would total a bit over 31,000 trips in the year 2040.  In contrast, the DEIS for the SCMAGLEV forecasts that there would in that year be close to 23 million trips taken on the similar SCMAGLEV service, requiring 15 minutes to make such a trip.  Such a disparity makes no sense.
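To put the size of the disparity in perspective, the short sketch below recomputes the ratio for each NEC FUTURE alternative and expresses it in orders of magnitude (the base-10 logarithm of the ratio).  The names are my own; the trip counts are those in the table above.

```python
import math

# Interpolated SCMAGLEV trips in 2040 (DEIS) and Acela trips in 2040 (NEC FUTURE).
scmaglev_2040 = 22_761_428
acela_2040 = {
    "No Action": 26_177,
    "Alternative 1": 26_779,
    "Alternative 2": 29_170,
    "Alternative 3": 31_291,
}

# For each alternative: (ratio of forecasts, the same ratio in orders of magnitude).
results = {}
for alternative, acela in acela_2040.items():
    ratio = scmaglev_2040 / acela
    results[alternative] = (ratio, math.log10(ratio))
    print(f"{alternative}: {ratio:.0f} times, about {math.log10(ratio):.1f} orders of magnitude")
```

Even in the most favorable case for Acela (Alternative 3), the gap is close to three orders of magnitude.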

C.  How Could the Forecasts be so Wrong?

A well-known consulting firm, Louis Berger, prepared the ridership forecasts, and their “Final Ridership Report” dated November 8, 2018, referenced above, provides an overview of the approach they took.  Unfortunately, while I appreciate that the project sponsor provided a link to this report along with the rest of the DEIS (I had asked for this, having seen references to it in the DEIS), the report that was posted had significant sections redacted.  Due to those redactions, and possibly also limitations in what the full report itself might have included (such as summaries of the underlying data), it is impossible to say for sure why the forecasts of SCMAGLEV ridership were close to three orders of magnitude greater than what ridership has been and is expected to be on comparable Acela service.

Thus I can only speculate.  But there are several indications of what may have led the SCMAGLEV estimates to be so out of line with ridership on a service that is at least broadly comparable.  Specifically:

1)  As noted above, there were apparent problems in assembling existing data on rail ridership for the Baltimore-Washington market, in particular for the Acela.  The ridership numbers for the Acela in the DEIS were more than eight times higher in their base year (2017) than what Amtrak had in an only slightly earlier base year (2013).  The ridership numbers on Amtrak Regional trains (for Baltimore-Washington riders) were closer but still substantially different:  409,671 in Table D.4-45 of the DEIS (for 2017), vs. 172,151 in NEC FUTURE (for 2013).

Table D.4-45 states that its source for this data on rail ridership is a Table 3-10 in the Final Ridership Report of November 8, 2018.  But as noted previously, such a table is not there – it was either never there or it was redacted.  Thus it is impossible to determine why their figures differ so much from those of Amtrak.  But the differences for the Acela figures (more than a factor of eight) are huge, i.e. close to an order of magnitude by itself.  While it is impossible to say for sure, my guess (as noted above) is that the Acela ridership numbers in the DEIS included travelers whose trip began, or would end, in destinations north of Baltimore, who then traveled through Baltimore on their way to, or from, Washington, DC.  But such travelers are not part of the market the SCMAGLEV would serve.

2)  In modeling the choice those traveling between Baltimore and Washington would have between SCMAGLEV and alternatives, the analysts collapsed all the train options (Acela, Amtrak Regional, and MARC) into one.  See page 61 of the Ridership Report.  They create a weighted average for a single “train” alternative, and they note that since (in their figures) MARC ridership makes up almost 90% of the rail market, the weighted averages for travel time and the fare will be essentially that of MARC.

Thus they never looked at Acela as an alternative, with a service level not far from that of SCMAGLEV.  Nor do they even consider the question of why so many MARC riders (67.5% of MARC riders in 2045 if the Camden Yards option is chosen – see page D-56 of Appendix D.4 of the DEIS) are forecast to divert to the SCMAGLEV, but are not doing so now (nor in the future) to Acela.  According to Table D.4-45 of Appendix D.4 of the DEIS, in their data for their 2017 base year, there are 28 times as many MARC riders as on Acela between downtown Baltimore and downtown Washington, and 20 times as many with those going to and from the BWI stop included.  Evidently, they do not find the Acela option attractive.  Why should they then find the SCMAGLEV train attractive?
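A rough sketch shows why a ridership-weighted “train” alternative ends up looking like MARC.  The travel times and fares are the DEIS figures quoted earlier (with the Amtrak Regional time taken as the midpoint of 40 to 45 minutes).  The shares are assumptions for illustration only:  the 90% for MARC follows the report’s “almost 90%”, while the split of the remaining 10% between Amtrak Regional and Acela is hypothetical and not taken from the report.

```python
# (share of rail riders, one-way minutes, round-trip fare in dollars)
# ASSUMPTION: the 7% / 3% split between Amtrak Regional and Acela is hypothetical.
services = {
    "MARC": (0.90, 60.0, 16.0),
    "Amtrak Regional": (0.07, 42.5, 34.0),
    "Acela": (0.03, 30.0, 104.0),
}


def weighted_average(attribute_index: int) -> float:
    """Ridership-weighted average of one attribute across the three services."""
    return sum(row[0] * row[attribute_index] for row in services.values())


composite_minutes = weighted_average(1)
composite_fare = weighted_average(2)

print(f"Composite 'train': {composite_minutes:.1f} minutes, ${composite_fare:.2f} round trip")
```

Under these assumed shares the composite comes out within a few minutes and a few dollars of MARC itself, so the 30 minute, $104 Acela option effectively disappears from the choice set.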

3)  The answer as to why MARC riders have not chosen to ride on the Acela almost certainly has something to do with the difference in the fares.  A round-trip on MARC costs $16 a day.  A round trip on Acela costs, according to the DEIS, an average of $104 a day.  That is not a small difference.  For someone commuting 5 days a week and 50 weeks a year (or 250 days a year), the annual cost on MARC would be $4,000 but $26,000 a year on the Acela.  And it would be an even higher $30,000 a year on the SCMAGLEV (based on an average fare of $120 for a round trip), and $40,000 a year ($160 a day) at peak hours (which would cover the times commuters would normally use).  Even for those moderately well off, $40,000 a year for commuting would be a significant expense, and not an attractive alternative to MARC with its cost of just one-tenth of this.
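The annual figures above are simple multiplication, sketched here so that the numbers can be checked (the names are my own; the fares are the DEIS figures and the 250 commuting days are as assumed in the text):

```python
# 5 days a week for 50 weeks a year, as assumed in the text.
COMMUTING_DAYS_PER_YEAR = 5 * 50

# Round-trip fare per day, in dollars (DEIS figures).
daily_fare = {
    "MARC": 16,
    "Acela": 104,
    "SCMAGLEV (average)": 120,
    "SCMAGLEV (peak)": 160,
}

annual_cost = {service: fare * COMMUTING_DAYS_PER_YEAR for service, fare in daily_fare.items()}

for service, cost in annual_cost.items():
    print(f"{service}: ${cost:,} per year")
```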

If such costs were properly taken into account in the forecasting model, why did it nonetheless predict that most MARC riders would switch to the SCMAGLEV?  This is not fully clear as the model details were not presented in the redacted report, but note that the modelers assigned high dollar amounts per hour for the value of travel time ($31.00 to $46.50 for commuters and other non-business travel, and $50.60 to $75.80 for business travel – see page 53 of the Ridership Report).  However, even at such high values, the numbers do not appear to be consistent.  Taking a SCMAGLEV (15 minute trip) rather than MARC (60 minutes) would save 45 minutes each way or 1 1/2 hours a day.  Only at the very high end value of time for business travelers (of $75.80 per hour, or $113.70 for 1 1/2 hours) would this value of time offset the fare difference of $104 (using the average SCMAGLEV fare of $120 minus the MARC fare of $16).  And even that would not suffice for travelers at peak hours (with its SCMAGLEV fare of $160).
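The same comparison can be expressed as a break-even value of time:  the hourly value at which the time saved would just offset the extra fare.  A minimal sketch, with names of my own choosing and only the fares, travel times, and values of time quoted above:

```python
def break_even_value_of_time(extra_fare_per_day: float, minutes_saved_each_way: float) -> float:
    """Dollars per hour at which the daily time saved exactly offsets the extra daily fare."""
    hours_saved_per_day = 2 * minutes_saved_each_way / 60
    return extra_fare_per_day / hours_saved_per_day


MINUTES_SAVED = 60 - 15  # MARC trip time minus SCMAGLEV trip time, each way

break_even_average = break_even_value_of_time(120 - 16, MINUTES_SAVED)
break_even_peak = break_even_value_of_time(160 - 16, MINUTES_SAVED)

# Top of each value-of-time range reported on page 53 of the Ridership Report.
HIGHEST_NON_BUSINESS = 46.50
HIGHEST_BUSINESS = 75.80

print(f"Break-even at the average fare: ${break_even_average:.2f} per hour")
print(f"Break-even at the peak fare:    ${break_even_peak:.2f} per hour")
```

At the average fare the break-even is above the entire range used for commuters, and at the peak fare it is above even the highest value used for business travelers.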

But there is also a more basic problem.  It is wrong to assume that travelers on MARC treat their 60 minutes on the train as all wasted time.  They can read, do some work, check their emails, get some sleep, or plan their day.  The presumption that they would pay amounts similar to what some might on average earn in an hour based on their annual salaries is simply incorrect.  And as noted above, if it were correct, then one would see many more riders on the Acela than one does (and similarly riders on the Amtrak Regional trains, that require about 40 minutes for the Washington to Baltimore trip, with an average fare of $34 for a round trip).

There is a similar issue for those who drive.  Those who drive do not place a value on the time spent in their cars equal to what they would earn in an hourly equivalent of their regular salary.  They may well want to avoid traffic jams, which are stressful and frustrating for other reasons, but numerous studies have found that a simple value-of-time calculation based on annual salaries does not explain why so many commuters choose to drive.

4)  Data for the forecasting model also came in part from two personal surveys.  One was an in-person survey of travelers encountered on MARC, at either the MARC BWI Station or onboard Penn Line trains, or at BWI airport.  The other was an online survey, where the description of how possible respondents were chosen was unfortunately redacted.

But such surveys are unreliable, with answers that depend critically on how the questions are phrased.  The Final Ridership Report does not include the questionnaire itself (most such reports would), so one cannot know what bias there might have been in how the questions were worded.  As an example (and admittedly an exaggerated example, to make the point), were the MARC riders simply asked whether they would prefer a much faster, 15 minute, trip?  Or were they asked whether they would pay an extra $104 per day ($144 at peak hours) to ride a service that would save them 45 minutes each way on the train?

But even such willingness to pay questions are notoriously unreliable.  An appropriate follow-up question to a MARC rider saying they would be willing to pay up to an extra $144 a day to ride a SCMAGLEV would be why they are evidently not now riding the Acela (at an extra $88 a day) for a ride just 15 minutes longer than what it would be on the SCMAGLEV.

One therefore has to be careful in interpreting and using the results from such a survey in forecasting how travelers would behave.  If current choices (e.g. using the MARC rather than the Acela) do not reflect the responses provided, one should be concerned.

5)  Finally, the particular mathematical form used to model the choices the future travelers would make can make a big difference to the findings.  The Final Ridership Report briefly explains (page 53) that it used a multinomial logit model as the basis for its modeling.  Logit functions assign a continuous probability (starting from 0 and rising to 100%) of some event occurring.  In this model, the event is that a traveler going from one travel zone to another will choose to travel via the SCMAGLEV, or not.  The likelihood of choosing to travel via the SCMAGLEV will be depicted as an S-shaped function, starting at zero and then smoothly rising (following the S-shape) until it reaches 100%, depending on, among other factors, what the travel time savings might be.

The results that such a model will predict will depend critically, of course, on the particular parameters chosen.  But the heavily redacted Final Ridership Report does not show what those parameters were nor how they were chosen or possibly estimated, nor even the complete set of variables used in that function.  The report says little (in what remains after the redactions) beyond that they used that functional form.

A feature of such logit models is that while the choices are discrete (one either will ride the SCMAGLEV or will not), they allow for “fuzziness” around the turning points.  This recognizes that individuals differ:  even when they confront a similar combination of variables (a combination of cost, travel time, and other measured attributes), some will simply prefer to drive while some will prefer to take the train.  That is how people are.  But then, while a higher share might prefer to take a train (or the SCMAGLEV) when travel times fall (by close to 45 minutes with the SCMAGLEV when compared to their single “train” option that is 90% MARC, and by variable amounts for those who drive depending on the travel zone pairs), how much higher that share will be will depend on the parameters they selected for their logit.

With certain parameters, the responses can be sensitive to even small reductions in travel times, and the predicted resulting shifts then large.  But are those parameters reasonable?  As noted previously, a test would have been whether the model, with the parameters chosen, would have predicted accurately the number of riders actually observed on the Acela trains in the base year.  But it does not appear such a test was done.  At least no such results were reported to test whether the model was validated or not.
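To illustrate how much the parameters matter, the sketch below uses a two-alternative (binary) logit, which is what a multinomial logit reduces to when there are only two options.  Everything about the parameters is hypothetical:  the redacted report does not disclose its coefficients or its full set of variables, so the values of beta_time and beta_cost here are invented purely to show the sensitivity, and the function names are my own.  The 45 minutes saved and the $52 one-way fare difference (half of $104) are from the MARC comparison discussed above.

```python
import math


def scmaglev_share(beta_time: float, beta_cost: float,
                   minutes_saved: float, extra_fare: float) -> float:
    """Binary logit probability of choosing the faster but more expensive option."""
    utility_gain = beta_time * minutes_saved - beta_cost * extra_fare
    return 1.0 / (1.0 + math.exp(-utility_gain))


def implied_value_of_time(beta_time: float, beta_cost: float) -> float:
    """Dollars per hour at which the time and cost terms exactly trade off."""
    return 60.0 * beta_time / beta_cost


MINUTES_SAVED = 45.0  # one way, SCMAGLEV vs. MARC
EXTRA_FARE = 52.0     # one way, half of the $104 round-trip difference

# HYPOTHETICAL coefficients, not taken from the Ridership Report.
BETA_COST = 0.05
low_share = scmaglev_share(0.02, BETA_COST, MINUTES_SAVED, EXTRA_FARE)
high_share = scmaglev_share(0.10, BETA_COST, MINUTES_SAVED, EXTRA_FARE)

print(f"beta_time=0.02: share {low_share:.0%}, value of time ${implied_value_of_time(0.02, BETA_COST):.0f}/hour")
print(f"beta_time=0.10: share {high_share:.0%}, value of time ${implied_value_of_time(0.10, BETA_COST):.0f}/hour")
```

With the same functional form and the same trip, the predicted share moves from a small minority to a large majority simply by changing one coefficient.  This is why a validation of the model against observed Acela ridership would have been so informative.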

Thus there are a number of possible reasons why the forecast ridership on the SCMAGLEV differs so much from what one currently observes for ridership on the Acela, and from what one might reasonably expect Acela ridership to be in the future.  It is not possible to say whether these are indeed the reasons why the SCMAGLEV forecasts are so incredibly out of line with what one observes for the Acela.  There may be, and indeed likely are, other reasons as well.  But due to issues such as those outlined here, one can understand the possible factors behind SCMAGLEV ridership forecasts that deviate so markedly from plausibility.

D.  Conclusion

The ridership forecasts for the SCMAGLEV are vastly over-estimated.  Predicted ridership on the SCMAGLEV is a minimum of two, and up to three, orders of magnitude higher than what has been observed on, and can reasonably be forecast for, the Acela.  One should not be getting predicted ridership that is more than 100 times what one observes on a comparable, existing (and thus knowable), service.

With ridership on the proposed system far less than what the project sponsors have forecast, the case for building the SCMAGLEV collapses.  Operational and maintenance costs would not be covered, much less would there be any possibility of paying back a portion of the billions of dollars spent to build it.  Nor would the purported economic benefits follow.

However, the harm to the environment will have been done.  Even if the system is then shut down (due to the forecast ridership never materializing), it will not be possible to reverse much of that environmental damage.

The US very much needs to improve its public transit.  It is far too difficult, with resulting harm both to the economy and to the population, to move around in the Baltimore-Washington region.  But fixing this will require a focus on the basic nuts and bolts of operating, maintaining, and investing in the transit systems we have, including the trains and buses.  This might not look as attractive as a magnetically levitating train, but will be of benefit.  And it will be of benefit to the general public – in particular to those who rely on public transit – and not just to a narrow elite that can afford $120 fares.  Money for public transit is scarce.  It should not be wasted on shiny new toys.

Was Sturgis a Covid-19 Superspreader Event?: Evidence Suggests That It May Well Have Been

A.  Introduction

The Sturgis Motorcycle Rally is an annual 10-day event for motorcycle enthusiasts (in particular of Harley-Davidsons), held in Sturgis, a normally small town in far western South Dakota.  It was held again this year, from August 7 to August 16, despite the Covid-19 pandemic, and drew an estimated 460,000 participants.  Motorcyclists gather from around the country for lots of riding, lots of music, and lots of beer and partying.  And then they go home.  Cell phone data indicate that fully 61% of all the counties in the US were visited by someone who attended Sturgis this year.

Due to the pandemic, the town debated whether to host the event this year.  But after some discussion, it was decided to go ahead.  And it is not clear that town officials could have stopped it even if they wanted.  Riders would likely have shown up anyway.

Despite the ongoing Covid-19 pandemic, masks were rarely seen.  Indeed, many of those attending were proud in their defiance of the standard health guidelines that masks should be worn and social distancing respected, and especially so in such crowded events.  T-shirts were sold, for example, declaring “Screw Covid-19, I Went to Sturgis”.

Did Sturgis lead to a surge in Covid-19 cases?  Unfortunately, we do not have direct data on this because the identification of the possible sources of someone’s Covid-19 infection is incredibly poor in the US.  There is little investigation of where someone might have picked up the virus, and far from adequate contact tracing.  And indeed, even those who attended the rally and later came down with Covid-19 found that their state health officials were often not terribly interested in whether they had been at Sturgis.  The systems were simply not set up to incorporate this.  And those attending who were later sick with the disease were also not always open about where they had been, given the stigma.

One is therefore left only with anecdotal cases and indirect evidence.  Recent articles in the Washington Post and the New York Times were good reports, but could only cover a number of specific, anecdotal, cases, as well as describe the party environment at Sturgis.  One can, however, examine indirect evidence.  It is reasonable to assume that those motorcycle enthusiasts who had a shorter distance to get to Sturgis from their homes would be more likely to go.  Hence near-by states would account for a higher share (adjusted for population) of those attending Sturgis and then returning home than would be the case for states farther away.  If so, then if Covid-19 was indeed spread among those attending Sturgis, one would see a greater degree of seeding of the virus that causes Covid-19 in the near-by states than would be the case among states that are farther away.  And those near-by states would then have more of a subsequent rise in Covid-19 cases as the infectious disease spread from person to person than one would see in states further away.

This post will examine this, starting with the chart at the top of this post.  As is clear in that chart, by early November states geographically closer to Sturgis had far higher cases of Covid-19 (as a share of their population) than those further away.  And the incidence fell steadily with geographic distance, in a relationship that is astonishingly tight.  Simply knowing the distance of the state from Sturgis would allow for a very good prediction (relative to the national average) of the number of daily new confirmed cases of Covid-19 (per 100,000 of population) in the 7-day period ending November 6.

A first question to ask is whether this pattern developed only after Sturgis.  If it had been there all along, including before the rally was held, then one cannot attribute it to the rally.  But we will see below that there was no such relationship in early August, before the rally, and that it then developed progressively in the months following.  This is what one would expect if the virus had been seeded by those returning from Sturgis, who then may have given this infectious disease to their friends and loved ones, to their co-workers, to the clerks at the supermarkets, and so on, and then each of these similarly spreading it on to others in an exponentially increasing number of cases.

To keep things simple in the charts, we will present them in a standard linear form.  But one may have noticed in the chart above that the line in black (the linear regression line) that provides the best fit (in a statistical sense) for a straight line to the scatter of points, does not work that well at the two extremes.  The points at the extremes (for very short distances and very long ones) are generally above the curve, while the points are often below in the middle range.  This is the pattern one would expect when what matters to the decision to ride to the rally is not some increment for a given distance (of an extra 100 miles, say), but rather for a given percentage increase (an extra 10%, say).  In such cases, a logarithmic curve rather than a straight (linear) line will fit the data better, and we will see below that indeed it does here.  And this will be useful in some statistical regression analysis that will examine possible explanations for the pattern.

It should be kept in mind, however, that what is being examined here are correlations, and with correlations alone one cannot say with certainty that the cause was the Sturgis rally.  And we obviously cannot run this experiment over repeatedly in a lab, under varying conditions, to see whether the result would always follow.

Might there be some other explanation?  Certainly there could be.  Probably the most obvious alternative is that the surge in Covid-19 cases in the upper mid-west of the US between September and early November might have been due to the onset of cold weather, as the states close to Sturgis are among the first to turn cold as winter approaches in the US.  We will examine this below.  There is, indeed, a correlation, but also a number of counter-examples (states that also turned colder, such as Maine and Vermont, but did not see such a surge in cases).  The statistical fit is also not nearly as good.

One can also examine what happened across the border in the neighboring provinces of Canada.  The weather there also turned colder in September and October, and indeed by more than in the upper mid-west of the US.  Yet the incidence of Covid-19 cases in those provinces was far less.

What would explain this?  The answer is that it is not cold weather per se that leads to the virus being spread, but rather cold weather in situations where socially responsible behavior is not being followed – most importantly mask-wearing, but also social distancing, avoidance of indoor settings conducive to the spread of the virus, and so on.  As examined in the previous post on this blog, mask-wearing is extremely powerful in limiting the spread of the virus that causes Covid-19.  But if many do not wear masks, for whatever reason, the virus will spread.  And this will be especially so as the weather turns colder and people spend more time indoors with others.

This could lead to the results seen if states that are geographically closer to Sturgis also have populations that are less likely to wear masks when they go out in public.  And we will see that this was likely indeed a factor.  For whatever reason (likely political, as the near-by states are states with high shares of Trump supporters), states geographically close to Sturgis have a generally lower share of their populations regularly wearing masks in this pandemic.  But the combination of low mask-wearing and falling temperatures (what statisticians call an interaction effect) was supplemental to, and not a replacement of, the impact of distance from Sturgis.  The distance factor remained highly significant and strong, including when controlling for October temperatures and mask-wearing, consistent with the view that Sturgis acted as a seeding event.

This post will take up each of these topics in turn.

B.  Distance to Sturgis vs. Daily New Cases of Covid-19 in the Week Ending November 6

The chart at the top of this post plots the average daily number of confirmed new cases of Covid-19 over the 7-day period ending November 6 in a state (per 100,000 of population), against the distance to Sturgis.  The data for the number of new cases each day was obtained from USAFacts, which in turn obtained the data from state health authorities.  The data on distance to Sturgis was obtained from the directions feature on Google Maps, with Sturgis being the destination and the trip origin being each of the 48 states in the mainland US (Hawaii and Alaska were excluded), plus Washington, DC.  Each state was simply entered (rather than a particular address within a state), and Google Maps then defaulted to a central location in each state.  The distance chosen was then for the route recommended by Google, in miles and on the roads recommended.  That is, these are trip miles and not miles “as the crow flies”.

When this is done, with a regular linear scale used for the mileage on the recommended routes, one obtains the chart at the top of this post.  For the week ending November 6, those states closest to Sturgis saw the highest rates of Covid-19 new cases (130 per 100,000 of population in South Dakota itself, where Sturgis is in the far western part of the state, and 200 per 100,000 in North Dakota, where one should note that Sturgis is closer to some of the main population centers of North Dakota than it is to some of the main population centers of South Dakota).  And as one goes further away geographically, the average daily number of new cases falls substantially, to only around one-tenth as much in several of the states on the Atlantic.

The model is a simple one:  The further away a state is from Sturgis, the lower its rate (per 100,000 of population) of Covid-19 new cases in the first week of November.  But it fits extremely well even though it looks at only one possible factor (distance to Sturgis).  The straight black line in the chart is the linear regression line that best fits, statistically, the scatter of points.  A statistical measure of the fit is called the R-squared, which varies between 0% and 100% and measures what share of the variation observed in the variable shown on the vertical axis of the chart (the daily new cases of Covid-19) can be predicted simply by knowing the regression line and the variable shown on the horizontal axis (the miles to Sturgis).

The R-squared for the regression line calculated for this chart was surprisingly high, at 60%.  This is astonishing.  It says that if all we knew was this regression line, then we could have predicted 60% of the variation in Covid-19 cases across states in the week ending November 6 simply by knowing how far the states are from Sturgis.  States differ in numerous ways that will affect the incidence of Covid-19 cases in their territory.  Yet here, if we know just the distance to Sturgis, we can predict 60% of how Covid-19 incidence will vary across the states.  Regressions such as these are called cross-section regressions (the data here are across states), and such R-squares are rarely higher than 20%, or at most perhaps 30%.
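As a minimal sketch of what the R-squared measures, the following fits a straight line by least squares and computes the share of the variation it accounts for.  The function name is my own and the seven data points are made up purely for illustration; they are not the state data plotted in the chart.

```python
def linear_fit(xs, ys):
    """Ordinary least squares for y = a + b*x; returns (a, b, r_squared)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # R-squared = 1 - (variation left unexplained by the line / total variation).
    ss_res = sum((y - (intercept + slope * x)) ** 2 for x, y in zip(xs, ys))
    ss_tot = sum((y - mean_y) ** 2 for y in ys)
    return intercept, slope, 1.0 - ss_res / ss_tot

# Made-up illustrative numbers, NOT the state data used in this post.
miles = [200, 500, 800, 1100, 1400, 1700, 2000]
cases = [120, 85, 70, 40, 45, 25, 20]

intercept, slope, r2 = linear_fit(miles, cases)
print(f"slope per mile: {slope:.4f}, R-squared: {r2:.1%}")
```

An R-squared of 100% would mean every point sits exactly on the line, while 0% would mean the line predicts no better than the simple average across states.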

But as was discussed above in the introduction, trip decisions involving distances often work better (fit the data better) when the scale used is logarithmic.  On a logarithmic scale, what enters into the decision to make the trip or not is not some fixed increment of distance (e.g. an extra 100 miles) but rather some proportional change (e.g. an extra 10%).  A statistical regression can then be estimated using the logarithms of the distances, and when this estimated line is re-calculated back on to the standard linear scale, one will have the curve shown in blue in the chart:

The logarithmic (or log) regression line (in blue) fits the data even better than the simple linear regression line (in black), including at the two extremes (very short and very long distances).  And the R-squared rises to 71% from the already quite high 60% of the linear regression line.  The only significant outlier is North Dakota.  If one excludes North Dakota, the R-squared rises to 77%.  These are remarkably high for a cross-section analysis.
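The logarithmic fit can be sketched as follows, assuming (as the Technical Annex describes) that only the distance is taken in natural logs while the case rate stays in levels.  The data points and function names are again made up for illustration only.

```python
import math

def fit_log_distance(miles, cases):
    """Least squares fit of cases = a + b * ln(miles); returns (a, b)."""
    log_miles = [math.log(m) for m in miles]
    n = len(miles)
    mean_x = sum(log_miles) / n
    mean_y = sum(cases) / n
    b = (sum((x - mean_x) * (y - mean_y) for x, y in zip(log_miles, cases))
         / sum((x - mean_x) ** 2 for x in log_miles))
    return mean_y - b * mean_x, b

# Made-up illustrative numbers, NOT the state data used in this post.
miles = [200, 400, 800, 1200, 1600, 2000]
cases = [130, 90, 60, 42, 35, 25]
a, b = fit_log_distance(miles, cases)

def predicted_cases(m):
    """Map the fitted line back on to the ordinary (linear) mileage scale."""
    return a + b * math.log(m)

# A doubling of distance lowers predicted cases by the same amount,
# whether it is from 200 to 400 miles or from 1,000 to 2,000 miles.
drop_near = predicted_cases(200) - predicted_cases(400)
drop_far = predicted_cases(1000) - predicted_cases(2000)
print(f"drop from doubling (near): {drop_near:.1f}")
print(f"drop from doubling (far):  {drop_far:.1f}")
```

Plotted against miles on a regular scale, this fitted relationship is a curve that falls steeply at short distances and flattens at long ones, which is the shape of the blue line.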

This simple model therefore fits the data well, indeed extremely well.  But there are still several issues to consider, starting with whether there was a similar pattern across the states before the Sturgis rally.

C.  Distance to Sturgis vs. Daily New Cases of Covid-19 in the Week Ending August 6, and the Progression in Subsequent Months

The Sturgis rally began on August 7.  Was there possibly a similar pattern as that found above in Covid-19 cases before the rally?  The answer is a clear no:

In the week ending August 6, the relationship of Covid-19 cases to distance from Sturgis was about as close to random as one can ever find.  If anything, the incidences of Covid-19 cases in the 10 or so states closest to Sturgis were relatively low.  And for all 48 states of the Continental US (plus Washington, DC), the simple linear regression line is close to flat, with an R-squared of just 0.4%.  This is basically nothing, and is in sharp contrast to the R-squared for the week ending November 6 of 60% (and 71% in logarithmic terms).

One should also note the magnitudes on the vertical scale here.  They range from 0 to 40 cases (per 100,000 of population) per day in the 7-day period.  In the chart for cases in the 7-day period ending on November 6 (as at the top of this post), the scale goes from 0 to 200.  That is, the incidence of Covid-19 cases was relatively low across US states in August (relative to what it was later in parts of the US).  That then changed in the subsequent months.  Furthermore, one can see in the charts above for the week ending November 6 that the states further than around 1,400 miles from Sturgis still had Covid new case rates of 40 per day or less.  That is, the case incidence rates remained in that 0 to 40 range between August and early November for the states far from Sturgis.  The states where the rates rose above this were all closer to Sturgis.

There was also a steady progression in the case rates in the months from August to November, focused on the states closer to Sturgis, as can be seen in the following chart:

Each line is the linear regression line found by regressing the number of Covid-19 cases in each state (per 100,000 of population) for the week ending August 6, the week ending September 6, the week ending October 6, and the week ending November 6, against the geographic distance to Sturgis.  The regression lines for the week ending August 6 and the week ending November 6 are the same as discussed already in the respective charts above.  The September and October ones are new.

As noted before, the August 6 line is essentially flat.  That is, the distance to Sturgis made no difference to the number of cases, and they are also all relatively low.  But then the line starts to twist upwards, with the right end (for the states furthest from Sturgis) more or less fixed and staying low, while the left end rotated upwards.  The rotation is relatively modest for the week ending September 6, is more substantial in the month later for the week ending October 6, and then the largest in the month after that for the week ending November 6.  This is precisely the path one would expect to find with an exponential spread of an infectious disease that has been seeded but then not brought under effective control.

D.  Might Falling Temperatures Account for the Pattern?

The charts above are consistent with Sturgis acting as a seeding event that later then led to increases in Covid-19 cases that were especially high in near-by states.  But one needs to recognize that these are just correlations, and by themselves cannot prove that Sturgis was the cause.  There might be some alternative explanation.

One obvious alternative would be that the sharp increase in cases in the upper mid-west of the US in this period was due to falling temperatures, as the northern hemisphere winter approached.  These areas generally grow colder earlier than in other parts of the US.  And if one plots the state-wide average temperatures in October (as reported by NOAA) against the average number of Covid-19 cases per day in the week ending November 6 one indeed finds:

There is a clear downward trend:  States with lower average temperatures in October had more cases (per 100,000 of population) in the week ending November 6.  The relationship is not nearly as tight as that found for the one based on geographic distance from Sturgis (the R-squared is 35% here, versus 60% for the linear relationship based on distance), but 35% is still respectable for a cross-state regression such as this.

However, there are some counterexamples.  The average October temperatures in Maine and Vermont were colder than all but 7 or 10 states (for Maine and Vermont, respectively), yet their Covid-19 case rates were the two lowest in the country.

More telling, one can compare the rates in North and South Dakota (with the two highest Covid-19 rates in the country in the week ending November 6) plus Montana (adjacent and also high) with the rates seen in the Canadian provinces immediately to their north:

The rates are not even close.  The Canadian rates were all far below those in the US states to their south.  The rate in North Dakota was fully 30 times higher than the rate in Saskatchewan, the Canadian province just to its north.  There is clearly something more than just temperature involved.

E.  The Impact of Wearing Masks, and Its Interaction With Temperature

That something is the actions followed by the state or provincial populations to limit the spread of the virus.  The most important is the wearing of masks, which has proven to be highly effective in limiting the spread of this infectious disease, in particular when complemented with other socially responsible behaviors such as social distancing, avoiding large crowds (especially where many do not wear masks), washing hands, and so on.  Canadians have been far more serious in following such practices than many Americans.  The result has been far fewer cases of Covid-19 (as a share of the population) in Canada than in the US, and far fewer deaths.

Mask wearing matters, and could be an alternative explanation for why states closer to Sturgis saw higher rates of Covid-19 cases.  If a relatively low share of the populations in the states closer to Sturgis wear masks, then this may account for the higher incidence of Covid-19 cases in those near-by states.  That is, perhaps the states that are geographically closer to Sturgis just happen also to be states where a relatively low share of their populations wear masks, with this then possibly accounting for the higher incidence of cases in those states.

However, mask-wearing (or the lack of it), by itself, would be unlikely to fully account for the pattern seen here.  Two things should be noted.  First, while states that are geographically closer to Sturgis do indeed see a lower share of their population generally wearing masks when out in public, the relationship to this geography is not as strong as the other relationships we have examined:

The data in the chart for the share who wear masks by state come from the COVIDCast project at Carnegie Mellon University, and were discussed in the previous post on this blog.  The relationship found is indeed a positive one (states geographically further from Sturgis generally have a higher share of their populations wearing masks), but there is a good deal of dispersion in the figures and the R-squared is only 27.5%.  This, by itself, is unlikely to explain the Covid-19 rates across states in early November.

Second, and more importantly:  While the states closer to Sturgis generally have a lower share of mask-wearing, this would not explain why one did not see similarly higher rates of Covid-19 incidence in those states in August.  Mask-wearing was likely similar.  The question is why Covid-19 incidence rose in those states between August (following the Sturgis rally) and November, and not simply why it was high in those states in November.

However, mask-wearing may well have been a factor.  But rather than accounting for the pattern all by itself, it may have had an indirect effect.  With the onset of colder weather, more time would be spent with others indoors, and wearing a mask when in public is particularly important in such settings.  That is, it is the combination of both a low share of the population wearing masks and the onset of colder weather which is important, not just one or the other.

These are called interaction effects, and investigating them requires more than can be depicted in simple charts.  Multiple regression analysis (regression analysis with several variables – not just one as in the charts above) can allow for this.  Since it is a bit technical, I have relegated a more detailed discussion of these results to a Technical Annex at the conclusion of this post for those who are interested.

Briefly, a regression was estimated that includes miles from Sturgis, average October temperatures, the share who wear masks when out in public, plus an interaction effect between the share wearing masks and October temperatures, all as independent variables affecting the observed Covid-19 case rates of the week ending November 6.  And this regression works quite well.  The R-squared is 75.4%, and each of the variables (including the interaction term) is either highly significant (miles from Sturgis) or marginally so (a significance level of between 6 and 8% for the other variables, which is slightly worse than the 5% significance level commonly used, but not by much).

Note in particular that the interaction term matters, and matters even while each of the other variables (miles to Sturgis, October temperatures, and mask-wearing) is taken into account individually as well.  In the interaction term, it is not simply the October temperatures or the share wearing masks that matter, but the two acting together.  That is, the impact of relatively low temperatures in October will matter more in those states where mask-wearing is low than it would in states where mask-wearing is high.  If people generally wore masks when out in public (and followed also the other socially responsible behaviors that go along with it), the falling temperatures would not matter as much.  But when they don’t, the falling temperatures matter more.
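How an interaction term produces this result can be sketched with a little arithmetic.  In a regression that includes the log of October temperature, the mask share, and the product of the two, the effect of temperature on cases is no longer a single number but depends on the mask share.  The coefficients below are hypothetical, chosen only to show the sign pattern, and the mask share is assumed to be measured in percent; none of this reproduces the estimated equation.

```python
def temperature_effect(b_temp, b_interaction, mask_share):
    """Marginal effect of ln(October temperature) on cases at a given mask share.

    For cases = ... + b_temp*ln(T) + b_mask*M + b_interaction*ln(T)*M,
    the derivative with respect to ln(T) is b_temp + b_interaction*M.
    """
    return b_temp + b_interaction * mask_share

# Hypothetical coefficients (NOT the estimates from this post).
B_TEMP = -400.0
B_INTERACTION = 4.0

effect_low_masks = temperature_effect(B_TEMP, B_INTERACTION, mask_share=70.0)
effect_high_masks = temperature_effect(B_TEMP, B_INTERACTION, mask_share=90.0)

print(f"effect of ln(temperature) where 70% wear masks: {effect_low_masks:.0f}")
print(f"effect of ln(temperature) where 90% wear masks: {effect_high_masks:.0f}")
```

With these assumed numbers, a given fall in temperature raises cases three times as much where 70% wear masks as where 90% do.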

From this overall regression equation, one can also use the coefficients found to estimate what the impact would be of small changes in each of the variables.  These are called elasticities, and based on the estimated equation (and computing the changes around the sample means for each of the variables):  a 1% reduction in the number of miles from Sturgis would lead to a 1.0% rise in the incidence of Covid-19 cases; a 1% reduction (not a 1 percentage point increase, but rather a 1% reduction from the sample mean) in the share of the population wearing masks when out in public would lead to a 1.7% rise in the incidence of Covid-19 cases; and a 1% reduction in the average October temperature across the different states would lead to a 1.2% rise in the incidence of Covid-19 cases.  All of these elasticity estimates look quite plausible.
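For readers who want to see how such elasticities are computed, here is a minimal sketch.  It assumes the specification described in the Technical Annex (case rates in levels, miles and temperature in natural logs, mask share in levels, plus the interaction term).  The coefficients and sample means are hypothetical placeholders, not the estimated values, so the resulting elasticities will not match those reported above.

```python
import math

# Hypothetical coefficients and sample means; NOT the estimates from this post.
B_MILES, B_TEMP, B_MASK, B_INTERACTION = -30.0, -400.0, -17.0, 4.0
MEAN_CASES = 40.0               # daily new cases per 100,000 (assumed)
MEAN_MASK = 85.0                # percent wearing masks (assumed)
MEAN_LN_TEMP = math.log(50.0)   # log of an assumed 50 degree average

# For a variable entered in logs, the coefficient is the change in cases per
# unit change in ln(x), so dividing by mean cases gives the elasticity.
elasticity_miles = B_MILES / MEAN_CASES

# Temperature also works through the interaction term, evaluated at the mean mask share.
elasticity_temp = (B_TEMP + B_INTERACTION * MEAN_MASK) / MEAN_CASES

# Mask share is in levels, so scale its marginal effect by (mean mask / mean cases).
elasticity_mask = (B_MASK + B_INTERACTION * MEAN_LN_TEMP) * MEAN_MASK / MEAN_CASES

print(f"miles: {elasticity_miles:.2f}")
print(f"temperature: {elasticity_temp:.2f}")
print(f"masks: {elasticity_mask:.2f}")
```

A negative elasticity of, say, -0.75 for miles means that a 1% reduction in distance goes with a 0.75% rise in the case rate.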

These results are consistent with an explanation where the Sturgis rally acted as a significant superspreader event that led to increased seeding of the virus in the locales to which participants returned, in near-by states especially.  This then led to significant increases in the incidence of Covid-19 cases in the different states as this infectious disease spread to friends and family and others in the subsequent months, and again especially in the states closest to Sturgis.  Those increases were highest in the states that grew colder earlier than others and where the share of the population wearing masks regularly was relatively low.  That is, the interaction of the two mattered.  But even with this effect controlled for, along with controlling also for the impact of colder temperatures and for the impact of mask-wearing, the impact of miles to Sturgis remained and was highly significant statistically.

F.  Conclusion

As noted above, the analysis here cannot and does not prove that the Sturgis rally acted as a superspreader event.  There was only one Sturgis rally this year, one cannot run repeated experiments of such a rally under various alternative conditions, and the evidence we have consists simply of correlations of various kinds.  It is possible that there may be some alternative explanation for why Covid-19 cases started to rise sharply in the weeks after the rally in the states closest to Sturgis.  It is also possible it is all just a coincidence.

But the evidence is consistent with what researchers have already found on how the virus that causes Covid-19 is spread.  Studies have found that as few as 10% of those infected may account for 80% of those subsequently infected with the virus.  And it is not just the biology of the disease and how a person reacts to it, but also whether the individual is then in situations with the right conditions to spread it on to others.  These might be as small as family gatherings, or as large as big rallies.  When large numbers of participants are involved, such events have been labeled superspreader events.

Among the most important of conditions that matter is whether most or all of those attending are wearing masks.  It also matters how close people are to each other, whether they are cheering, shouting, or singing, and whether the event is indoors or outdoors.  And the likelihood that at least one attendee who is infectious might be there rises rapidly with the number of attendees, so the size of the gathering very much matters.
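The effect of the size of a gathering can be illustrated with a simple calculation.  It assumes, as a simplification, that each attendee is independently infectious with the same probability; the 1% prevalence used here is a hypothetical figure, not an estimate for any actual event.

```python
def prob_infectious_attendee(prevalence, attendees):
    """Chance that at least one attendee is infectious, assuming each attendee
    is independently infectious with probability `prevalence`."""
    # One minus the chance that every single attendee is NOT infectious.
    return 1.0 - (1.0 - prevalence) ** attendees

ASSUMED_PREVALENCE = 0.01  # hypothetical: 1 in 100 attendees infectious

for size in (10, 150, 2000):
    chance = prob_infectious_attendee(ASSUMED_PREVALENCE, size)
    print(f"{size:>5} attendees: {chance:.1%}")
```

Under these assumptions the chance is modest for a small family gathering but approaches certainty once attendance reaches the thousands.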

A number of recent White House events matched these conditions, and a significant number of attendees soon after tested positive for Covid-19.  In particular, about 150 attended the celebration on September 26 announcing that Amy Coney Barrett would be nominated to the Supreme Court to take the seat of the recently deceased Ruth Bader Ginsburg.  Few wore masks, and at least 18 attendees later tested positive for the virus.  And about 200 attended an election night gathering at the White House.  At least 6 of those attending later tested positive.  While one can never say for sure where someone may have contracted the virus, such clusters among those attending such events are very unlikely unless the event was where they got the virus.  It is also likely that these figures are undercounts, as White House staff have been told not to let it become publicly known if they come down with the virus.  Finally, as of November 13 at least 30 uniformed Secret Service officers, responsible for security at the White House, have tested positive for the coronavirus in the preceding few weeks.

There is also increasing evidence that the Trump campaign rallies of recent months led to subsequent increases in Covid-19 cases in the local areas where they were held.  These ranged from studies of individual rallies (such as 23 specific cases traced to three Trump rallies in Minnesota in September), to a relatively simple analysis that looked at the correlation between where Trump campaign rallies were held and subsequent increases in Covid-19 cases in that locale, to a rigorous academic study that examined the impact of 18 Trump campaign rallies on the local spread of Covid-19.  This academic study was prepared by four members of the Department of Economics at Stanford (including the current department chair, Professor B. Douglas Bernheim).  They concluded that the 18 Trump rallies led to an estimated extra 30,000 Covid-19 cases in the US, and 700 additional deaths.

One should expect that the Sturgis rally would act as even more of a superspreader event than those campaign rallies.  An estimated 460,000 motorcyclists attended the Sturgis rally, while the campaign rallies involved at most a few thousand at each.  Those at the Sturgis rally could also attend for up to ten days; the campaign rallies lasted only a few hours.  Finally, there would be a good deal of mixing of attendees at the multiple parties and other events at Sturgis.  At a campaign rally, in contrast, people would sit or stand at one location only, and hence only be exposed to those in their immediate vicinity.

The results are also consistent with a rigorous academic study of the more immediate impact of the Sturgis rally on the spread of Covid-19, by Professor Joseph Sabia of San Diego State University and three co-authors.  Using anonymous cell phone tracking data, they found that counties across the US that received the highest inflows of returning participants from the Sturgis rally saw, in the immediate weeks following the rally (up to September 2), an increase of 7.0 to 12.5% in the number of Covid-19 cases relative to the counties that did not contribute inflows.  But their study (issued as a working paper in September) looked only at the impact in the immediate few weeks following Sturgis.  They did not consider what such seeding might then have led to.  The results examined in the analysis here, which is longer-term (up to November 6), are consistent with their findings.

It is therefore fully plausible that the Sturgis rally acted as a superspreader event.  And the evidence examined in this post supports such a conclusion.  While one cannot prove this in a scientific sense, as noted above, the likelihood looks high.

Finally, as I finish writing this, the number of deaths in the US from this terrible virus has just surpassed 250,000.  The number of confirmed cases has reached 11.6 million, with this figure rising by 1 million in just the past week.  A tremendous surge is underway, far surpassing the initial wave in March and April (when the country was slow to discover how serious the spread was, due in part to the botched development in the US of testing for the virus), and far surpassing also the second, and larger, wave in June and July (when a number of states, in particular in the South and Southwest, re-opened too early and without adequate measures, such as mask mandates, to keep the disease under control).  Daily new Covid-19 cases are now close to 2 1/2 times what they were at their peak in July.

This map, published by the New York Times (and updated several times a day), shows how bad this has become.  It is also revealing that the worst parts of the country (the states with the highest number of cases per 100,000 of population) are precisely the states geographically closest to Sturgis.  There is certainly more behind this than just the Sturgis rally.  But it is highly likely the Sturgis rally was a significant contributor.  And if more cases are to be averted, it is extremely important to understand and recognize the possible role of events such as the rally at Sturgis.

Average Daily Cases of Covid-19 per 100,000 Population

7-Day Average for Week Ending November 18, 2020

Source:  The New York Times, “Covid in the US:  Latest Map and Case Count”.  Image from November 19, with data as of 8:14 am.

 


Technical Annex:  Regression Results

As discussed in the text, a series of regressions were estimated to explore the relationship between the Sturgis rally and the incidence of Covid-19 cases (the 7-day average of confirmed new cases in the week ending November 6) across the states of the mainland US plus Washington, DC.  Five will be reported here, with regressions on the incidence of Covid-19 cases (as the dependent variable) as a function of various combinations of three independent variables: miles from Sturgis (in terms of their natural logarithms), the average state-wide temperature in October (also in terms of their natural logarithms), and the share of the population in the respective states who reported they always or most of the time wore masks when out in public.  Three of the five regressions are on each of the three independent variables individually, one on the three together, and one on the three together along with an interaction effect measured by multiplying the October temperature variable (in logs) with the share wearing masks.  The sources for each variable were discussed above in the main text.
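For reference, the most general of the five specifications (the one with the interaction term) can be written out as a single equation.  The notation is introduced here only for illustration and does not appear in the original analysis: $C_i$ is the incidence of Covid-19 cases in state $i$, $D_i$ the miles from Sturgis, $T_i$ the average October temperature, and $M_i$ the share wearing masks.

```latex
C_i = \beta_0 + \beta_1 \ln D_i + \beta_2 \ln T_i + \beta_3 M_i
      + \beta_4 \left( \ln T_i \times M_i \right) + \varepsilon_i
```

The other four regressions are this same equation with some terms dropped: the fourth sets $\beta_4 = 0$, and each single-variable regression keeps only the intercept and one of the slopes.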

The basic results, with each regression by column, are summarized in the following table:

Regressions on State Covid-19 Cases – November 6

     Miles to Sturgis and Temperatures are in natural logs

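| | Miles only | Temp only | Masks only | Miles, Temp, & Masks | All with Interaction |
|---|---|---|---|---|---|
| Miles to Sturgis: slope | -54.9 | | | -41.9 | -36.6 |
| Miles to Sturgis: t-statistic | -10.7 | | | -5.2 | -4.3 |
| Avg Temperature: slope | | -133.3 | | -45.5 | -516.8 |
| Avg Temperature: t-statistic | | -5.5 | | -2.0 | -1.9 |
| Share Wear Masks: slope | | | -3.1 | -0.8 | -22.4 |
| Share Wear Masks: t-statistic | | | -3.9 | -1.3 | -1.8 |
| Interaction Temp & Masks: slope | | | | | 5.44 |
| Interaction Temp & Masks: t-statistic | | | | | 1.8 |
| Intercept | 425.5 | 572.5 | 309.4 | 582.5 | 2,422.5 |
| Intercept: t-statistic | 11.9 | 6.0 | 4.5 | 7.1 | 2.3 |
| R-squared | 71.0% | 39.4% | 24.2% | 73.7% | 75.4% |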

In the regressions with each independent variable taken individually, all the coefficients (slopes) found are highly significant.  The general rule of thumb is that a significance level of 5% is adequate to call the relationship statistically “significant” (i.e. that an estimated coefficient this far from zero would be unlikely to arise just from random variation in the data).  A t-statistic of 2.0 or higher in absolute value, in a large sample, would signal significance at least at the 5% level (that is, if the true coefficient were zero, an estimate this far from zero would arise by chance less than 5% of the time), and the t-statistics are each well in excess of 2.0 in absolute value in each of the single-variable regressions.  The R-squared is quite high, at 71.0%, for the regression on miles from Sturgis, but more modest in the other two (39.4% and 24.2% for October temperature and mask-wearing, respectively).
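The link between a t-statistic and a significance level can be made concrete with a short calculation.  The sketch below is not the method used to produce the table.  It is a minimal illustration that integrates the Student's t density numerically (Simpson's rule) to get a two-sided p-value.  It assumes 49 observations (48 mainland states plus Washington, DC), so the degrees of freedom are 49 minus the number of estimated coefficients; that count is an inference from the text rather than a figure reported in it.

```python
import math


def t_pdf(x: float, df: int) -> float:
    """Density of Student's t distribution with df degrees of freedom."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p_value(t_stat: float, df: int, steps: int = 2000) -> float:
    """Probability of seeing |T| >= |t_stat| if the true coefficient were zero.

    Uses Simpson's rule on [0, |t_stat|]; steps must be even.
    """
    a = abs(t_stat)
    h = a / steps
    total = t_pdf(0.0, df) + t_pdf(a, df)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * t_pdf(i * h, df)
    central_mass = 2 * (h / 3) * total  # P(-a <= T <= a), by symmetry
    return max(0.0, 1 - central_mass)


# Assumed sample size: 48 mainland states plus DC (not stated as a number in the text).
N_OBS = 49

# Single-variable regression on miles: 2 coefficients (slope and intercept).
p_miles_only = two_sided_p_value(-10.7, N_OBS - 2)

# Full regression with interaction: 5 coefficients.
p_temp_full = two_sided_p_value(-1.9, N_OBS - 5)
p_masks_full = two_sided_p_value(-1.8, N_OBS - 5)

print(f"miles only: p = {p_miles_only:.2e}")
print(f"temperature, full equation: p = {p_temp_full:.3f}")
print(f"masks, full equation: p = {p_masks_full:.3f}")
```

Under these assumptions, a t-statistic near 2.0 sits right at the 5% threshold, while values of 1.8 to 1.9 fall somewhat short of it.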

The estimated coefficients (slopes) are also all negative.  That is, the incidence of Covid-19 goes down with additional miles from Sturgis, with higher October temperatures, and with higher mask-wearing.  The actual coefficients themselves should not be compared to each other for their relative magnitudes.  Their size will depend on the units used for the individual measures (e.g. miles for distance, rather than feet or kilometers; or temperature measured on the Fahrenheit scale rather than Centigrade; or shares expressed as, say, 80 for 80% instead of 0.80).  But the units chosen do not matter for the substance of the results.  Rather, what is of interest is how the predicted incidence of Covid-19 changes when there is, say, a 1% change in any of the independent variables.  These are elasticities and will be discussed below.

In the fourth regression equation (the fourth column), where the three independent variables are all included, the statistical significance of the mask-wearing variable drops, with a t-statistic of just 1.3 in absolute value.  The t-statistic of the temperature variable also falls, to 2.0, which is at the borderline for the general rule of thumb of a 5% significance level.  The miles from Sturgis variable remains highly significant (its t-statistic also fell, but remains extremely high).  If one stopped here, it would appear that what matters is distance from Sturgis (consistent with Sturgis acting as a seeding event), coupled with October temperatures falling (so that the thus seeded virus spread fastest where temperatures had fallen the most).

But as was discussed above in the main text, there is good reason to view the temperature variable as acting not solely by itself, but in an interaction with whether masks are generally worn or not.  This is tested in the fifth regression, where the three individual variables are included along with an interaction term between temperatures and mask-wearing.  The temperature, mask-wearing, and interaction variables now all have a similar level of significance, although falling just short of the 5% level (significant at 6% to 8% for each).  While not quite 5%, keep in mind that the 5% is just a rule of thumb.  Note also that the positive sign on the interaction term (the 5.44) is an indication of curvature.  The positive sign, coupled with the negative signs for the temperature and mask-wearing variables taken alone, indicates that the curves are concave facing upwards (the effects of temperature and mask-wearing diminish at the margin at higher values for the variables).  Finally, the miles to Sturgis variable remains highly significant.
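How the interaction term changes the picture can be seen by computing the marginal effects directly from the fifth-column coefficients.  The sketch below is illustrative only.  It assumes the mask share is expressed in percentage points (e.g. 80 for 80%) and temperature in degrees Fahrenheit, which is what the magnitudes of the coefficients suggest but which the text does not state outright.  The input values used in the example calls are hypothetical and are not taken from the underlying data.

```python
import math

# Coefficients from the fifth column ("All with Interaction") of the table above.
INTERCEPT = 2422.5
B_MILES = -36.6      # slope on ln(miles to Sturgis)
B_TEMP = -516.8      # slope on ln(average October temperature)
B_MASKS = -22.4      # slope on share wearing masks
B_INTERACT = 5.44    # slope on ln(temperature) x mask share


def predicted_cases(miles: float, temp: float, mask_share: float) -> float:
    """Fitted Covid-19 incidence from the full equation.

    Units are assumptions: temp in Fahrenheit, mask_share in percentage points.
    """
    ln_temp = math.log(temp)
    return (INTERCEPT
            + B_MILES * math.log(miles)
            + B_TEMP * ln_temp
            + B_MASKS * mask_share
            + B_INTERACT * ln_temp * mask_share)


def temp_effect(mask_share: float) -> float:
    """Change in fitted cases per unit change in ln(temperature), at a given mask share."""
    return B_TEMP + B_INTERACT * mask_share


def mask_effect(temp: float) -> float:
    """Change in fitted cases per percentage point of mask-wearing, at a given temperature."""
    return B_MASKS + B_INTERACT * math.log(temp)


# Hypothetical mask shares and temperatures, chosen only to show the pattern.
for share in (70, 80, 90):
    print(f"mask share {share}%: effect of ln(temp) = {temp_effect(share):.1f}")
for temp in (45, 55, 65):
    print(f"October temp {temp}F: effect of one point of masks = {mask_effect(temp):.2f}")
```

With these coefficients, the effect of temperature on cases shrinks toward zero as the mask share rises, and the effect of mask-wearing shrinks as temperatures rise, which is the pattern the positive interaction term implies.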

Based on this fifth regression equation, with the interaction term allowed for, what would be the estimated response of Covid-19 cases to changes in any of the independent variables (miles to Sturgis, October temperatures, and mask-wearing)?  These are normally presented as elasticities, that is, the predicted percentage change in Covid-19 cases when one assumes a small (1%) change in any of the independent variables.  In a mixed equation such as this, where some terms are linear and some logarithmic (plus an interaction term), the resulting percentage change can vary depending on the starting point chosen.  The conventional starting point taken is normally the sample means, and that will be done here.

Also, I have expressed the elasticities here in terms of a 1% decrease in each of the independent variables (since our interest is in what might lead to higher rates of Covid-19 incidence):

Elasticities from Full Equation with Interaction Term

      Percent Increase in Number of Covid-19 Cases from a 1% Decrease Around Sample Means

| | Elasticity |
|---|---|
| Miles to Sturgis | 1.02% |
| October Temperature | 1.16% |
| Share Wearing Masks | 1.69% |

All these estimated elasticities are quite plausible.  If one is 1% closer in geographic distance to Sturgis (starting at the sample mean, and with the other two variables of October temperature and mask-wearing also at their respective sample means), the incidence of Covid-19 cases (per 100,000 of population) as of the week ending November 6 would increase by an estimated 1.02%.  A 1% lower October temperature (from the sample mean) would lead to an estimated 1.16% increase in Covid-19 cases.  And the impact of the share wearing masks is important and stronger, where a 1% reduction in the share wearing masks would lead to an estimated 1.69% increase in cases, with all the other factors here taken into account and controlled for.
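For readers who want to see where such elasticities come from, they can be derived from the full equation with the interaction term.  The notation below is introduced only for illustration ($C$ for the incidence of cases, $D$ for miles from Sturgis, $T$ for October temperature, $M$ for the share wearing masks, and $\beta_1$ through $\beta_4$ for the slopes on $\ln D$, $\ln T$, $M$, and the interaction, respectively).  This is a sketch of the standard point-elasticity calculation, which may differ in detail from how the figures in the table were computed.

```latex
\begin{aligned}
\frac{\partial C}{\partial D}\,\frac{D}{C} &= \frac{\beta_1}{C} \\[4pt]
\frac{\partial C}{\partial T}\,\frac{T}{C} &= \frac{\beta_2 + \beta_4 M}{C} \\[4pt]
\frac{\partial C}{\partial M}\,\frac{M}{C} &= \frac{\left(\beta_3 + \beta_4 \ln T\right) M}{C}
\end{aligned}
```

Each expression is evaluated at the sample means, and its sign is reversed to express the response to a 1% decrease rather than a 1% increase.  Because the sample means are not reported here, the table's figures cannot be reproduced from these formulas alone.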

These results are consistent with a conclusion that the Sturgis rally led to a significant seeding of cases, especially in nearby states, with the number of infections then growing over time as the disease spread.  The cases grew faster in those states where mask-wearing was relatively low, and in states with lower temperatures in October (leading people to spend more time indoors).  When the falling temperatures were coupled with a lower share (than elsewhere) of the population wearing masks, the rate of Covid-19 cases rose especially fast.