The World Bank Doing Business Report: Changes in Rank May Not Mean What They May Appear to Mean

A.  The Issue

The World Bank has been publishing its Doing Business report annually, from the first released in September 2003 (and titled Doing Business in 2004) until the one released in October 2019 (and titled Doing Business 2020).  It has always been a controversial report, criticized for a number of different reasons.  But it has also been one that gained a good deal of attention – from the news media, from investors, and from at least certain segments of the public.  And because it received a good deal of attention (indeed more than any other World Bank report), governments paid attention to what it said, especially about them.

It is not my intention here to review those criticisms.  They are numerous.  Rather, in this post I will present a different approach to how the results could have been presented, which would have been more informative and which might have quieted some (but not all) of the criticisms.

One feature of the report, which was also its most widely discussed aspect, was its ranking of countries in terms of their (assessed) environment for doing business. Because the Doing Business reports were widely cited, countries paid attention to those rankings and by various methods sought to improve their rankings.  These efforts were often positive and productive, with countries seeking to simplify business regulation so that businesses could operate more effectively.  While some criticized this as simplistic, it is difficult to see a rationale, for example, justifying that in Venezuela (in 2020), to start a business would require completing 20 different procedures normally requiring (given the times for review and other requirements) an estimated 230 days.  In contrast, in New Zealand, starting a business involves only one procedure and can be completed in half a day.  Of course, few new businesses in Venezuela actually take 230 days to get started.  Either the regulations are simply ignored (with bribes then paid to the police to leave the business alone despite its violation of the regulations), or bribes are paid to obtain the licenses without going through the complex processes.

Given the prominence of the report, some countries undertook to improve their rankings through less positive means.  Sometimes the system could be gamed in various ways (e.g. to enact some “reform”, but then to limit its application narrowly so as to fit the letter, but not the spirit, of what was being assessed).  Or sometimes pressure could be applied on those submitting the data to slant it in a favorable direction.

And sometimes a country would try to apply political pressure on World Bank management in order to secure more favorable treatment.  A recently released independent investigation (commissioned by the World Bank Board, and undertaken by the law firm WilmerHale) found that such pressure (or at least what was perceived as such pressure) was brought by China on Bank management to ensure its ranking in the 2018 report would not fall.  At the instigation of the president’s office, and then overseen by the then #2 at the World Bank (Kristalina Georgieva – now the head of the IMF), Bank staff were directed to re-examine the ratings that had been assigned to China, and indeed re-examine the methodology more broadly, to see whether with some set of changes China’s ranking would not fall.  According to the WilmerHale report, several different approaches were considered and tried until one was found which would lead to the desired outcome.  In the almost final draft of the report, just before its planned publication, China’s ranking would have fallen from #78 in 2017 to #85 in 2018 – a fall of 7 places.  But with the last minute changes overseen by Georgieva, China’s ranking in 2018 became #78 – the same as in 2017.  And that was what was then published.

B.  An Alternative Approach

This focus on rankings is misguided, although not surprising given how the results are presented.  A country may well have enacted significant measures improving its business environment, but if countries a bit below it in the previous year’s rankings did even more, then the country could see a fall in its ranking despite the reforms it had undertaken.  And that fall in ranking would then often be interpreted in the news media (as well as by others) as if the business environment had deteriorated.  In this example, it had not.  Rather, it just did not improve as much as others.  But this nuance could easily be missed, and often was.

A different presentation in the Doing Business reports could have addressed this, and might have made the report a bit less controversial.  Rankings, by their nature, are a zero-sum game, where a rise in the ranking of one country means a fall in the ranking of another.  While this is needed in a sports league, where a tiny difference in the seasonal won/loss record can determine the ranking and hence who goes to the playoffs, it is not the same for the business environment.  Investors want to know what is good and what is not so good in an absolute sense, and small differences in rankings are of no great consequence.  And while some might argue that countries are competing in a global market for investor interest, there are far more important issues for any investor than some small difference in rankings.  For example, China and Malta were ranked similarly for several years in the mid-2010s, but whether one was a bit above or a bit below the other would be basically irrelevant to an investor.

An alternative presentation of the findings, based on comparison to an absolute rather than relative scale, would have been more meaningful as well as less controversial.  Specifically, countries could have been compared not against each other in every given year to see if their rankings had moved up or down, but rather to what the set of scores were (and consequent rankings were) in some base year.  That is, one would see whether their business environment had gotten better or worse in terms of the base year set of scores and consequent rankings (what one could call that base year’s “ladder” of scores).  The Doing Business project had that data, reported the underlying scores, and used those scores to produce its rankings.  But there was no systematic presentation of how the business environments may have changed over time, with this then compared to what the rankings were in some base period.

There were, I should note, figures provided in the Doing Business reports on the one-year changes in scores for each individual country, but few paid much attention to them.  They were for one-year changes only, and did not show the more meaningful cumulative changes over a number of years.  Nor did they then show how such changes would affect the country’s position on an understandable scale, such as what the ratings were across countries in a base year.

In this new approach, the comparison would be to an absolute scale (the scores and consequent rankings in the base year), not a relative ranking in each future period.  The issue is analogous to different types of poverty measures.  Sometimes poverty in a country is measured as the bottom 10% of the population (ranked by income).  This is a useful measure for certain things (such as how to target various social programs), but will of course always show 10% of the population as being “poor” regardless of how effective the social programs may have been.  In contrast, one could have an absolute measure of poverty (for many years the World Bank used the measure of $1 per day of income per person), and one could then track how many people were moving out of poverty (or not) by this measure.  Similarly, with the relative ranking of the 190 countries covered in the Doing Business reports, one will always find a mean ranking of 95.5 (the average of the ranks 1 through 190), regardless of what countries may have done to improve their business environment.  It is, obviously, simply a relative measure, and not terribly useful in conveying what has been happening to the business environment in any given country.  Yet everyone focuses on it.

Any comparisons over time of these scores must also be over periods where the methodological approach used (precisely what is measured, and what weights are assigned to those measures) has not changed.  Otherwise one is comparing apples to oranges, where changes in the scores may reflect the methodological changes and not necessarily changes in the business environment of the countries.  Such changes in the methodological specifics have been criticized by some, as they will, in themselves, lead to changes in rankings.  But one should expect periodic changes in the methodological approach used, as experience is gained and more is learned.

C.  The Results

The 2016 to 2020 period was one where the methodological specifics used in the Doing Business reports did not change, and hence one can make a meaningful comparison of scores over this period.  The data needed are what they specifically call the “Doing Business Scores”, which are the absolute values for the indices used to measure the business environment.  They range from 100 (for the best possible) to 0 (for the worst).  These scores have (at least until now) been made publicly available in the online Doing Business database.  I used these to illustrate what could be done to present the Doing Business results based on an absolute, rather than relative, scale.

The results are presented in a table at the end of this post, and readers might want to take a brief look at that table now to see its structure.  The base year is 2016, and the calculations are based on the changes in the country’s Doing Business Scores over the period from 2016 to 2020.  No country scores 100, and the top-ranked country (New Zealand) had an overall score of 87.1 in 2016 and slightly less at 86.8 in 2020.  I re-normalized the scores to set the top score (New Zealand’s 87.1 in 2016) to 100, with the rest then scaled in proportion to that.

The scores are thus shown as a proportion to what the best overall score was in 2016.  And this was done not just for the 2016 scores but also for the 2020 scores.  Importantly, the 2020 scores are not taken as a proportion of the best score for any country in 2020, but rather as a proportion of what the best score was in 2016.  Those scores are shown in the last two columns of the table, first for 2016 (as a proportion of what the best score was in 2016 – New Zealand at 87.1), and then for 2020 (again as a proportion of what the best score was in 2016 – New Zealand’s 87.1).  Note that on this scaling, New Zealand in 2016 would be 100.00, while in 2020 its rating would fall very slightly to 99.66 (as the Doing Business Score for New Zealand fell slightly from 87.1 in 2016 to 86.8 in 2020, and 86.8 is 99.66% of the 2016 score of 87.1).
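The rescaling described here is simple enough to sketch in a few lines of Python.  This is only an illustration of the arithmetic, not the spreadsheet actually used for the table; the function name `rescale_to_base_best` is my own, and the only data are the two New Zealand scores quoted above.

```python
# Doing Business Score of the top-ranked country in the 2016 base year
# (New Zealand), as quoted in the text.
BEST_2016 = 87.1


def rescale_to_base_best(score, base_best=BEST_2016):
    """Express a Doing Business Score as a percentage of the best base-year score.

    The helper name is hypothetical; it only restates the arithmetic in the text.
    """
    return 100.0 * score / base_best


# New Zealand in 2016 and in 2020, both measured against the 2016 best score.
nz_2016 = rescale_to_base_best(87.1)
nz_2020 = rescale_to_base_best(86.8)
print(f"{nz_2016:.2f} {nz_2020:.2f}")  # prints "100.00 99.66"
```

The key point is that the denominator stays fixed at the 2016 best score for both years, so a change in the rescaled figure reflects only a change in the country's own score.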

The first two columns in the table show the rankings of countries, as the Doing Business reports have them now, for 2016 and then for 2020.  The sequence in the table is according to the ranking in 2016.  The third (middle) column then shows where a country would have ranked in 2016, had they had then the business environment that they had in 2020.  These are shown as fractional “rankings”, where, for example, if a country’s score in 2020 was halfway between the scores of countries ranked #10 and #11 in 2016, then that country would have a “rank” of 10.5.  That is, that country would have ranked between those ranked #10 and #11 in 2016, had they had a business environment in 2016 that they in fact had in 2020.
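For readers who want to see the mechanics, the fractional “rank” can be sketched as a linear interpolation between adjacent rungs of the 2016 ladder.  The Python below is a minimal sketch under my own assumptions: the function name is hypothetical, the ladder is just the top 16 rescaled 2016 scores copied from the table, and the handling of ties (two countries with the same 2016 score) is a simplification that may differ from what was done for the table.

```python
def rank_on_base_ladder(score, base_scores):
    """Fractional rank of `score` on a base-year ladder of scores.

    `base_scores` must be sorted from best to worst, so its first entry
    holds rank 1. A score that falls between two rungs gets a rank
    interpolated linearly between them. (Hypothetical helper; ties in the
    ladder resolve to the first tied rung, which is an assumption.)
    """
    if score >= base_scores[0]:
        return 1.0
    for i in range(1, len(base_scores)):
        upper, lower = base_scores[i - 1], base_scores[i]
        # Here score < upper is already known, so upper > lower whenever
        # the test below succeeds and the division is safe.
        if score >= lower:
            return i + (upper - score) / (upper - lower)
    # Below every rung shown: report the last rank on this (partial) ladder.
    return float(len(base_scores))


# Rescaled 2016 scores for ranks 1 to 16, copied from the table.
LADDER_2016_TOP16 = [
    100.00, 97.47, 97.01, 96.79, 95.98, 95.64, 95.41, 93.92,
    93.69, 93.34, 92.42, 92.31, 91.96, 91.62, 91.62, 91.27,
]

# Germany's rescaled 2020 score of 91.50 lands between ranks 15 and 16.
germany = rank_on_base_ladder(91.50, LADDER_2016_TOP16)
print(round(germany, 1))  # prints 15.3
```

With the same ladder, Singapore's 2020 score of 98.97 comes out at about 1.4 and Georgia's 96.10 at about 4.9, matching the middle column of the table.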

This now provides an absolute measure of country performance over time.  It is a comparison to where they would have ranked in 2016 had they had then what their policies later were.  This will then provide a more accurate account of what has been happening in the business environment in the country.  Take the case of Germany, for example.  It ranked #16 in 2016 and then lower at #22 in 2020.  Many would interpret this as a business environment that had deteriorated over the period.  But that is in fact not the case.  The absolute score was 91.27 in 2016 and a slightly improved 91.50 in 2020.  The business environment became slightly better (as assessed) between those two years, and it would have moved up a bit to a “rank” of 15.3 if it had had in 2016 the business environment it had in 2020.  But because a number of countries below Germany in 2016 saw a larger improvement in their business environment by 2020 than that of Germany, the relative ranking of Germany fell to #22.

Some of the differences could be large.  El Salvador, for example, ranked #80 in 2016 and fell to #91 in the traditional (relative) ranking for 2020.  But its business environment in fact improved significantly over the period, where the business environment it had in 2020 would have placed it at a ranking of 70.0 in 2016.  That is, it would have seen an improvement by 10 positions between the two periods rather than a deterioration of 11.  But other countries moved around as well, changing the relative ranking even though in absolute terms the business environment in El Salvador became significantly better.

D.  The Politics

One can understand how politicians in some country could grow frustrated if, after a period where they pushed through substantial (and possibly politically costly) reforms that aimed to improve their business environment, they then found that their Doing Business ranking nonetheless declined.  One might try to explain that other countries were also changing, and that despite their improvement other countries improved even more and hence pushed them down in rank.  This can follow when the focus is on relative rankings.  But this is not likely to be very convincing, as what the politicians see reported in the news media is the relative rankings.

A calculation of how the ratings would have changed in absolute terms would not suffer from this problem.  Countries would be credited for what they accomplished, not based on how what they did compares to what other countries might have done.  If all countries improved, then one would see that reflected when an absolute scale is used.  But when the focus is on relative rankings, any move up in rank must be matched by someone else moving down.  It is a zero-sum game.
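A toy example makes the zero-sum point concrete.  The three countries and their scores below are entirely made up for illustration (they are not Doing Business data), and the helper `ranks` is my own.  Every country improves in absolute terms, yet one of them still drops in rank, and the sum of the ranks cannot change.

```python
def ranks(scores):
    """Map each country name to its rank (1 = best) by descending score."""
    ordered = sorted(scores, key=scores.get, reverse=True)
    return {name: position for position, name in enumerate(ordered, start=1)}


# Hypothetical scores for three made-up countries; all three improve.
before = {"A": 70.0, "B": 68.0, "C": 60.0}
after = {"A": 72.0, "B": 75.0, "C": 66.0}

rank_before = ranks(before)
rank_after = ranks(after)

# Absolute view: every country's score went up.
all_improved = all(after[name] > before[name] for name in before)

# Relative view: A falls from 1st to 2nd despite its higher score,
# and the ranks always add up to the same total.
print(all_improved, rank_before["A"], rank_after["A"])  # prints "True 1 2"
print(sum(rank_before.values()), sum(rank_after.values()))  # prints "6 6"
```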

Such a change in approach might have made the Doing Business reports politically more palatable.  It might also have reduced the pressure from countries such as China.  Throughout the period of 2016 to 2020, China was taking actions which, in terms of the Doing Business measures, were leading to consistently higher absolute scores each year, including for 2018.  With an absolute scale, such as that proposed here, one would have seen those year-by-year improvements in China’s position relative to where it would have been in a base year ranking (2016 for example).  The improvements were relatively modest in 2017 and 2018, and then much more substantial in 2019 and 2020.  But despite that (modest) improvement in 2018, China’s relative ranking in that year would have fallen 7 places to #85 from the #78 rank in the published 2017 report.  China found this disconcerting, and the WilmerHale report describes how Bank staff were then pressured to come up with a way to keep China’s (relative) ranking no worse than the #78 position it had in 2017.  And they then did.

At this point, however, a move to a presentation in terms of an absolute scale is too late for the Doing Business project as it has operated up to now.  Following the release of the WilmerHale report, the World Bank announced on September 16, 2021, that it would no longer produce the Doing Business report at all.  Whatever it might do next, if anything, will likely be very different.

 

Columns, in order: Country; 2016 Rank; 2020 Rank; Rank with 2020 policy but 2016 ladder; 2016 score as % of 2016 best; 2020 score as % of 2016 best
New Zealand 1 1 1.1 100.00 99.66
Singapore 2 2 1.4 97.47 98.97
Denmark 3 3 1.8 97.01 97.93
Hong Kong, China 4 4 1.8 96.79 97.93
United States 5 5 4.4 95.98 96.44
United Kingdom 6 8 5.3 95.64 95.87
Korea, Rep. 7 6 4.4 95.41 96.44
Norway 8 9 7.4 93.92 94.83
Sweden 9 10 7.8 93.69 94.14
Taiwan, China 10 15 10.5 93.34 92.88
Estonia 11 18 10.9 92.42 92.54
Australia 12 14 10.1 92.31 93.23
Finland 13 20 12.7 91.96 92.08
Canada 14 23 15.7 91.62 91.39
Ireland 15 24 15.7 91.62 91.39
Germany 16 22 15.3 91.27 91.50
Latvia 17 19 12.3 90.82 92.19
Iceland 18 26 19.0 90.70 90.70
Lithuania 19 11 9.0 90.70 93.69
Austria 20 27 20.5 90.47 90.36
Malaysia 21 12 9.3 90.24 93.57
Georgia 22 7 4.9 89.90 96.10
North Macedonia 23 17 10.8 89.44 92.65
Japan 24 29 22.8 88.98 89.55
Poland 25 40 27.0 88.29 87.72
Portugal 26 37 25.8 87.72 87.83
Switzerland 27 36 25.6 87.72 87.94
United Arab Emirates 28 16 10.5 87.60 92.88
Czech Republic 29 41 28.0 87.37 87.60
France 30 32 25.2 87.37 88.17
Mauritius 31 13 9.3 87.26 93.57
Spain 32 30 23.0 87.14 89.44
Netherlands 33 42 30.0 86.68 87.37
Slovak Republic 34 45 32.7 85.88 86.80
Slovenia 35 38 25.8 85.76 87.83
Russian Federation 36 28 22.2 85.07 89.78
Israel 37 34 25.4 83.81 88.06
Romania 38 55 36.7 83.47 84.16
Bulgaria 39 61 41.0 83.24 82.66
Belgium 40 46 33.7 83.12 86.11
Cyprus 41 52 36.6 82.66 84.27
Thailand 42 21 13.0 82.55 91.96
Italy 43 58 37.3 82.32 83.70
Mexico 44 60 40.0 82.20 83.12
Croatia 45 51 36.5 81.97 84.50
Moldova 46 48 35.5 81.97 85.42
Chile 47 59 38.5 81.75 83.35
Hungary 48 53 36.6 81.63 84.27
Kazakhstan 49 25 15.7 81.40 91.39
Montenegro 50 50 36.3 81.06 84.73
Serbia 51 44 32.5 80.37 86.91
Luxembourg 52 72 51.5 79.45 79.91
Armenia 53 47 35.3 79.33 85.53
Turkey 54 33 25.2 79.33 88.17
Colombia 55 65 50.8 79.10 80.48
Belarus 56 49 35.7 78.87 85.30
Puerto Rico 57 66 50.8 78.87 80.48
Costa Rica 58 74 52.0 77.73 79.45
Morocco 59 54 36.6 77.38 84.27
Peru 60 76 57.0 77.15 78.87
Rwanda 61 39 25.8 77.04 87.83
Greece 62 79 57.3 76.81 78.53
Bahrain 63 43 31.0 76.46 87.26
Qatar 64 77 57.0 76.35 78.87
Oman 65 68 51.0 76.12 80.37
Jamaica 66 71 51.4 76.00 80.02
South Africa 67 84 61.5 76.00 76.92
Botswana 68 87 67.0 75.20 76.00
Azerbaijan 69 35 25.4 75.09 88.06
Mongolia 70 80 57.9 74.97 77.84
Bhutan 71 89 67.3 74.51 75.77
Panama 72 86 63.0 74.28 76.46
Tunisia 73 78 57.0 74.17 78.87
Ukraine 74 64 50.7 73.71 80.60
Bosnia and Herzegovina 75 90 69.0 73.59 75.09
St. Lucia 76 93 75.7 72.90 73.13
Kosovo 77 56 36.8 72.33 84.04
Fiji 78 101 86.0 72.10 70.61
Vietnam 79 70 51.3 71.87 80.14
El Salvador 80 91 70.0 71.64 74.97
China 81 31 24.3 71.53 88.75
Malta 82 88 67.1 71.53 75.89
Indonesia 83 73 51.5 71.30 79.91
Guatemala 84 96 80.0 70.84 71.87
Uzbekistan 85 69 51.1 70.84 80.25
Dominica 86 111 92.0 70.61 69.46
San Marino 87 92 74.0 70.49 73.71
Trinidad and Tobago 88 105 90.0 70.49 70.38
Kyrgyz Republic 89 81 57.9 70.38 77.84
Tonga 90 103 88.0 70.38 70.49
Kuwait 91 83 59.0 69.69 77.38
Zambia 92 85 62.0 69.46 76.81
Samoa 93 98 83.0 69.12 71.30
Uruguay 94 102 86.0 69.12 70.61
Namibia 95 104 88.0 68.89 70.49
Nepal 96 94 76.7 68.54 72.56
Vanuatu 97 107 90.3 68.08 70.15
Antigua and Barbuda 98 113 92.7 67.97 69.23
Saudi Arabia 99 62 44.0 67.97 82.20
Sri Lanka 100 99 83.8 67.97 70.95
Seychelles 101 100 85.0 67.74 70.84
Philippines 102 95 79.0 66.82 72.10
Albania 103 82 58.0 66.70 77.73
Paraguay 104 124 100.5 66.70 67.85
Kenya 105 57 36.8 66.59 84.04
Dominican Republic 106 115 95.0 66.48 68.89
Barbados 107 128 106.0 66.25 66.48
Brunei Darussalam 108 67 50.8 66.02 80.48
Bahamas, The 109 119 95.3 65.56 68.77
Ecuador 110 129 107.0 65.56 66.25
St. Vincent and the Grenadines 111 130 111.0 65.56 65.56
Ghana 112 116 95.0 65.44 68.89
Eswatini 113 121 96.5 65.33 68.31
Argentina 114 126 101.0 65.10 67.74
Jordan 115 75 54.5 65.10 79.22
Uganda 116 117 95.0 64.98 68.89
Honduras 117 133 116.6 64.41 64.64
Papua New Guinea 118 120 95.7 64.29 68.66
Brazil 119 125 100.5 63.83 67.85
St. Kitts and Nevis 120 139 126.5 63.83 62.69
Belize 121 134 120.5 63.61 63.72
Iran, Islamic Rep. 122 127 101.6 63.61 67.16
Lesotho 123 122 96.8 63.38 68.20
Egypt, Arab Rep. 124 114 94.5 62.80 69.00
Lebanon 125 143 127.7 62.80 62.34
Solomon Islands 126 136 122.5 62.80 63.49
India 127 63 48.5 62.57 81.52
West Bank and Gaza 128 118 95.0 62.23 68.89
Nicaragua 129 142 127.3 62.11 62.46
Grenada 130 146 131.0 61.54 61.31
Cabo Verde 131 137 123.4 61.31 63.15
Palau 132 145 129.8 60.96 61.65
Cambodia 133 144 129.6 60.73 61.77
Mozambique 134 138 123.4 60.62 63.15
Maldives 135 147 131.3 60.16 61.19
Tajikistan 136 106 90.0 59.47 70.38
Guyana 137 135 120.5 58.55 63.72
Burkina Faso 138 151 136.5 58.21 59.01
Marshall Islands 139 153 137.3 58.09 58.44
Pakistan 140 108 90.5 57.86 70.03
Côte d’Ivoire 141 110 91.0 57.75 69.69
Mali 142 148 133.0 57.75 60.73
Bolivia 143 150 136.1 57.18 59.36
Malawi 144 109 90.7 57.06 69.92
Tanzania 145 140 127.0 57.06 62.57
Senegal 146 123 97.0 56.95 68.08
Benin 147 149 135.0 55.91 60.16
Nigeria 148 131 113.0 55.57 65.33
Lao PDR 149 154 137.7 55.34 58.32
Micronesia, Fed. Sts. 150 158 150.0 55.22 55.22
Zimbabwe 151 141 127.0 54.88 62.57
Sierra Leone 152 162 151.3 53.85 54.54
Togo 153 97 82.0 53.85 71.53
Suriname 154 163 151.3 53.27 54.54
Niger 155 132 113.5 53.16 65.21
Comoros 156 160 150.7 53.04 54.99
Gambia, The 157 155 142.0 53.04 57.75
Burundi 158 165 153.2 52.35 53.73
Sudan 159 171 160.5 52.24 51.44
Kiribati 160 164 153.0 52.12 53.85
Djibouti 161 112 92.0 50.86 69.46
Guinea 162 156 146.2 50.86 56.72
Algeria 163 157 147.3 50.75 55.80
Gabon 164 168 160.4 50.52 51.66
Ethiopia 165 159 150.3 50.29 55.11
Mauritania 166 152 136.9 50.29 58.67
São Tomé and Principe 167 169 160.4 50.29 51.66
Syrian Arab Republic 168 176 170.5 49.37 48.22
Iraq 169 172 160.6 49.25 51.32
Myanmar 170 166 153.2 48.34 53.73
Cameroon 171 167 157.2 48.11 52.93
Madagascar 172 161 151.1 48.11 54.76
Bangladesh 173 170 160.4 46.96 51.66
Guinea-Bissau 174 174 167.8 46.38 49.60
Equatorial Guinea 175 178 172.8 45.92 47.19
Liberia 176 175 167.8 45.92 49.60
Afghanistan 177 173 163.5 45.12 50.63
Timor-Leste 178 181 176.9 45.12 45.24
Congo, Rep. 179 180 176.7 44.66 45.35
Yemen, Rep. 180 187 188.0 44.09 36.51
Haiti 181 179 173.4 43.28 46.73
Angola 182 177 172.6 43.17 47.42
Chad 183 182 182.3 40.53 42.37
Congo, Dem. Rep. 184 183 182.6 39.49 41.56
Venezuela, RB 185 188 188.2 39.15 34.67
South Sudan 186 185 183.8 37.66 39.72
Libya 187 186 186.5 37.43 37.54
Central African Republic 188 184 182.9 36.74 40.87
Eritrea 189 189 188.9 24.00 24.80
Somalia 190 190 190.0 23.19 22.96

 

The Economics of Rocket and Spacecraft Development: What Followed From Obama’s Push for Competition

A.  Introduction

The public letter was scathing, and deliberately so.  Made available to the news media in April 2010 just as President Obama was preparing to deliver a major speech on his administration’s strategy to put the US space program back on track, the letter bluntly asserted that the new approach would be “devastating”.  Signed by former astronauts Neil Armstrong (Commander of Apollo 11, and the first man to walk on the moon), Jim Lovell (Commander of the ill-fated Apollo 13 mission), and Eugene Cernan (Commander of Apollo 17, and up to now the last man to walk on the moon), the letter said that reliance on commercially contracted entities to carry astronauts to orbit “destines our nation to become one of second or even third rate stature”.  The three concluded that under such a strategy, “the USA is far too likely to be on a long downhill slide to mediocrity”.

What was the cause of this dramatic concern?  Upon taking office in January 2009, the Obama administration concluded that a thorough review was needed of NASA’s human spaceflight program.  Years earlier, following the breakup of the Columbia Space Shuttle as it tried to return from orbit – with the death of all on board – the Bush administration had decided that the Space Shuttle was not only expensive but also fundamentally unsafe to fly.  And because of the Shuttle’s cost, NASA did not have the funds to develop alternatives while still flying it.  The Bush administration therefore decided to retire the then remaining Space Shuttles by 2010.  The Obama administration later added two more Space Shuttle flights to allow the completion of the International Space Station (ISS), but the final Space Shuttle flight was in 2011.  The Bush administration plan was that the funds saved by ending the Space Shuttle flights would be used to develop what they named the Constellation program.  Under Constellation, two new space boosters would be developed – Ares I to launch astronauts to the ISS in low earth orbit and Ares V to launch astronauts to the moon and possibly beyond.  A new spacecraft, named Orion, to carry astronauts on these missions would also be developed.

To fund Constellation, the Bush administration plan was also to decommission the ISS in 2015, just five years after it would be completed.  Work on the ISS had begun in 1985 (when Reagan was president), the first flight to start its assembly was in 1998, and assembly was then expected to be completed in 2010 (in the end it was in 2011).  The total cost (as of 2010) had come to $150 billion.  But in order to fund Constellation, the Bush administration plan was to shut down the ISS just five years later, and then de-orbit it for safety reasons to burn it up in the atmosphere.

The Obama administration convened a high-level panel to review these plans.  Chaired by Norman Augustine, the former CEO of Lockheed Martin, the committee (commonly referred to as the Augustine Commission) issued its report in October 2009.  They concluded that the Constellation program was simply not viable.  Their opening line in the Executive Summary read “The U.S. human spaceflight program appears to be on an unsustainable trajectory.”  Mission plans (including the time frames) were simply unachievable given the available and foreseeable budgets.  There would instead be billions of dollars spent but with the intended goals not achieved for decades, if ever.  A particularly glaring example of the internal inconsistencies and indeed absurdities was that the Ares I rocket, being developed to ferry crew to the ISS, would not see its first flight before 2016 at the earliest.  Yet the ISS would have been decommissioned and de-orbited by then.

The Augustine Commission recommended instead to shift to contracting with private entities to ferry astronauts to orbit.  Such a program for the ferrying of cargo supplies to the ISS had begun during the Bush administration.  By 2009 this program was already well underway, and the first such flight, by SpaceX using its Falcon 9 rocket, was successfully completed in May 2012.  The commission also recommended that work be done to develop the technologies that could be used to determine how a new heavy-lift launch vehicle should best be designed.  For example, would it be possible to refuel vehicles in orbit?  If so, the overall size of the booster could be quite different, as there would no longer be a need to lift both the spacecraft and the fuel to send it on to the Moon or to Mars or to wherever, all on one launch.  And the commission then laid out a series of options for exploration that could be done with a new heavy-lift rocket (whether a new version of the Ares V or something else), including to the Moon, to Mars, to asteroids, and other possibilities.  It also recommended that the life of the ISS be extended at least to 2020.

And it was not just the Augustine Commission expressing these concerns.  Earlier, in a report issued in August 2009, the GAO stated that “NASA is still struggling to develop a solid business case … needed to justify moving the Constellation program forward into the implementation stage”.  It also noted that NASA itself, in an internal review in December 2008 (i.e. before Obama was inaugurated) had “determined that the current Constellation program was high risk and unachievable within the current budget and schedule”.  The GAO also noted that Ares I was facing important technical challenges as well (including from excessive vibration and from its long narrow design, where there was concern this might cause it to drift into the launch tower when taking off).  While it might well be possible to resolve these and other such technical challenges given sufficient extra time and sufficient extra money, it would require that extra time and extra money.

President Obama’s strategy, as he laid out in a speech at the Kennedy Space Center on April 15, 2010 (but which was already reflected in his FY2011 budget proposals that had been released in February), was built on the recommendations of the Augustine Commission.  The proposal that received the most attention was that to end the Ares I program and to contract instead with competing commercial providers to ferry crews to the ISS.  And rather than continue on the Ares V launch vehicle (on which only $95 million had been spent by that point, in contrast to $4.6 billion on Ares I), the proposal was first to spend significant funds (more than $3 billion over five years) to develop and test relevant new technologies (such as in-orbit refueling) to confirm feasibility before designing a new heavy-lift launch vehicle.  That design would then be finalized no later than 2015.  Third, work would continue on the Orion spacecraft, but with a focus on its role to carry astronauts beyond Earth orbit, as well as to serve as a rescue vehicle should one be needed in an emergency for the ISS.  Fourth, the life of the ISS itself would be extended to at least 2020 from the Bush plan to close and destroy it in 2015.  And fifth, Obama proposed that the overall NASA budget be increased by $6 billion over five years over what had earlier been set.

While the proposal was well received by some, there were also those who were vociferously opposed – Armstrong, Lovell, and Cernan, for example, in the letter quoted at the top of this post.  But perhaps the strongest, and most relevant, opposition came from certain members of Congress.  Congress would need to approve the new strategy and then back it with funding.  Yet several key members of Congress, with positions on the committees that would need to approve the new plans and budgets, were strongly opposed.  Indeed, this opposition was already being articulated in late 2009 and early 2010 as the direction the Obama administration was taking (following the issuance of the Augustine Commission report) was becoming clear.

Perhaps most prominent in opposition was Senator Richard Shelby of Alabama, who repeatedly spoke disparagingly of the commercial competitors (meaning SpaceX primarily) who would be contracted to ferry astronauts to the ISS.  In a January 29, 2010, statement, for example (released just before the FY2011 budget proposals of the Obama administration were to be issued), Shelby asserted “China, India, and Russia will be putting humans in space while we wait on commercial hobbyists to actually back up their grand promises”.  Shelby called it “a welfare program for amateur rocket companies with little or nothing to show for the taxpayer dollars they have already squandered”.

Shelby was not alone.  Other senators and congressmen were also critical.  Most, although not all, were Republicans, and one might question why those who on other occasions would articulate a strong free-market position would on this issue argue for what was in essence a socialist approach.  The answer is that under the traditional NASA process, much of the taxpayer funds that would be spent (many billions of dollars) would be spent on federal facilities and on contractors in their states or congressional districts.  The Marshall Space Flight Center in Huntsville, Alabama, was the lead NASA facility for the development of the Ares I and Ares V rockets, and Senator Shelby of Alabama was proud of the NASA money he had directed to be spent there.  Senators and congressmen from other states with the main NASA centers involved or with the major contractors (Texas, Florida, Mississippi, Louisiana, Utah) were also highly critical of the Obama initiative to introduce private competition.

The outcome, as reflected in the NASA Authorization Act of 2010 (passed in October 2010) and then in the FY2011 budget passed in December, was a compromise.  The administration was directed basically to do both.  The legislation required that a new heavy-lift rocket be designed immediately, with the key elements similar to and taken from the Ares V design (and hence employ the same contractors as for the Ares V).  It was eventually named the Space Launch System (or SLS – or as wags sometimes called it “the Senate Launch System” since the key design specifics were spelled out and mandated in the legislation drafted in the Senate).  It also directed that the cost should be no more than $11.5 billion and that it would be in operation no later than the end of 2016.  (As we will discuss below, the SLS has yet to fly and is unlikely to before 2022 at the earliest, and over $32 billion will have been spent on it before it is operational.)

The compromise also allowed the administration to proceed with the development of commercial contracts to ferry astronauts to the ISS, but with just $307 million allocated in FY2011 rather than the $500 million requested.  For FY2011 to FY2015, only $2,725 million of funding was eventually approved by Congress, or less than half of the $5,800 million originally requested by the Obama administration in 2010 for the program.  As a result, the commercial crew program, as it was called, was delayed by several years.  The first substantial contracts (aside from smaller amounts awarded earlier to various contractors to develop some of the technologies that would be used) were signed only in August 2012.  At that time, $440 million was awarded to SpaceX, $460 million to Boeing, and $212.5 million to Sierra Nevada Corporation, to develop the specifics of their competing proposals to ferry astronauts to the ISS.
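To make the scale of the funding shortfall concrete, the short sketch below works out the share of each request that was approved.  It uses only the dollar figures quoted above (in millions); the variable and function names are my own, chosen purely for illustration.

```python
# Commercial crew funding in millions of US dollars, as quoted in the text.
requested_fy2011 = 500
approved_fy2011 = 307
requested_fy2011_to_fy2015 = 5_800
approved_fy2011_to_fy2015 = 2_725


def approved_share(approved, requested):
    """Return the approved amount as a fraction of the amount requested."""
    return approved / requested


fy2011_share = approved_share(approved_fy2011, requested_fy2011)
five_year_share = approved_share(approved_fy2011_to_fy2015,
                                 requested_fy2011_to_fy2015)

print(f"FY2011: {fy2011_share:.0%} of the request was approved")
print(f"FY2011 to FY2015: {five_year_share:.0%} of the request was approved")
```

On these figures, roughly three-fifths of the first-year request and a bit under half of the five-year request were funded.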

The primary contracts were then awarded to SpaceX and to Boeing in September 2014.  NASA agreed to pay SpaceX a fixed total of $2,600 million, and Boeing what was supposed to be a fixed total of $4,200 million (but with an additional $287.2 million added later, when Boeing said they needed more money).  Each of these contracts would cover the full costs of developing a new spacecraft (Crew Dragon for SpaceX and Starliner for Boeing) and then of flying them on the rockets of their choice (Falcon 9 for SpaceX and the Atlas V for Boeing) to the ISS for six operational missions (with an expected crew of four on each, although the capsules could hold up to seven).  The contracts would cover not only the cost of the rockets used, but also the costs of an unmanned test flight to the ISS and then a manned test flight to the ISS with a crew of two or more.  If successful, the six operational missions would then follow.
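A simple way to compare the two awards is to add the later supplement to the Boeing contract and take the ratio to the SpaceX contract.  The sketch below does only that, using the figures in the paragraph above (in millions of dollars).  The names are illustrative, and the totals cover development and test flights as well as the six operational missions, so this should not be read as a per-mission price.

```python
# Primary commercial crew contracts of September 2014, in millions of US dollars.
spacex_contract = 2_600.0
boeing_contract_initial = 4_200.0
boeing_later_addition = 287.2  # added later at Boeing's request

boeing_contract_total = boeing_contract_initial + boeing_later_addition
boeing_to_spacex = boeing_contract_total / spacex_contract

print(f"Boeing total: ${boeing_contract_total:,.1f} million")
print(f"Boeing award relative to SpaceX: {boeing_to_spacex:.2f} times")
```

Boeing's award, with the supplement, comes to a bit more than 1.7 times the SpaceX award for the same scope of work.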

We now know what has transpired in terms of missions launched.  While the SLS is still to fly on even a first test mission (the current schedule is for no earlier than November 2021, but many expect it will be later), SpaceX successfully carried out an unmanned test of its spacecraft (Crew Dragon) in a launch and docking with the ISS in March 2019, a successful test launch with a crew of two to the ISS in May 2020, an operational launch with a NASA crew of four to the ISS in November 2020, and a second operational launch with a NASA crew of four to the ISS in April 2021 (where I have included under “NASA” crews from space agencies of other nations working with NASA).  The April 2021 mission also reused both a Crew Dragon capsule from an earlier mission (the one used on the two-man test flight in May 2020) and a previously used first stage booster for the Falcon 9 rocket.  Previously, out of caution, NASA would only allow a new Falcon 9 booster to be used on these manned flights – not one that had been flown before.  They have now determined that the reused Falcon 9 boosters are just as safe.

As I write this, the plans are for a third operational flight of the Crew Dragon, again carrying four NASA astronauts to the ISS, in late October or November 2021.  And as I am writing this, SpaceX has just launched (on September 15) a private, all-civilian, crew of four for a three-day flight in earth orbit.  They are scheduled to return on September 18.  And there may be a second such completely commercial flight later in 2021.

Boeing, in contrast, has not yet been successful.  While Boeing was seen as the safe, traditional, contractor (in contrast to the “amateur hobbyists” of SpaceX), and received substantially higher funding than SpaceX did for the same number of missions, its first unmanned test launch, in December 2019, failed.  The upper stage of the rocket burned for too long due to a software issue, and the spacecraft ended up in the wrong orbit.  While they were still able to bring the spacecraft back to earth, later investigations found that there were a number of additional, possibly catastrophic, software problems.  After a full investigation, NASA called for 61 corrective actions, a number of them serious, to be taken before the spacecraft is flown again.

As I write this, there have been further delays with the Boeing Starliner.  After several earlier delays, a re-run of the unmanned test mission of the capsule was scheduled to fly on July 30, 2021.  However, on July 29, a newly arrived Russian module attached to the ISS began to fire its thrusters due to a software error, causing the ISS to start to spin.  While it was soon brought under control, the decision was made to postpone the flight test of the Boeing Starliner by a few days, to August 3, to allow time for checks to the ISS to make sure there was no serious damage from the Russian module mishap.  But then, in the countdown on August 3, problems were discovered in the Starliner’s control thrusters.  Many of the valves were stuck.  On August 13, the decision was made to take down the capsule from the booster rocket, return it to a nearby facility, confirm the cause of the problem (it appears that Teflon seals failed), and fix it.  There will now be a delay of at least two months, and possibly into 2022.

Thus the unmanned test flight of the Boeing Starliner will only be flown at least two and a half years (and possibly three years, or more) after the successful unmanned test flight of the SpaceX Crew Dragon capsule in March 2019.  And as noted before, Boeing was supposed to be the safe choice of a traditional defense and space contractor, in contrast to the hobbyists at SpaceX.

While flight success is, in the end, the most important and easy to observe metric, also important is how much these alternative approaches cost.  That will be the focus of this post.  The cost differences are huge.  While not always easy to measure (this will be discussed below), the differences in the costs between the traditional NASA contracting and the more commercial contracts that paid for the services delivered are so large that any uncertainty in the cost figures is swamped by the magnitude of the estimated differences.

We will first look at the costs of developing and flying the principal heavy-lift rockets now operational in the US.  While they have different capabilities, which I fully acknowledge, the differences in the costs cannot be attributed just to that.  We will then look at the costs of developing and flying the three capsule spacecraft we now have (or will soon have) in the US:  the SpaceX Crew Dragon, the Boeing Starliner (more properly, the Boeing CST-100 Starliner), and the Orion being built under contract to NASA by Lockheed Martin.  The differences in capabilities here are also significant, but one cannot attribute the huge cost differences just to that.

This blog post is relatively long, with a good deal of discussion on the underlying basis of the estimates for the various figures as well as on the capabilities (and comparability) of the various rockets and spacecraft reviewed.  For those not terribly interested in such aspects of the US space program, the basic message of the post can be seen simply by focussing on the charts.  They are easy to find.  And the message is that NASA contracting on the commercial basis that the Obama administration proposed for the carrying of crew to the ISS (and which the Bush administration had previously initiated for the carrying of cargo to the ISS) has been a tremendous success.  SpaceX is now routinely delivering both cargo and crews to orbit, and at a cost that is a small fraction of what is found with the traditional NASA approach.  One sees this in both the development and operational costs, and the differences are so large that one cannot attribute this simply to differing missions and capabilities.

B.  The Rockets Reviewed and Their History

The chart at the top of this post shows the cost per kilogram to launch a payload to low earth orbit by the primary heavy-lift launch vehicles currently being used (or soon to be used) in the US.  This is only for the cost of an additional rocket launch – what economists call the marginal cost.  The cost to develop the rocket itself is not included here, as that cost is fixed and largely the same whether there is only one launch of the vehicle or many.  We will look at those development costs separately in the discussion below.

To get to the cost per kilogram, one must start with what each rocket is capable of carrying to low earth orbit and then couple this with the (marginal) cost of an additional launch.  We will review all that below.  But first a note on the data underlying these figures.

For a number of reasons, comparable data on the costs and even the maximum lift capacities of these various rockets are not readily available.  One has to use a wide range of sources.  Among the primary ones I used (for both the payload capacities and the costs of the rockets discussed in this and in the following sections, as well as for the costs for the spacecraft discussed further below), one may look here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, here, and here.

There will, however, be issues with the precision of any such estimates, in particular for the costs.  For a number of reasons, such comparisons (again especially of the costs) are difficult to make.  Several of those reasons are discussed in an annex at the end of this post.  Due to the difficulties in making such comparisons, differences in costs per kilogram of payload lifted to orbit of 10 or 20% certainly, but also even of 40 or 50%, should not be viewed as necessarily significant.  However, we will see that the differences in costs between developing and launching rockets and spacecraft with the traditional NASA approach and the approach based on competition that Obama introduced to manned space flight are far greater than this.  Indeed, we will see that the costs are several times higher, and often even an order of magnitude or more higher.  Differences of such magnitude are certainly significant.

To start, rockets differ in capabilities, and one must adjust for that.  The most important measure is lifting capacity – how many kilograms of payload can be carried into orbit:

The rockets to be examined here are limited to US vehicles (hence none of those from China, Russia, Europe, and elsewhere) and to heavier boosters sizeable enough to carry manned vehicles.  Ares I is included even though it flew on only one test flight (and only a partial one at that) before its development was ended, in order to show how its capacity would have compared to other launchers.  It would be similar in size to the alternatives.  But its (incomplete) development costs were already more than an order of magnitude higher than that of the Falcon 9, as we will discuss below.

The other boosters to be examined here are the Falcon 9 (of which there are two versions – with the first stage booster either expended or recovered), the Atlas V and Delta IV Heavy (both made by the United Launch Alliance – a 50/50 joint venture of Boeing and Lockheed Martin that, when formed in 2006, had a monopoly on heavy launch vehicles in the US), the Space Launch System (still to be tested in its first launch), and the Falcon Heavy (of which there are also two versions, with the first stage boosters either expendable or recoverable).

As noted, the Falcon 9 can be flown in two versions – with the first stage booster either expended (allowed to fall into the ocean) or recovered.  Since the first stage of a rocket will normally be the most expensive part of a rocket, Elon Musk sought to develop a booster where the first stage could be recovered.  And he did.  (He also, for a time, sought to recover similarly the second stage of the Falcon 9, but ultimately abandoned this.  The cost of a second stage is less, so there is less benefit in recovering it, while the difficulty, and hence the cost, is greater.  He eventually concluded it was not worth it.)

The Falcon 9 first stage is recovered by flying it back either to the launch site or to a floating platform in the ocean, where it slows down and lands by re-igniting its engines.  The videos can be spectacular.  But this requires that a portion of the fuel be saved for the landing, and hence the maximum payload that can be carried is less.  However, the cost savings (discussed below) are such that the cost per kilogram to orbit will be lower.

I have not been able to find, however, a precise figure for what the payload penalty will be when the first stage is recovered.  SpaceX may be keeping this confidential.  SpaceX does provide, at its corporate website, a figure for the maximum payload on a Falcon 9.  It is 22,800 kilograms, but this has been interpreted to be what the payload will be when the first stage uses 100% of its fuel to launch the payload into orbit, with that booster then allowed to fall into the ocean.  But a figure for what the maximum payload can be when the booster is recovered is not provided.  The Wikipedia entry for the Falcon 9, for example, only provides a figure for what was a heavy load on an actual launch (with the booster recovered).  This does not mean this would be the maximum possible load.

For the calculations here, I therefore used the payload capacity figures for the Falcon Heavy, taking the ratio between the payload that can be carried in the fully recoverable version to the payload in the expended version.  Elon Musk has indicated that this payload penalty on the Falcon Heavy is about 10%.  Applying this ratio to the Falcon 9 full capacity figure of 22,800 kg, and rounding down to 20,000 kg, should be a reasonable estimate of the maximum payload on the recoverable version of the rocket, and close enough for the purposes here.  The ratios for the Falcon Heavy and the Falcon 9 should be similar, as the first stage of the Falcon Heavy is essentially three first stages of the Falcon 9 strapped together (with the second stage the same in each), and the fuel that would be needed to be saved to allow for the recoveries and landings of the first stage boosters should be similar.
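The arithmetic behind that estimate can be sketched as follows.  The 22,800 kg figure and the roughly 10% penalty are those cited above.  Rounding down to the nearest 1,000 kg is my assumption about how one gets from the computed figure to 20,000 kg, and the variable names are illustrative only.

```python
import math

# Figures cited in the text.
falcon9_expended_payload_kg = 22_800  # maximum payload with the booster expended
recovery_payload_penalty = 0.10       # approximate penalty cited for the Falcon Heavy

# Apply the Falcon Heavy penalty to the Falcon 9 (assumes the ratios are similar).
estimated_recoverable_kg = falcon9_expended_payload_kg * (1 - recovery_payload_penalty)

# Round down to the nearest 1,000 kg (an assumed rounding granularity).
rounded_recoverable_kg = math.floor(estimated_recoverable_kg / 1_000) * 1_000

print(f"Estimate before rounding: {estimated_recoverable_kg:,.0f} kg")
print(f"Rounded-down figure used: {rounded_recoverable_kg:,} kg")
```

The computed value is about 20,520 kg, so rounding down to 20,000 kg errs slightly on the side of understating what the recoverable Falcon 9 can carry.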

With this configuration where the Falcon Heavy is essentially three Falcon 9 first stage boosters strapped together, SpaceX was able to build an extremely large booster.  It is currently by far the largest operational such vehicle in the US stable, and indeed is currently the largest in the world.  The SLS will be larger if it becomes operational, but it is not at that point yet.  By building on the Falcon 9, the development costs of the Falcon Heavy were relatively modest, although Elon Musk has noted it turned out to be more complicated than they had at first thought it would be.  And with the three boosters that make up the first stage of Falcon Heavy similarly recoverable, one has even more spectacular videos of pairs of the boosters landing together back at the launch site (the third is recovered on a barge in mid-ocean). There is some penalty in the maximum payload weight that can be carried (about 10% as noted above), but the cost savings far exceed this (discussed below), leading to a cost per kilogram of payload that is almost a third less than when these first-stage boosters are not recovered.

The Atlas V and the Delta IV Heavy are both produced by the United Launch Alliance, the 50/50 joint venture of Boeing and Lockheed.  Its creation in 2006 by bringing together into one company the sole two providers in the US at that time of large launch vehicles was questioned by many.  The first launch of the Falcon 9 came only later, in 2010.  But the primary customer was and is the US Department of Defense, and they approved it (and may indeed have encouraged it) as their primary concern was to preserve an assured ability to fly their payloads into orbit rather than the cost of doing so.

The Atlas series of vehicles were brought to the ULA from Lockheed (via a string of corporate mergers – the Atlas rocket was first developed by General Dynamics), and have a long history stretching back to the initial Mercury orbital launches (of Glenn in 1962) and indeed even before that.  The models have of course changed completely over the years, and importantly since the year 2000 with the first launch of the Atlas III model which used Russian-made RD-180 engines.  The RD-180 engines are now being phased out for national security reasons, but the planned follow on rocket (named the Vulcan, or more properly the Vulcan Centaur), has yet to fly.  The Vulcan will use engines made by Blue Origin, and there have been delays in getting those engines delivered for the initial test flights.

There are ten different models of the Atlas V that have flown, and several more were available if a customer was interested.  For the charts here, version #551 of the Atlas V has been used, as it has the heaviest lift capacity of the various versions and has flown at least ten times (as I write this in September 2021).

The Delta rocket also has a long history, with variants dating back to 1960.  It was originally built by Douglas Aircraft, which after a merger became McDonnell-Douglas, which was later acquired by Boeing.  Boeing then brought the Delta to the United Launch Alliance.  There are also several variants of Delta rockets that have been available, but the Delta IV Heavy version will be used in the charts here as it can carry the heaviest payload among them.  Until the first launch of the Falcon Heavy in 2018, the Delta IV Heavy had the greatest lift capacity of any rocket in the US stable.  But as one can see in the chart, the payload capacity of the Falcon Heavy is double that of the Delta IV Heavy.

The Space Launch System (SLS) dates to 2011, when the basic design was announced by NASA.  Key design requirements had been set, however, in congressional legislation drafted originally in the Senate and incorporated into the NASA Authorization Act of 2010 that was signed into law in October 2010.  As was discussed above, NASA was instructed by Congress to develop the SLS and in doing so that it should use the rocket technology that had been developed for the Space Shuttle and which would have been used for the Ares V.  The Space Shuttle technology dates from the 1970s with a first flight of the Shuttle in 1981.  Its main rocket engines (three at the rear of the Orbiter) were the RS-25, which burns liquid hydrogen and oxygen.  The Shuttle also had two solid rocket boosters attached, each of four segments.

The SLS design, following the mandates of Congress, uses four RS-25 engines in its core first stage.  Two solid rocket boosters, of the same type as used for the Shuttle, are attached on the sides (although with five segments each rather than the four for the Shuttle).  The second stage of the SLS will use the RL10 liquid-fueled engine – a design that dates to the 1950s and first flew in 1959.  Indeed, for the initial (Block 1) model of the SLS, the second stage is in essence the second stage that has been used for some time on the Delta III and Delta IV boosters.

The SLS design shares many elements with the Ares V booster that would have been part of the Constellation program begun by the Bush administration.  The first stage booster of the Ares V would have had five RS-25 rockets in its core (versus four of the RS-25s in the SLS) and also with two of the strap-on solid rocket boosters from the Shuttle (but with 5.5 segments each instead of the five on the SLS).  While work on the Ares V never progressed far beyond its design, with NASA spending only $95 million on it before it was canceled, the SLS is very much based on the design of the Ares V and with similar ties to the Space Shuttle.

The engine technologies have of course evolved substantially over time, with upgrades and refinements as more was learned.  And using existing designs should certainly have saved both time and money.  But neither happened.  Congress directed that the SLS should be operational by 2016, and early NASA plans were for it to be flying by 2017, but as of this writing it has yet to have even a test flight.  As noted previously, the first test launch is currently scheduled for November 2021, but many expect this will be further delayed.  And as we will see below, despite the use of previously developed technology for most of the key components (in particular the rocket engines), the costs have been quite literally astronomical.

Finally, the Ares I booster is included here for comparison purposes.  Its first stage would have been the same solid rocket booster used for the Space Shuttle (but just one booster rather than two, and of five segments), while the second stage would have used a version of the J-2 liquid-fueled engine.  This engine was originally developed in the early 1960s for use in the upper stages of the Saturn IB and Saturn V rockets then being developed for the Apollo program.  There have been numerous upgrades since, of course, and some would say the J-2 version developed for the Ares I (named the J-2X) was close to a new design.

There was only one, partial, test flight of the Ares I before the program was canceled.  That flight, in October 2009, was of the first stage only (the solid rocket booster derived from the Space Shuttle), with just dummies of the second stage and payload to simulate the flight dynamics of that profile.  It reached a height (as planned) of less than 30 miles.  While deemed a “success” by NASA, the launch caused substantial unanticipated damage to the launch pad, plus the parachutes designed to return the first stage partially failed.

As will be discussed below, while the Ares I never became operational, the amount spent on its (partial) development had already far exceeded that of comparable rockets.  It was also facing substantial technical issues that could be catastrophic unless solved (including from excessive vibration and a concern that with its tall, thin, design it might drift into the launch tower on lift-off).  Finally, as noted before, the rocket’s mission would be to ferry astronauts to the ISS, yet under the Bush administration’s plan to abandon and de-orbit the ISS by 2015 (in order to free up NASA funds for the Constellation program), the first operational flight (as forecast in 2009) would not be until 2016.  Nonetheless, Obama’s decision to cancel the program was severely criticized.

C.  The Cost of Developing the Rockets

In considering the costs of any vehicle, including rockets and spacecraft, one should distinguish between the cost of developing the technology and the cost of using it.  Development costs are upfront and fixed, regardless of whether one then uses the rocket for one launch or many.  Operational costs per launch are then a measure of what it would cost for an additional launch – what economists call the marginal cost.

While the concepts are clear, the distinction can be difficult to estimate.  The costs may often be mixed, and one must then try to separate out what the costs of just the launches were from the cost of developing the system.  But reasonable estimates are in general possible.

To start with development costs:

First, in the case of Ares I all the costs incurred were development costs as there were no operational launches.  Figures on this are provided in the NASA budget documents for each fiscal year.  A total of $4.6 billion was spent on the program between fiscal years 2005 (when the program was launched) and 2010 (when it was canceled).

But at that point the program was far from operational.  The first operational flight was not going to be before 2016 at the earliest, and very likely later.  To make the comparison similar to the costs of other rocket programs (which have reached operational status), one should add an estimate of what the additional costs would have been to reach that same point.  But there is only a partial estimate of what those additional development costs would have been.  As is standard, the FY2010 NASA budget had five-year cost forecasts (i.e. for the next four years following the request for the upcoming fiscal year) for each of the budget line items, and at that point the forecast was that the Ares I program would cost an additional $8.1 billion in fiscal years 2011 through 2014.  Furthermore, this expected expense of over $2 billion per year would not be declining over time but in fact rising a bit, and would likely continue for several years more at a similar rate or higher until Ares I was operational.

Even leaving out what the additional development costs would have been beyond FY2014 (probably an additional $2 billion per year for several more years), the expected costs through FY2014 would have already been huge, at $12.7 billion.  This is incredibly high for what should have been a relatively simple rocket (based on components that were already well used), although we will see similarly high costs in the development of the SLS.  Why they are so high is difficult to understand, particularly as the Ares I is a booster whose first stage is simply one of the solid rocket boosters from the Shuttle program (and indeed initially physically taken from the excess stock of such boosters left over at the end of the Shuttle program, although modified with an extra segment added in the middle to bring it to five segments from four).  And the second stage was to be built around an upgraded model of the old J-2 engine.
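The $12.7 billion figure is simply the sum of what had been spent and what was forecast, as the sketch below shows.  It uses only the two budget figures cited above, expressed in millions of dollars to keep the arithmetic in whole numbers.  The variable names are my own.

```python
# Ares I development costs in millions of US dollars, from the NASA budget
# figures cited in the text.
ares_i_spent_fy2005_to_fy2010 = 4_600
ares_i_forecast_fy2011_to_fy2014 = 8_100
forecast_years = 4  # fiscal years 2011 through 2014

ares_i_total_through_fy2014 = (ares_i_spent_fy2005_to_fy2010
                               + ares_i_forecast_fy2011_to_fy2014)
average_forecast_per_year = ares_i_forecast_fy2011_to_fy2014 / forecast_years

print(f"Expected total through FY2014: ${ares_i_total_through_fy2014 / 1_000:.1f} billion")
print(f"Average forecast spending per year: ${average_forecast_per_year / 1_000:.2f} billion")
```

The forecast averages just over $2 billion per year, which is the basis for the per-year figure mentioned above.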

In sharp contrast to these costs for the Ares I, the development costs of the similarly sized Falcon 9 rocket as well as the far larger Falcon Heavy are tiny, at just $300 million and $500 million respectively.  Are those figures plausible?  Since SpaceX is not a publicly listed company, its financial statements are not published.  However, it does have funders (both banks and others providing loans, as well as those taking a private equity position) so financial information is made available to them.  While confidential, it often leaks out.  Plus there are public statements that Elon Musk and others have made.  And importantly, as a start-up founded in 2002, it was a small company without access to much in the way of funding in the period.  They could not have spent billions.

One should acknowledge, however, as Elon Musk repeatedly has, that NASA provided financial assistance at a critical point.  SpaceX, Tesla, and Elon Musk personally, were all running low on cash in 2006, were burning through it quickly, and would soon be out of funds.  Then, in late 2006, NASA awarded SpaceX a $278 million contract under its new COTS (Commercial Orbital Transportation Services) program, to be disbursed as identified milestones were reached.  SpaceX was among more than 20 competitors for funds under this program, with SpaceX and one other (Orbital Sciences, with its own launch vehicle and spacecraft) winning NASA support.  The funding to SpaceX was later raised to $396 million (with additional milestones added) and was used to support the development of the Falcon 9 rocket, of the original version of the Dragon capsule to ferry cargo to the ISS, and the cost then to fly three demonstration flights (later collapsed to two) showing that the systems worked.  The second (and final) demonstration mission was a fully successful launch in May 2012 of the Falcon 9 carrying the Dragon capsule with cargo for the ISS, which successfully docked with the ISS and later returned to earth.  Following this, NASA has contracted with SpaceX for a series of cargo resupply missions to the ISS under follow-on contracts (under CRS, for Commercial Resupply Services) where it is paid for each successful mission.  As of this writing, SpaceX is now at the 23rd flight under this program.

NASA funds were important.  But they were only partial and not large, at less than $400 million to support the development of the Falcon 9, the Dragon capsule for cargo, and the initial demonstration flights.  They are consistent with a cost of developing the Falcon 9 alone of about $300 million.

The specific figure of $300 million to develop the Falcon 9 comes from a statement Elon Musk made in May 2011 on SpaceX’s history to that point.  He wrote that total SpaceX expenditures up to that point had been “less than $800 million”, with “just over $300 million” for the development of the Falcon 9.  The rest was for the development of the Dragon spacecraft (used to deliver cargo to the ISS) for $300 million, the cost of developing and testing in five flights SpaceX’s initial rocket the Falcon 1 (which had a single Merlin engine newly developed by SpaceX – the Falcon 9 uses nine Merlin engines), the costs of building launch sites for the Falcon rockets at Cape Canaveral, Vandenberg, and Kwajalein in the Pacific, as well as the cost of building all the corporate manufacturing facilities for the Falcon rockets and the Dragon.  Musk noted that the financial accounts are confirmed by external auditors, as they would be for any sizeable firm.

Separately, in 2017 a Senior Vice President of SpaceX (Tim Hughes), in testimony to Congress, noted that the development cost of Falcon 9 had been $300 million and $90 million for the earlier Falcon 1 rocket, and that NASA had independently verified these figures (in the report here, as updated).

The $300 million cost estimate looks plausible.  Unlike NASA (as well as firms such as Boeing), as a new start-up SpaceX simply would not and did not have the funding to spend much more.  But even if it were several times this, it would still be far less than what the cost of the similarly sized Ares I had been.
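One can check how robust that conclusion is with a quick calculation.  The sketch below compares the Ares I figures with the $300 million Falcon 9 estimate, and then repeats the comparison with the Falcon 9 cost inflated by a multiplier.  The multiplier of three is purely a hypothetical stand-in for "several times" and is not a figure from any source.

```python
# Development costs in millions of US dollars, as cited in the text.
ares_i_spent = 4_600            # spent through FY2010, when the program was canceled
ares_i_through_fy2014 = 12_700  # spent plus forecast through FY2014
falcon9_development = 300

# Hypothetical multiplier, for illustration only: suppose the true Falcon 9
# cost were three times the published estimate.
hypothetical_multiplier = 3


def cost_ratio(cost_a, cost_b):
    """Return how many times larger cost_a is than cost_b."""
    return cost_a / cost_b


spent_ratio = cost_ratio(ares_i_spent, falcon9_development)
projected_ratio = cost_ratio(ares_i_through_fy2014, falcon9_development)

inflated_falcon9 = falcon9_development * hypothetical_multiplier
spent_ratio_inflated = cost_ratio(ares_i_spent, inflated_falcon9)
projected_ratio_inflated = cost_ratio(ares_i_through_fy2014, inflated_falcon9)

print(f"Ares I spent vs Falcon 9: {spent_ratio:.1f} times")
print(f"Ares I through FY2014 vs Falcon 9: {projected_ratio:.1f} times")
print(f"With Falcon 9 cost tripled: {spent_ratio_inflated:.1f} and "
      f"{projected_ratio_inflated:.1f} times")
```

Even under the tripled assumption, the amount already spent on the unfinished Ares I would be about five times the Falcon 9 figure.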

The estimate of $500 million to develop the Falcon Heavy also comes from statements made by Musk.  It is also plausible.  As noted above, the Falcon Heavy is basically a set of three Falcon 9 first-stage boosters strapped together, topped by a second stage (as well as payload fairing) that is the same as that on the Falcon 9.  Musk has noted that it was not as easy to develop the Falcon Heavy as they had initially expected (there are many complications, including the new aerodynamics of such a design), but even at $500 million the cost is a bargain compared to what NASA has spent to develop boosters.

The Space Launch System (SLS) has yet to fly.  As noted before, this will take place no earlier than November 2021, but many expect there will be further delays.  Furthermore, the plan is for only one test flight to be made.  It is not clear what will happen if this test flight is not successful.

One has in NASA budget documents how much has been spent each year for the SLS thus far, and what is anticipated will be required for the next several years.  A total of $26.3 billion will have been spent through FY2021 (i.e. to September 30, 2021).  But the SLS is not yet operational, and the NASA budgets do not provide a breakdown between the cost of developing the SLS and the cost of launching it.  And there is not a clear distinction between the two.  Indeed, even the initial test flight has been labeled the Artemis 1 mission.  It will not be manned, but it will carry the Orion spacecraft (also being tested) on a month-long flight that will take it to the moon, go into lunar orbit, and then leave lunar orbit to return to earth with a splashdown and recovery of the Orion.

If successful, the second launch of the SLS will not be until September 2023 at the earliest.  While this flight would be manned and would loop around the Moon, some, at least, consider it also a test flight – testing all the systems under the conditions of a crew on board.

In part this is semantics, but treating the period until the end of FY2023 as the SLS still in the development phase, the total NASA is expected to have spent developing the SLS will be $32.4 billion.  While its payload capacity is 50% larger than that of the Falcon Heavy, it would have cost 65 times as much to bring it to the point of being operational.  While there are of course important differences, it is difficult to understand why the development of the SLS will have cost 65 times as much as, and possibly more than, the development of the Falcon Heavy.  It is especially difficult to understand as the rocket engines (the main cost for a booster) of the SLS are models used on the Space Shuttle, the strap-on solid rockets are also from the Space Shuttle, and the RL10 engine used on the second stage is derived from that used on earlier US rockets, dating all the way back to the 1950s.

D.  The Cost of Launching the Rockets

Once developed, there is a cost for each launch.  One wants to know the pure marginal cost of an additional launch, excluding all of the development costs, as those costs are in the past and will be the same regardless of what is now done with the newly developed rocket (economists refer to those past costs as sunk costs).

In practice the costs can be difficult to separate.  For private, commercial, vehicles, there may be some public information on what the firm providing the launch services is charging, but the price being charged for any specific flight is often treated as private and confidential, with the agreed upon price reached through negotiation.  And the price paid will presumably include some margin above the pure marginal costs to help cover (when summed across all the launches that will be done) the original cost of developing the rockets plus some amount for profits.  It is even more difficult to determine for the SLS, as one only has what is published in the NASA budget documents for the amount being spent on the overall SLS program, where that total combines the cost of both developing and then launching the vehicle.  NASA has not provided a break-down, and deliberately so.  But one does have in the budget numbers a year-by-year breakdown, which one can use because the development costs (for the initial version of the SLS) will largely be incurred before the vehicle becomes operational, and the operational costs after.  This will be used below.

Even with such provisoes, reasonable estimates of the costs are so hugely different that the basic message is clear:

SpaceX is most transparent on its costs.  Standard prices are given on its corporate website, of $62 million for a Falcon 9 launch and $90 million for a Falcon Heavy.  The site does not specify whether these are for the expendable or recoverable versions, but based on other information, it appears that the $62 million for the Falcon 9 reflects the cost of an expendable Falcon 9, while the $90 million for the Falcon Heavy is for the recoverable version.  The $62 million for the Falcon 9 is similar to what was charged in the early years for the Falcon 9 before the ability to recover its first stage booster was developed.  And Elon Musk has said that the cost of the fully expendable version of the Falcon Heavy maxes out at $150 million, which implies that the $90 million figure shown on its website is for the version where all three of the first stage boosters are recovered.

The $35 million figure for the cost of the Falcon 9 when its first stage is recovered is then an estimate based on a $62 million cost which is assumed to apply when the first stage cannot be recovered.  In an interview in 2018, Musk said that the cost of the first stage booster is about 60% of the cost of a Falcon 9 launch, with 20% for the second stage, 10% for the payload fairing, and 10% for the operations of the launch itself.  These are clearly rounded numbers, but based on them, 60% of $62 million is $37 million, with the remaining 40% then $25 million.  Assuming, generously, that the cost to refurbish the booster for a new flight, plus some amortization cost (e.g. $3.7 million per flight if it can be reused for 10 flights), would be $10 million, then a cost per flight with recovered first stage boosters would be about $35 million per flight.  This is broadly consistent with a statement made by Christopher Couluris (director of vehicle integration at SpaceX) in 2020 that SpaceX can bring down the cost per flight to “below $30 million per launch”, and that “[The rocket] costs $28 million to launch it, that’s with everything”.  The $35 million figure for the recoverable version of the Falcon 9 might well be on the high side, but as was noted previously, I am deliberately erring on the high side for the cost estimates of SpaceX and on the low side for the NASA vehicles.
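The arithmetic behind the $35 million figure can be laid out as a short sketch.  The inputs are the rounded numbers quoted above; the variable and function names are my own, chosen purely for illustration, and the 10 flights per booster is the illustrative assumption used in the text rather than a SpaceX figure.

```python
# All dollar amounts are in millions of US dollars and come from the text above.
LIST_PRICE = 62.0               # list price, assumed to be the expendable Falcon 9
BOOSTER_SHARE = 0.60            # Musk's rounded share for the first stage booster
REFURBISH_PLUS_AMORTIZE = 10.0  # the post's deliberately generous allowance per flight
ASSUMED_FLIGHTS_PER_BOOSTER = 10  # illustrative assumption, not a SpaceX figure


def falcon9_reuse_estimate(list_price=LIST_PRICE,
                           booster_share=BOOSTER_SHARE,
                           reuse_allowance=REFURBISH_PLUS_AMORTIZE,
                           flights_per_booster=ASSUMED_FLIGHTS_PER_BOOSTER):
    """Return the intermediate figures behind the roughly $35 million estimate."""
    booster_cost = list_price * booster_share        # first stage booster
    other_costs = list_price - booster_cost          # second stage, fairing, operations
    amortization = booster_cost / flights_per_booster
    return {
        "booster_cost": booster_cost,
        "other_costs": other_costs,
        "amortization_per_flight": amortization,
        "cost_per_reused_flight": other_costs + reuse_allowance,
    }


estimate = falcon9_reuse_estimate()
for label, value in estimate.items():
    print(f"{label}: ${value:.1f} million")
```

Without rounding the intermediate steps, the result is $34.8 million, which the text rounds to $35 million.  The $10 million allowance is well above the $3.7 million amortization alone, which is why the estimate is more likely too high than too low.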

Thus a figure of $35 million per launch of a Falcon 9 with the first stage booster recovered is a reasonable (and likely high) figure for what the cost is to SpaceX for such a launch.  The $62 million “list price” on the SpaceX website would then include what would be a generous (in relative terms) profit margin for SpaceX, covering the development costs and more.  According to the SpaceX website, as I write this there have been 125 launches of the Falcon 9 since its first flight in 2010.  On 85 of these they have recovered the first stage booster, and on 67 flights they have reflown a recovered booster.  The first successful recovery of a first stage booster was in December 2015.

Competition matters, and following the more transparent prices being charged customers by SpaceX for the Falcon 9 and Falcon Heavy (at least transparent in terms of “list prices”), ULA in December 2016 set up a website called “RocketBuilder.com” where anyone can work out which model of the Atlas V they will need.  There are ten models available, carrying payloads to low earth orbit from a low of 9,800 kg for the Atlas V model 401, up to 18,850 kg for the Atlas V model 551.  As noted before, we are examining the model 551 here as its payload is closest to what the Falcon 9 can carry (22,800 kg in the expendable version and 20,000 kg in the recoverable version).  The RocketBuilder.com website was “launched” with substantial publicity on December 1, 2016, accompanied by an announcement of substantial cuts in their prices for the Atlas V.  The CEO of ULA, Tory Bruno, announced that prices for the Atlas V model 401 would start at $109 million – down from $191 million before.  The price of the Atlas V model 551 would be $179 million when combined with a “full spectrum” of additional ULA services.

When set up, the RocketBuilder.com website included, importantly, what the list price would be of the Atlas V rocket model chosen.  Unfortunately, the RocketBuilder.com website as currently posted does not show this.  The reason might be that the CEO of ULA recently announced, on August 26, that ULA will take no more orders for flights of the Atlas V.  The Atlas V uses the Russian-made RD-180 rocket engine (two for each booster), and for national security reasons ULA has been required to cease purchasing these engines.  It must instead develop a new booster with key components all made in the US.  The RD-180 is an excellent engine technically, and is also both highly reliable and relatively inexpensive.  The decision to purchase it, from Russia, was made in the 1990s, and its first flight (on an Atlas III booster) was made in May 2000.  But political conditions have changed, and the most important client for ULA is the US Defense Department.

ULA has now received its final shipment of six RD-180 engines from Russia, and there will be a further 29 Atlas V flights (of all models, not just the model 551) up to the mid-2020s, using up the stock of RD-180 engines ULA has accumulated.  They have now all been booked.  ULA now hopes to launch next year, in 2022, its first test flight of the rocket it has been developing to replace both the Atlas V series as well as the Delta IV Heavy, which it has named the Vulcan Centaur.  It will use the new BE4 engines being developed by Blue Origin.  But that first test flight has been repeatedly delayed.  The first test flight was originally planned for 2019.

However, while the current RocketBuilder.com website of ULA no longer shows the cost to a customer of a launch of an Atlas V, one can find the former prices at an archived version of the RocketBuilder.com website.  While these are prices from a few years ago, they do not appear to have changed (at least as list prices).

The selection is much like that of finding the list price of a new car by going to the manufacturer’s website, selecting the model, adding various options, and then additional services one might want.  For the Atlas V, one can choose various levels of services, from a “Core” option to “Signature”, “Signature pro”, “Full Spectrum”, and “other customization”.  These appear to relate mostly to the division of responsibilities between ULA and the customer on various aspects of integrating the payload with the rocket.  ULA also offers two service packages it calls “Mission Insight” (things such as special access to ULA facilities) and “Rocket Marketing” (pre-launch events, press materials, videos, even “mission apparel”) that provide different levels of services and access.  It is sort of like the higher levels of benefits granted by airlines to their frequent flyers, although here they charge an explicit price for the package.

On the archived website, selecting the payload capacity and orbit that will lead to an Atlas V model 551 being required, the base cost (in 2016) shows as $153 million (as I write this in September 2021).  However, with a “Signature” level of service (which might be the base level required, as the “Core” option is not being allowed for some reason), the cost will be $163 million.  And $173 million for the “Full Spectrum” package.

The website also prominently displays a line for “ULA Added Value” which is then subtracted from that cost.  This does not reflect an actual price reduction by ULA, but rather savings that ULA claims the customer will benefit from if they choose an Atlas V launch by ULA.  The base (default) value of these savings that ULA claims the customer will benefit from is $65 million.  A breakdown shows this is made up of a claimed reduction in insurance costs of $12 million (what it otherwise would have been is not shown – just the “savings”), $23 million because ULA claims they will launch when it is scheduled to be launched and not several months later (which is more than a bit odd – one could produce whatever “savings” one wants by assuming some degree of delay in launch otherwise), and $30 million for what they call “orbit optimization”, which is a claim that the orbit they will place it in will lead to a lifetime for the satellite that is 17 rather than 15 years.

With such “savings” of $65 million (in a base case), ULA claims the actual cost for an Atlas V model 551 launch would not be $163 million, say (in the case considered above), but $65 million less, and thus only $98 million.  While still more than 50% higher, this brings it closer to the Falcon 9 list price of $62 million.  But this all looks like a marketing ploy – indeed rather like a juvenile charade – as that number depends on supposed savings from hypothetical levels.  The amount paid to ULA would still be the $163 million in this example.  And Elon Musk, among others, has questioned the assumptions.
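To make the “Added Value” arithmetic explicit, here is a small sketch using only the figures quoted above.  The names are my own and purely illustrative; nothing here comes from the RocketBuilder site itself beyond the dollar amounts already cited.

```python
# Dollar amounts in millions, as quoted in the text for the Atlas V model 551.
LIST_PRICE_SIGNATURE = 163.0   # price with the "Signature" level of service
FALCON9_LIST_PRICE = 62.0      # SpaceX list price for a Falcon 9

# ULA's claimed customer "savings" in the default case (not a price reduction).
CLAIMED_SAVINGS = {
    "insurance": 12.0,
    "schedule certainty": 23.0,
    "orbit optimization": 30.0,
}

total_claimed_savings = sum(CLAIMED_SAVINGS.values())
adjusted_price = LIST_PRICE_SIGNATURE - total_claimed_savings


def premium_over_falcon9(price):
    """Fraction by which a price exceeds the Falcon 9 list price."""
    return price / FALCON9_LIST_PRICE - 1.0


print(f"Claimed savings: ${total_claimed_savings:.0f} million")
print(f"'Adjusted' price: ${adjusted_price:.0f} million "
      f"({premium_over_falcon9(adjusted_price):.0%} above Falcon 9)")
print(f"Price actually paid: ${LIST_PRICE_SIGNATURE:.0f} million "
      f"({premium_over_falcon9(LIST_PRICE_SIGNATURE):.0%} above Falcon 9)")
```

Even taking ULA's claimed savings at face value, the adjusted figure remains well over 50% above the Falcon 9 list price, while the amount actually paid is more than two and a half times that list price.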

The commonly cited $165 million cost of a launch of an Atlas V is therefore a reasonable estimate.  One should, however, keep in mind that this is both a “list price” subject to negotiation and that depending on the specific options chosen, the price could easily vary by $10 or $20 million around this.  The “savings” figures of ULA should not be taken too seriously, however.  There will be specific factors affecting costs and possible savings with any given payload, for other rockets as well as the Atlas V, and comparisons to some hypothetical will depend on whatever is chosen for that hypothetical.

ULA has provided less public material on the cost of a Delta IV Heavy launch.  This is in part because all of the customers, since the initial test flight in 2004, have been US government entities, and in particular the US Department of Defense.  There have only been 12 launches since that initial test flight, with ten of these classified missions for the Defense Department and two for NASA.  Furthermore, only three more are planned (two in 2022 and one in 2023), with ULA offering the planned Vulcan Centaur rocket (of which there will be a series of models that can carry progressively larger payloads, as with the Atlas V) as a substitute that can carry payloads of a similar size.

Both the Defense Department (especially) and NASA are less than fully transparent on what they have paid for these Delta IV Heavy launches.  The specific costs of the launches can be buried in the broader costs of the overall programs.  But the figures cited for a Delta IV Heavy launch have typically been either $400 million (in a statement by ULA in 2015) or $350 million more recently.  It may well have been that, under pressure from the far lower costs of SpaceX, ULA has reduced its price over time.  For the purposes here, and erring on the side of being generous to ULA, I have used for the calculations a price of $350 million for a launch of an additional Delta IV Heavy.

The cost of an additional SLS launch is an estimated $2 billion, but there are conceptual as well as other issues with this figure.  First of all, NASA refused to release to Congress, or to anyone else for that matter, what the cost of an additional launch would be.  Rather, one only had a single line item in the budget for the combined year-by-year cost of developing and testing the SLS, and also for building and then flying it.  That cost reached $3.1 billion in FY2020, was $3.1 billion also in FY2021 and again in FY2022, and is then forecast to decline slowly but remain at $2.8 billion in FY2026.  The SLS has not yet flown, and its first (uncrewed and the only planned) test flight is now scheduled for November 2021.  The first operational flight (with a crew of four) would not be until 2023 at the earliest, with the second in 2024 at the earliest.  The NASA plan is that there would then be one flight per year starting in 2026 and continuing on into the indefinite future.

But an estimate of the cost of an additional launch of the SLS leaked out, possibly due to an oversight but possibly not, in a letter sent to Congress in October 2019 from OMB. The letter addressed a range of budget issues for all agencies of the government, and set out the position of OMB and hence the administration on matters then being debated.  One was on use of the SLS.  Senator Richard Shelby of Alabama, who was then chair of the Senate Committee on Appropriations, had included in the language of the draft budget bill a requirement that NASA use the SLS for the launch of the planned NASA Europa Clipper mission (a satellite to Europa, a moon of Jupiter).  In a paragraph on page 7 of the letter, OMB recommended against this, as there is “an estimated cost of over $2 billion per launch for the SLS once development is complete”.  The letter noted that a commercial launch vehicle could be used instead for a far lower cost.

NASA later admitted (or at least would not deny) that this would be a reasonable estimate of the additional cost of such a launch.  And it is consistent with the budget forecasts that the SLS program would continue to require funding of close to $3 billion each year once flights had begun (at a pace of one per year, or less).  While the $3 billion is still greater than a figure of $2 billion per flight, the development costs for the SLS program will not end when the first SLS booster is operational.  The initial SLS, while a sizeable rocket, would still not have the lifting capacity that would be needed (under current NASA plans) for the planned lunar landings following the very first.

Specifically, the initial model of the SLS (scheduled to be tested this November 2021) is labeled the Block 1, and has a lifting capacity of 95,000 kg to low earth orbit.  The figures for the Block 1 are the ones that are being used in the charts in this post.  However, its capacity would only be sufficient for the first three flights (including the test flight), where the third flight would support the first landing of a crew on the moon under the Artemis program.  Following that, a higher capacity model, labeled the SLS Block 1B, would need to be developed, with a lifting capacity to low earth orbit of 105,000 kg.  To achieve that, a new second stage would be developed using four of the RL10 rocket engines (versus a second stage with just a single RL10 engine in the Block 1 version).

Under current NASA plans the Block 1B version of the SLS would then be used only for four flights.  For missions after that an even heavier lift version of the SLS, labeled the Block 2, would be needed, with two more powerful solid rocket boosters strapped on to the first stage (instead of the solid rocket boosters derived from those used on the Space Shuttle).  These would increase the lifting capacity to 130,000 kg to low earth orbit.  Part of the reason for developing the Block 2, with the new solid rocket side boosters, is that NASA will have used up by then its excess inventory of solid rocket booster segments (from the Space Shuttle program) for the planned launches of the Block 1 and Block 1B versions of the SLS (with one set in reserve).  Using up the existing inventory makes sense.  It should save money – although those savings are difficult to see given the expense of this program.  But that inventory is limited and will suffice only for up to eight flights of the SLS.  Hence the need for a replacement following that, which led to the design for the Block 2.

There will be development costs for the new second stage (with four RL10 engines rather than one) for the Block 1B and then for the new, more powerful, strap-on solid rocket boosters for the Block 2.  How much of the approximately $3 billion that would be spent each year would go to the development of these new models of the SLS has not been broken out in the NASA budget – at least not in what has been made public.  But given that only very limited work has been done thus far on the new second stage for the Block 1B and even less on the new solid rocket boosters to be used for the Block 2, continuing development costs of $1 billion per year look plausible.

At $2 billion per flight, the cost of a SLS launch is huge.  And this does not include any amortization to cover the development costs.  As noted above, those costs are expected to reach over $32 billion by FY2023.  The costs per launch for the other rockets shown on the chart, including the Falcon Heavy, will include in the prices charged some margin to cover the original development costs.  Commercial companies must do this to recover the costs of their investments.  That amount would be gigantic if added for the SLS in order to make its cost figure more comparable to that of the alternatives.

The question is how many flights of the SLS there will be before a more cost-effective alternative starts to be used.  Note that the alternative need not be limited to another giant rocket with a similar lift capacity.  The SLS itself will not be large enough to carry in a single launch all that will be required on the Artemis missions to the moon.  Rather, there would be separate launches on a range of boosters to carry what would be required.  Indeed, a NASA plan developed in 2019 for the launches that will be necessary through to 2028 as part of the Artemis missions to the Moon envisaged 37 separate launches, of which only 8 would be of the SLS (including its initial test launch).  One can break up the cargoes in many different ways.

While speculative, and really only for the point of illustration, one might assume that there will be perhaps 10 flights of the SLS before more cost-effective alternatives are pursued.  If so, then to cover the over $32 billion development cost one would need to add over $3 billion per flight to make the figures comparable to the costs of the other, commercial, launchers.  That is, the cost would then be over $5 billion per flight for the SLS rather than $2 billion.  This would, however, now be more of an average cost per flight than a true marginal cost, and speculative as we do not know how many flights of the SLS there will ultimately be (other than that it will not likely be many, given its huge cost).  Hence I have kept to the $2 billion figure, which is already plenty high.
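Since the number of SLS flights is unknown, it may help to see how sensitive the average cost per flight is to that assumption.  The sketch below uses the $32.4 billion development total through FY2023 and the $2 billion marginal cost cited in this post; the flight counts other than 10 are arbitrary values I have picked for illustration.

```python
# Amounts in billions of US dollars, from the figures cited in this post.
SLS_DEVELOPMENT_COST = 32.4   # expected development spending through FY2023
SLS_MARGINAL_COST = 2.0       # OMB estimate of the cost of one additional launch


def sls_average_cost_per_flight(total_flights):
    """Marginal cost plus development cost spread evenly over all flights."""
    if total_flights <= 0:
        raise ValueError("total_flights must be positive")
    return SLS_MARGINAL_COST + SLS_DEVELOPMENT_COST / total_flights


# 10 flights is the illustrative case in the text; 5 and 20 are arbitrary bounds.
for flights in (5, 10, 20):
    cost = sls_average_cost_per_flight(flights)
    print(f"{flights:>2} flights: ${cost:.2f} billion per flight")
```

With 10 flights the average comes to a bit over $5.2 billion, matching the "over $5 billion" figure above.  Fewer flights push it higher still, and even 20 flights would leave it well above $3 billion.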

Even at $2 billion per flight for the SLS, the cost is over 13 times the cost of a Falcon Heavy (in the version where the boosters are all thrown into the ocean rather than recovered).  The lift capacity of the SLS is 50% more, but it is difficult to imagine that that extra capacity could only be achieved at a cost (even ignoring the huge development cost) that is more than 13 times as much.
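The comparison can also be put on a per-unit-of-payload basis.  The following is a rough sketch using only the figures in this post: the $2 billion and $150 million launch costs, and the roughly 50% greater lift capacity of the SLS Block 1.  Treating cost as proportional to payload is my simplification for illustration, not a claim about how launch pricing actually works.

```python
# Figures from the text; costs in millions of US dollars.
SLS_COST_PER_LAUNCH = 2_000.0          # OMB estimate, excluding development
FALCON_HEAVY_EXPENDABLE_COST = 150.0   # Musk's stated maximum, no booster recovery
SLS_CAPACITY_RATIO = 1.5               # SLS Block 1 lifts about 50% more

# Ratio of cost per launch, then the same ratio adjusted for payload capacity.
cost_ratio = SLS_COST_PER_LAUNCH / FALCON_HEAVY_EXPENDABLE_COST
cost_ratio_per_unit_payload = cost_ratio / SLS_CAPACITY_RATIO

print(f"Cost per launch, SLS vs Falcon Heavy: {cost_ratio:.1f}x")
print(f"Cost per unit of payload capacity:    {cost_ratio_per_unit_payload:.1f}x")
```

On these numbers, even after allowing for the larger payload, the SLS would cost close to 9 times as much per unit of lift capacity.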

What has happened on the Europa Clipper mission provides a useful lesson.  Following a review and consultations with Congress, the Biden administration on July 23, 2021, announced that the Europa Clipper would be flown on a Falcon Heavy instead of the SLS.  The total contract amount with SpaceX for all the launch services is just $178 million (which will include the special costs of this unique mission).  There were several reasons to make the change, in addition to the savings from a cost of $178 million rather than $2 billion.  One is simply that by using the Falcon Heavy they will be able to launch in October 2024.  No SLS will be available by that time, nor indeed for several years after.  While a more direct route to Jupiter would have been possible with the heavier lift capacity of the SLS, the Europa Clipper would have had to be kept in storage for several years until a SLS rocket became available.  Separately, NASA discovered there would be a severe vibration issue due to the solid rocket boosters on the SLS, which the delicate spacecraft would not have been able to handle.  To modify the Europa Clipper to make it able to handle those vibrations would have cost an additional $1 billion.

Finally, it is clear that the politics has changed.  Senator Shelby of Alabama has been the figure most insistent on requiring use of the SLS to launch the Europa Clipper.  With the NASA Marshall Space Flight Center (located in Huntsville, Alabama) the lead NASA office responsible for the SLS, a significant share of what is being spent on the SLS is being spent in Alabama.  And as Chair of the Senate Committee on Appropriations, Senator Shelby was in a powerful position to determine what the NASA budget would be.  But as a Republican, Senator Shelby lost the chairmanship when the Senate came under Democratic control in January 2021, plus he has announced he will not run for re-election in 2022. His influence now is thus not what it was before, and NASA can now pursue a more rational course on the launch vehicle.

E.  The Cost of Developing and Operating Spacecraft for Crews

NASA has also used its new, more commercial, contracting approach for the development and then use of private spacecraft to carry crew to the ISS.  This was indeed the proposal of President Obama in 2010 that was so harshly criticized, as discussed at the beginning of this post.  We now know how that has worked out:  SpaceX is flying crews to the ISS routinely, while Boeing, a traditional aerospace giant which was supposed to be the safe choice, has had issues.  We also can compare the costs under this program (for both SpaceX and Boeing) to that of developing the Orion spacecraft, where Lockheed Martin is the prime contractor operating under the more traditional NASA contracting approach.

There are, of course, important differences between the Orion and the spacecraft developed by SpaceX (its Crew Dragon, sometimes referred to as the Crew variant of the Dragon 2 as the capsule is a model derived from the original Dragon capsule used for ferrying cargo to the ISS), and by Boeing (which it calls the CST-100 Starliner, or just Starliner for short).  The SpaceX Crew Dragon and the Boeing Starliner will both be used to ferry astronauts to the ISS in low earth orbit, while the Orion is designed to carry astronauts to the Moon and possibly beyond.

But there are important similarities.  They are all capsules, use heat shields for re-entry, and can seat up to six astronauts (Orion) or seven (Crew Dragon and Starliner), even though NASA plans so far have always been for flights of just four astronauts each time.  They are all, in principle, reusable spacecraft. The interior volume (habitable space for the astronauts) is 9 cubic meters on Orion, 9.3 cubic meters on Crew Dragon, and 11 cubic meters on Starliner.

Orion will also be launched with a Service Module attached, which is being built by Airbus under contract to the European Space Agency.  This Service Module will have the fuel and engines required to help send Orion from earth orbit to the moon, and then fully into lunar orbit and back, as well as power (from solar panels) and supplies of certain consumable items required for longer space flight durations.  With this, Orion will be able to undertake missions of up to 21 days.  The self-contained Crew Dragon can carry out missions of up to 10 days, while the Starliner has the capacity of just 2 1/2 days – providing time to reach the ISS and later return, but not much else.

The cost of developing and building the European Service Module for Orion is being covered by the European Space Agency as its contribution to the program.  For better comparability to the Crew Dragon and Starliner spacecraft, the costs of the Service Module for Orion have been excluded from the cost of Orion in the charts below, as it is primarily the Service Module that will give the Orion the capabilities to go beyond earth orbit – capabilities that the Crew Dragon and Starliner do not have.  Had the costs of the Service Module been included (as the Orion is, after all, dependent on it), the disparity in costs between it and the Crew Dragon or Starliner would have been even larger.

The development of Orion began in 2004, as part of the Constellation program of the Bush administration, and has continued ever since.  NASA spent $1.4 billion on it in FY2020, again in FY2021, and the budget proposal is to do so again in FY2022.  Aside from an uncrewed flight in 2014 that was principally to test its heat shield design, the Orion has yet to fly.  Its first real test, still unmanned, will be as part of the first test flight of the SLS, which as noted above is now scheduled for November 2021.

SpaceX and Boeing were awarded the new form of competitive contracts by NASA to build their new spacecraft, demonstrate that they work (with a successful unmanned test flight to the ISS and then a manned flight test), and then fly them on six regular missions carrying NASA astronauts to the ISS.  The designs were by the companies – NASA was only interested in safe and successful flights ferrying crew to the ISS.

Each contractor could use whatever booster they preferred (SpaceX chose the Falcon 9 and Boeing the Atlas V), with the costs of those rocket launches included in the contracts.  The contract awards were announced in September 2014, several years later than the Obama administration had initially proposed due to lack of congressional funding.  The original contracts provided awards of up to $4.2 billion to Boeing and $2.6 billion to SpaceX, a discrepancy that reflected not that SpaceX would provide a lesser service, but rather that SpaceX offered in their contract bid a lower price.  Boeing was later granted an extra $287.2 million by NASA, in a decision that was criticized by the Office of the Inspector General of NASA, as Boeing (as well as SpaceX) had committed to provide the services agreed to under the contracts for the fixed, agreed upon, price.  Any cost overruns should then have been the responsibility of the contractor.  While Boeing argued it was not really a cost overrun under their contract, others (including the NASA Inspector General) disagreed.

Before the main contracts under the program had been approved, Boeing and SpaceX (along with others) had received smaller contracts to develop their proposals as well as to develop certain technologies that would be needed.  Including those earlier contracts (as well as the extra $287.2 million for Boeing), the total NASA would pay (provided milestones are reached) is $5,108.0 million to Boeing and $3,144.6 million to SpaceX.  For this, each contract provided that the new spacecraft would be developed and tested, with this then followed by six crewed flights of each to the ISS.  Thus the contracts include a combination of development and operational costs, which will be separated in the discussion below.

First, the development costs:

The estimates of the development costs for the SpaceX Crew Dragon and Boeing Starliner were made by subtracting, from the overall program costs, estimates made in the November 2019 report of the NASA Inspector General’s Office of the costs of the operational (flight) portions of the contracts.  Included in the development costs are the costs of the earlier contracts with SpaceX and Boeing to develop their proposals, as well as the extra $287.2 million that was later provided to Boeing.  Based on this, the total cost (to NASA) of supporting the development of the SpaceX Crew Dragon has been $1.845 billion, while the cost to NASA of the Boeing Starliner (assuming Boeing is ultimately successful in getting it to work properly) will be a bit over $3.0 billion.

The costs assume that the contractors will carry out their contractual commitments in full.  SpaceX so far has (the Crew Dragon is fully operational, and indeed SpaceX is now in its second operational flight, with a crew of four now at the ISS who are scheduled to return in November).  But Boeing has not.  As noted before, its initial unmanned test flight in December 2019 of the Boeing Starliner failed.  The planned re-try was on the pad in late July of this year and expected to fly within days when problems with stuck valves were discovered.  The Starliner had to be taken down and moved to a facility to identify the cause of the problem and fix it, with the flight not now expected until late this year at the earliest.  The extra costs are being borne by Boeing and have not been revealed, but in principle should be added to the $3,008 million cost figure in the chart above.  But they have been kept confidential, so we do not know what that addition would be.

In contrast to the cost to NASA of $1.845 billion for the SpaceX Crew Dragon and $3.0 billion to Boeing for its Starliner (under the new, competition-based, contractual approach), the amount NASA has spent on the Lockheed Orion spacecraft (under its traditional contractual approach) has been far higher.  More than $19.0 billion has already been spent through FY2021, and Orion is still in development.  Other than the early and partial test in 2014, the Orion has yet to be fully tested in flight.  The first such test is currently scheduled, along with the first test of the SLS, for later this year.  At best, it will not be operational until 2023, although more likely later.  Just adding what is anticipated will be needed to continue the development of Orion through FY2023, the total that NASA will spend on it will have reached $21.8 billion.  But the FY2023 cut-off date is in part arbitrary.  While the Orion capsule should be flying by then, there will still be additional expenditures to finalize its design and for further development.  These would add to the overall cost, but we do not know what those are expected to be.

Including costs just through FY2023, the cost of developing Orion is already close to 12 times what it has cost to develop the SpaceX Crew Dragon, and over 7 times what it has cost NASA to develop the Boeing Starliner.  While there are of course differences between the spacecraft, and it may be argued that the Orion is more capable, it is hard to see that such differences account for a cost that is 12 times that of the SpaceX Crew Dragon, or even 7 times the cost of the Boeing Starliner.  And as noted above, the greater capabilities of the Orion derive primarily from the European Service Module, whose costs are not included in the $21.8 billion figure for Orion.
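The multiples quoted above follow directly from the figures already cited.  As a minimal sketch of the arithmetic (the variable and function names are my own, chosen for illustration):

```python
# Development costs to NASA, in billions of dollars, as cited in the text.
orion_through_fy2023 = 21.8   # Orion, projected through FY2023
crew_dragon = 1.845           # SpaceX Crew Dragon
starliner = 3.008             # Boeing Starliner (what NASA is paying)


def cost_multiple(cost, baseline):
    """Return how many times larger `cost` is than `baseline`."""
    return cost / baseline


orion_vs_crew_dragon = cost_multiple(orion_through_fy2023, crew_dragon)
orion_vs_starliner = cost_multiple(orion_through_fy2023, starliner)

print(f"Orion vs. Crew Dragon: {orion_vs_crew_dragon:.1f}x")  # 11.8x
print(f"Orion vs. Starliner:   {orion_vs_starliner:.1f}x")    # 7.2x
```

Any costs incurred after FY2023, or the undisclosed costs that Boeing is absorbing, would change these multiples.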

The operational costs of the Orion will also be higher, using for comparability what it would be for a flight to earth orbit.  The most relevant figure is the cost per seat, and the calculations assume four seats will be filled on each flight (as NASA in fact plans, for both the missions to take astronauts to the ISS as well as for the Orion missions):

The costs include not only the cost of using the spacecraft itself, but also, and importantly, the cost of the rocket used to launch the spacecraft into orbit.  The costs of the rockets were included in the NASA contracts with SpaceX and Boeing, as the contracts were for the delivery of crews to the ISS.

The per-seat costs for the SpaceX Crew Dragon and Boeing Starliner contracts were calculated following the approach the NASA Inspector General used in its November 2019 report, using its estimate of the operational portion of the contracts with SpaceX and Boeing.  They come to $54.2 million per seat on the SpaceX Crew Dragon and $87.5 million on the Boeing Starliner (before rounding – in the Inspector General’s report one will see rounded figures of $55 million and $90 million, respectively).

The costs of building a new Orion capsule (which can then be reused to some degree) and flying it can be estimated from the announced NASA contracts with Lockheed for future missions.  In September 2019, NASA announced that it had awarded Lockheed an “Orion Production and Operations Contract”, where NASA would pay Lockheed for the Orion spacecraft for use on planned Artemis missions, but where the Orion spacecraft themselves would be reused to a varying degree that will rise over time.  The contracts for the Orions to be used in the first two flights (Artemis I and II) were signed some time before, and one can view these as part of the development costs (as these will be missions testing the Orion capsules).  The September 2019 announcement was that Lockheed would be paid a total of $2.7 billion for the next three missions (Artemis III, IV, and V), with re-use started to a limited degree.  Some high-value electronics, primarily, from the Orion used on the Artemis II mission would be re-used in the capsule for Artemis V.  Future costs would then fall further with greater re-use, but this should still be seen as speculative at this point.

Based on the $2.7 billion figure for the three Artemis missions following the first two, and with four seats on each of those three flights, the per-seat cost for the Orion alone would be $225 million.  To this one would need to add, for comparability, the cost of the rocket launcher.  The Artemis missions would use the SLS, which as discussed above, will cost $2.0 billion per flight.  This would add $500 million per seat (with the four seats per flight), bringing the total to $725 million per seat.

While that is indeed what the cost would be for the lunar missions, it is not an appropriate comparator to the costs of the Crew Dragon and Starliner capsules as the rockets they need are just for earth orbit.  For this reason, for the figure in the chart I have used the per-seat cost of what a launch on an Atlas V would be.  The Atlas V is the vehicle that will be used for the Boeing Starliner, which has a comparable weight to the Orion (excluding the Orion European Service Module).  That per-seat cost, for a launch to earth orbit, would be $266.25 million.

Based on these figures, the operational cost per seat of an Orion capsule is almost 5 times what the per-seat cost is for the SpaceX Crew Dragon, and 3 times the cost on a Boeing Starliner.  These are huge differences.
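The per-seat arithmetic in the preceding paragraphs can be laid out step by step.  The sketch below uses only figures given in the text.  The one derived number, the launcher cost implied by the $266.25 million per-seat figure, is back-calculated here and should be read as an inference from the stated figures rather than a sourced price.  All names are illustrative.

```python
SEATS_PER_FLIGHT = 4

# Figures cited in the text, in millions of dollars.
orion_contract_total = 2700.0    # Artemis III, IV and V combined
orion_missions = 3
sls_cost_per_flight = 2000.0
orion_seat_earth_orbit = 266.25  # per-seat figure used for the chart
crew_dragon_seat = 54.2
starliner_seat = 87.5


def per_seat(total_cost, flights=1, seats=SEATS_PER_FLIGHT):
    """Spread a total cost evenly over every seat on every flight."""
    return total_cost / (flights * seats)


orion_capsule_seat = per_seat(orion_contract_total, orion_missions)  # 225.0
sls_seat = per_seat(sls_cost_per_flight)                             # 500.0
orion_lunar_seat = orion_capsule_seat + sls_seat                     # 725.0

# Back-calculated (an inference, not a sourced figure): the launcher cost
# per flight that is implied by the 266.25 per-seat figure.
implied_launcher_cost = (orion_seat_earth_orbit - orion_capsule_seat) * SEATS_PER_FLIGHT

ratio_to_crew_dragon = orion_seat_earth_orbit / crew_dragon_seat
ratio_to_starliner = orion_seat_earth_orbit / starliner_seat

print(f"Orion capsule only:     ${orion_capsule_seat:.2f} million per seat")
print(f"Orion on the SLS:       ${orion_lunar_seat:.2f} million per seat")
print(f"Implied launcher cost:  ${implied_launcher_cost:.2f} million per flight")
print(f"Orion vs. Crew Dragon:  {ratio_to_crew_dragon:.2f}x")  # 4.91x
print(f"Orion vs. Starliner:    {ratio_to_starliner:.2f}x")    # 3.04x
```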

F.  Conclusion

There was vehement opposition to Obama’s proposal to follow a more commercial approach to ferry crew to the ISS.  This came not only from former astronauts – who as pilots and engineers were taking a position on an issue they really did not know much about, but who were comfortable with the traditional approach.  Of more immediate importance, it came from certain politicians – in particular in the Senate.  The politicians opposed to the Obama proposals, led by Senator Shelby of Alabama, were also mostly (although not entirely) conservative Republican politicians who on other issues claimed to be in favor of free-market approaches.  Yet not here.

We now know that SpaceX delivered on the contracts, with now routine delivery of both cargo and crew to the ISS. Indeed, as I complete this post, an all-private crew of four have just returned from a three-day flight to earth orbit on a SpaceX Crew Dragon spacecraft (launched on a Falcon 9).  The flight was a complete success, and showed that flights of people to orbit are no longer restricted to a very small number of large nation-states (specifically Russia, the US, and China).  NASA certainly played an important role in supporting the development of the Falcon 9 and the Crew Dragon, as discussed above, but these flights are now private.  If Senator Shelby and his (mostly) Republican colleagues had gotten their way, this never would have happened.  The hope that this would follow was, however, an explicit part of the plan when the Obama administration proposed that NASA contract with private providers to bring crew to the ISS.  And it has.

Boeing is not yet at the point that SpaceX has reached, with its Starliner capsule still to be proven, but it appears likely that they will have worked through their problems by sometime next year (approximately three years after SpaceX succeeded with its first tests).  Meanwhile, even though work on the Orion spacecraft began in 2005 and work on the SLS began in 2011, both the SLS and Orion are still to be tested.  The SLS was supposed to be operational in 2016, but its first operational flight is now scheduled for 2023 and will almost certainly be later.  The key components of the SLS (the engines and the strap-on solid rocket boosters) were all taken from the Space Shuttle or even earlier designs.  It is not at all clear why this should have taken so long.

We also can now work out reasonable estimates of the costs, and can compare them to the costs under the more commercial approach.  In terms of the development costs (planned through FY2023), the SLS will cost an astonishing 65 times what it cost to develop the Falcon Heavy.  The SLS will be able to carry a heavier load, but only about 50% more than what a Falcon Heavy can carry.  It is difficult to see why this would cost 65 times as much.  And it is not just the SLS.  The cost of developing the Ares I, including what had been planned to be spent through FY2014 (when it still would not have been fully ready) would have been 42 times what the similarly sized Falcon 9 cost to develop.  These are mind-bogglingly high multiples.

The operational costs per launch are also high multiples of what the costs are for commercially developed rockets.  The cost of a launch of the SLS will be 22 times the cost of the recoverable Falcon Heavy.  While it can carry more, the cost per kilogram to low earth orbit will be 13 times higher for the SLS (excluding its development costs) compared to that for the recoverable version of the Falcon Heavy, and 9 times higher when compared to the expendable version.

Similarly, it is expected that development of the Orion capsule (not counting the cost of the Service Module, which the Europeans are developing as their contribution to the program) will by FY2023 have cost almost 12 times the cost of developing the SpaceX Crew Dragon.  And the operational cost per seat will be 5 times higher for the Orion than the cost for the Crew Dragon flights, and 3 times higher than for the Boeing Starliner.

The evidence is clear.  Why, then, are the conservative Republican Senators and Members of Congress (as well as a few Democrats, including, significantly, Representative Eddie Bernice Johnson, D-Texas, who is the current Chair of the House Committee on Science, Space, and Technology) so opposed to NASA entering into commercial contracts with SpaceX and others?  The answer, clearly, is the politics of it.  Spending billions of dollars on such hardware keeps many employed, and many of those jobs are in high-wage engineering and technical positions.  From this perspective, the high costs are not a flaw but a feature.

This is not only a waste.  Since budgets are not unlimited, such waste has also meant long delays in achieving the intended goals.  The space program has traditionally enjoyed much goodwill in the general population.  But such waste, as well as the resulting long delays in achieving the intended aims, could destroy that goodwill.

That would be unfortunate, although not the end of the world.  One does, however, see the same issues with the military budget, where the stakes are higher.  And the costs are also much higher, with major military programs now costing in the hundreds of billions of dollars rather than the tens of billions for the space program.  An example has been the development of the F-35 fighter jet.  The program began in 1992, the first prototypes (of Lockheed and Boeing) flew in 2000, Lockheed won the contract in 2001, the first planes were manufactured in 2011, and the first squadron became operational in 2015.  That is, it took 23 years to go from the initial design and conceptual work to the first operational unit.  Furthermore, it is expected to be the most expensive military program in history, with over $400 billion expected to be spent to acquire the planes and a further $1.1 trillion to keep them operational over a 50-year life cycle for the program.  That is a total cost of $1.5 trillion, and other estimates place the cost at $1.6 or even $1.7 trillion (and no one will know for sure what it will be until this is all history).

The factors driving such high costs, as well as multi-decade time frames to go from concept to operations, are undoubtedly similar to those that have driven this for major NASA programs such as the Orion and the SLS.  Spending more is politically attractive to those politicians who represent the states and districts where the spending will be done.  But for the military, the stakes (and not simply the dollar amounts) are a good deal higher than they are for the space program.

But it should also be recognized that the cure for this is likely to be more complicated and difficult than what NASA has been able to achieve through changes to its traditional contracting and procurement model.  Industry capabilities will need to be developed, with greater competition introduced.  In major areas there are now often only two or three manufacturers, and sometimes only one, with the capabilities required.

We do, however, now have examples of what can be done.  ULA (United Launch Alliance) had a monopoly on heavy-lift launch vehicles following its creation in 2006 by combining what had been the competing launch divisions of Boeing and Lockheed.  SpaceX entered that market, and we saw above what resulted.  If such progress is possible with something as complex as a heavy-lift rocket, it should be possible in at least some other areas of military procurement as well.

 

=======================================================

Annex:  Why Cost Comparisons of Rockets and Spacecraft are Difficult to Make

One might think that comparisons of costs of rockets as well as spacecraft would be straightforward.  But they are not, for a number of reasons:

a)  First, there is no single, authoritative, source that one can cite, and different sources will often provide different estimates.  Recognizing this, for the purposes here – which are to compare the costs where there is competition (primarily SpaceX) to the costs under NASA procurement from the traditional contractors – I have sought to use estimates that are on the high side of what has been published for the costs of the SpaceX vehicles, and on the low side for the costs of the traditional NASA contractors.  Despite this, the SpaceX costs are still far lower.

b)  An important reason there are these different cost estimates from different sources is that the information on what the costs actually are has often been kept confidential.  SpaceX is the most transparent, but even here what they publish on their website ($62 million for a Falcon 9 launch, and $90 million for a Falcon Heavy) should really be viewed more as a “list price” that will be negotiated.  For NASA, full transparency on the costs can be embarrassing.  For commercial providers, less-than-transparent cost figures may be seen as helpful when they engage in negotiations with those who would purchase their services.

c)  Which brings us to a third factor, which is negotiating power.  Just like when buying a car, the price that will be paid will depend in part on the relative negotiating powers of the parties.  When there is a low-cost competing supplier (such as SpaceX), there will be pressure on higher-cost suppliers to lower their prices.  One has seen this with the prices being charged for launches of the Delta IV Heavy and Atlas V rockets.  Negotiating power will also depend on whether one will be a repeat customer or just a one-time user.  For these reasons there will not be one, unique, price that can be cited as the “cost” of launching a particular rocket.  It will depend on the negotiations.

d)  And this also leads to the distinction between the cost of a rocket launch and the price charged.  Ideally, what one wants as the basis for comparison is the cost.  However, the best information available will often be the price that some customer paid.  But that price may include a substantial profit margin if that customer did not have much negotiating power to bargain down the price.  It might also work the other way.  The cost of developing and launching the Boeing Starliner capsule, which was discussed above, is based on what NASA is paying.  Yet because of the repeated problems with the development of the Starliner, Boeing is certainly losing money on that fixed-price contract.  How much Boeing is losing has not been disclosed, and indeed since there are continued problems they do not yet know themselves how much it will have cost in the end.  Hence, in a comparison of the cost of delivering astronauts to the ISS the true cost of the Boeing Starliner will be something more than what NASA is paying, and it is that higher cost which really should be the basis for comparison with the cost of the SpaceX Crew Dragon alternative.

e)  The common basis for comparison is also inherently problematic.  While the standard measure for a rocket (and the one used here) is how many kilograms of payload can be lifted to low earth orbit, specific situations are more complicated.  Depending on the mission, one will want to place the payload into different types of orbits, including different altitudes (from 100 miles to several hundred miles, and still be considered “low”), different angles to the equator (the higher the angle, the higher the share of the world’s land area that would be covered by the satellite over some period, such as a month), and perhaps different requirements on how circular the orbit needs to be (the difference between the highest point in the orbit and the lowest).  There will be different thrust (hence fuel) requirements for each of these, possibly different payload weights that can be handled, and possibly other differences, all of which would end up being reflected in the negotiated price for the launch.

f)  Different payloads also have different requirements on how they must be handled, how they need to be attached to the rocket, the requirements on the fairings (the nose cone shell surrounding the payload to protect it at launch, which is jettisoned once orbital altitude is reached), and so on.  Military launches are also more expensive (and charged accordingly) due to the secrecy arrangements the Defense Department requires.

g)  Different boosters will also have different capabilities.  For a launch into low earth orbit these capabilities might not all be needed, but they may still be reflected in the costs.  The most obvious is the size of the payload.  If the weight is more than a smaller rocket can handle (and the payload cannot be divided into two or more smaller satellites), then they will have to use a larger booster even if the cost per kilogram is higher.

h)  The calculations of the cost per kilogram of payload are also based on the maximum payload each rocket can handle.  But it would be coincidental that any particular payload will be exactly at this maximum weight.  The cost per kilogram will then be higher for a payload that weighs less than this maximum.  While there may be some savings in total costs in launching a payload that is less than the maximum a rocket can handle (somewhat less fuel will be needed, for example), such savings will be modest.  For this reason, SpaceX and others will typically offer to sell, at a low price, such extra space to those with just small satellites, piggy-backing on larger satellites that do not need to use up the full payload capacity of the rocket.  The entity with the larger satellite might then receive a discount from what the cost otherwise would have been.

i)  Finally, one should recognize that there are normally several variants of each launch vehicle, with somewhat different capabilities and costs.  To the extent possible, all the cost estimates in this post are for a single, recent, variant of the vehicles.  The Falcon 9 launch vehicle, for example, is now at what they have named the “Block 5” variant, and the costs of that version are what have been used here.  Earlier versions of the Falcon 9 were labeled v1.0, v1.1, v1.2 or “Full Thrust” (and sometimes referred to as Block 3), and Block 4.
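Point (h) above, that the cost per kilogram rises when a payload weighs less than the rocket's maximum, can be illustrated with a small sketch.  The $62 million launch price is the Falcon 9 list price cited in (b).  The 20,000 kg maximum payload is a round, hypothetical number chosen only to keep the arithmetic simple, and the sketch assumes (as a simplification) that the launch price does not fall at all for a lighter payload.

```python
def cost_per_kg(launch_price, payload_kg):
    """Launch price divided by the payload actually carried, in $ per kg."""
    if payload_kg <= 0:
        raise ValueError("payload_kg must be positive")
    return launch_price / payload_kg


LIST_PRICE = 62_000_000    # Falcon 9 list price cited in the text, in dollars
MAX_PAYLOAD_KG = 20_000    # hypothetical maximum payload, for illustration only

full_load = cost_per_kg(LIST_PRICE, MAX_PAYLOAD_KG)
half_load = cost_per_kg(LIST_PRICE, MAX_PAYLOAD_KG / 2)

print(f"At the maximum payload: ${full_load:,.0f} per kg")  # $3,100 per kg
print(f"At half the maximum:    ${half_load:,.0f} per kg")  # $6,200 per kg
```

Under these assumptions, halving the payload doubles the cost per kilogram, which is why selling the unused capacity to operators of small satellites is attractive to both sides.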

 

There are therefore a number of reasons why one needs to be cautious in judging reported cost differences between various rockets as well as spacecraft.  As noted in the text, cost differences of 10 or 20% certainly, and indeed even 40 or 50%, should not be seen as necessarily significant.  But as the charts show, the cost differences are far higher than this, with the costs of the traditional contractors following the traditional NASA procurement processes many times the costs obtained under the more competitive process the Obama administration introduced to manned space flight (at substantial political cost).

The Ridership Forecasts for the Baltimore-Washington SCMAGLEV Are Far Too High

The United States desperately needs better public transit.  While the lockdowns made necessary by the spread of the virus that causes Covid-19 led to sharp declines in transit use in 2020, with (so far) only a partial recovery, there will remain a need for transit to provide decent basic service in our metropolitan regions.  Lower-income workers are especially dependent on public transit, and many of them are, as we now see, the “essential workers” that society needs to function.  The Washington-Baltimore region is no exception.

Yet rather than focus on the basic nuts and bolts of ensuring quality services on our subways, buses, and trains, the State of Maryland is once again enamored with using the scarce resources available for public transit to build rail lines through our public parkland in order to serve a small elite.  The Purple Line light rail line was such a case.  Its dual rail lines will serve a narrow 16-mile corridor, passing through some of the richest zip codes in the nation, but destroying precious urban parkland.  As was discussed in an earlier post on this blog, with what will be spent on the Purple Line one could instead stop charging fares on the county-run bus services in the entirety of the two counties the Purple Line will pass through (Montgomery and Prince George’s), and at the same time double those bus services (i.e. double the lines, or double the service frequency, or some combination).

The administration of Governor Hogan of Maryland nonetheless pushed the Purple Line through, although construction has now been halted for close to a year due to cost overruns leading the primary construction contractor to withdraw.  Hogan’s administration is now promoting the building of a superconducting, magnetically-levitating, train (SCMAGLEV) between downtown Baltimore and downtown Washington, DC, with a stop at BWI Airport.  Over $35 million has already been spent, with a massive Draft Environmental Impact Statement (DEIS) produced.  As required by federal law, the DEIS has been made available for public comment, with comments due by May 24.

It is inevitable that such a project will lead to major, and permanent, environmental damage.  The SCMAGLEV would travel partially in tunnels underground, but also on elevated pylons parallel to the Baltimore-Washington Parkway (administered by the National Park Service).  The photos at the top of this post show what it would look like at one section of the parkway.  The question that needs to be addressed is whether any benefits will outweigh the costs (both environmental and other costs), and ridership is central to this.  If ridership is likely to be well less than that forecast, the whole case for the project collapses.  It will not cover its operating and maintenance costs, much less pay back even a portion of what will be spent to build it (up to $17 billion according to the DEIS, but likely to be far more based on experience with similar projects).  Nor would the purported economic benefits then follow.

I have copied below comments I submitted on the DEIS forecasts.  Readers may find them of interest as this project illustrates once again that despite millions of dollars being spent, the consulting firms producing such analyses can get some very basic things wrong.  The issue I focus on for the proposed SCMAGLEV is the ridership forecasts.  The SCMAGLEV project sponsors forecast that the SCMAGLEV will carry 24.9 million riders (one-way trips) in 2045.  The SCMAGLEV will require just 15 minutes to travel between downtown Baltimore and downtown Washington (with a stop at BWI), and is expected to charge a fare of $120 (roundtrip) on average and up to $160 at peak hours.  As one can already see from the fares, at best it would serve a narrow elite.

But there is already a high-speed train providing premier-level service between Baltimore and Washington – the Acela service of Amtrak.  It takes somewhat longer – 30 minutes currently – but its fare is also somewhat lower at $104 for a roundtrip, plus it operates from more convenient stations in Baltimore and Washington.  Importantly, it operates now, and we thus have a sound basis for forecasts of what its ridership might be in the future.

One can thus compare the forecast ridership on the proposed SCMAGLEV to the forecast for Acela ridership (also in the DEIS) in a scenario of no SCMAGLEV.  One would expect the forecasts to be broadly comparable.  One could allow that perhaps it might be somewhat higher on the SCMAGLEV, but probably less than twice as high and certainly less than three times as high.  But one can calculate from figures in the DEIS that the forecast SCMAGLEV ridership in 2045 would be 133 times higher than what they forecast Acela ridership would be in that year (in a scenario of no SCMAGLEV).  For those going just between downtown Baltimore and downtown Washington (i.e. excluding BWI travelers), the forecast SCMAGLEV ridership would be 154 times higher than what it would be on the comparable Acela.  This is absurd.
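To see the scale of the gap, one can back out what the stated figures imply.  The sketch below assumes that the 133-times ratio applies to the 24.9 million total forecast for 2045.  The implied Acela figure is therefore an approximate back-calculation from rounded numbers, not a figure quoted from the DEIS, and the names are my own.

```python
# Figures cited in the text.
scmaglev_riders_2045 = 24.9e6  # forecast one-way trips on the SCMAGLEV in 2045
ratio_to_acela = 133           # SCMAGLEV forecast as a multiple of the Acela forecast

# Back-calculated and approximate: the Acela forecast implied by the two figures above.
implied_acela_riders = scmaglev_riders_2045 / ratio_to_acela


def plausible_ceiling(acela_riders, multiple):
    """SCMAGLEV ridership if it drew `multiple` times the Acela ridership."""
    return acela_riders * multiple


print(f"Implied Acela forecast: about {implied_acela_riders:,.0f} riders")
for multiple in (2, 3):
    ceiling = plausible_ceiling(implied_acela_riders, multiple)
    share = ceiling / scmaglev_riders_2045
    print(f"At {multiple}x the Acela: about {ceiling:,.0f} riders "
          f"({share:.1%} of the SCMAGLEV forecast)")
```

Even at three times the implied Acela ridership, the SCMAGLEV would carry only a few percent of the 24.9 million riders forecast for it.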

And it gets worse.  For reasons that are not clear, the base year figures for Acela ridership in the Baltimore-Washington market are more than eight times higher in the DEIS than figures that Amtrak itself has produced.  It is possible that the SCMAGLEV analysts included Acela riders who have boarded north of Baltimore (such as in Philadelphia or New York) and then traveled through to DC (or from DC would pass through Baltimore to ultimate destinations further north).  But such travelers should not be included, as the relevant travelers who might take the SCMAGLEV would only be those whose trips begin in either Baltimore or in Washington and end in the other metropolitan area.  The project sponsors have made no secret that they hope eventually to build a SCMAGLEV line the full distance between Washington and New York, but that would at a minimum be in the distant future.  It is not a source of riders included in their forecasts for a Baltimore to Washington SCMAGLEV.

The Amtrak forecasts of what it expects its Acela ridership would be, by market (including between Baltimore and Washington) and under various investment scenarios, come from its recent NEC FUTURE (for Northeast Corridor Future) study, for which it produced a Final Environmental Impact Statement.  Using Amtrak’s forecasts of what its Acela ridership would be in a scenario where major investments allowed the Acela to take just 20 minutes to go between Baltimore and Washington, the SCMAGLEV ridership forecasts were 727 times as high (in 2040).  That is complete nonsense.

My comment submitted on the DEIS, copied below, goes further into these results and discusses as well how the SCMAGLEV sponsors could have gotten their forecasts so absurdly wrong.  But the lesson here is that the consultants producing such forecasts are paid by project sponsors who wish to see the project built.  Thus they have little interest in even asking the question of why they have come up with an estimate that 24.9 million would take a SCMAGLEV in 2045 (requiring 15 minutes on the train itself to go between Baltimore and DC) while ridership on the Acela in that year (in a scenario where the Acela would require 5 minutes more, i.e. 20 minutes, and there is no SCMAGLEV) would be only about 34,000.

One saw similar issues with the Purple Line.  An examination of the ridership forecasts made for it found that in about half of the transit analysis zone pairs, the predicted ridership on all forms of public transit (buses, trains, and the Purple Line as well) was less than what they forecast it would be on the Purple Line only.  This is mathematically impossible.  And the fact that half were higher and half were lower suggests that the results they obtained were basically just random.  They also forecast that close to 20,000 would travel by the Purple Line into Bethesda each day but only about 10,000 would leave (which would lead to Bethesda’s population exploding, if true).  The source of this error was clear (they mixed up two formats for the trips – what is called the production/attraction format with origin/destination), but it mattered.  They concluded that the Purple Line had to be a rail line rather than a bus service in order to handle their predicted 20,000 riders each day on the segment to Bethesda.

It may not be surprising that private promoters of such projects would overlook such issues.  They may stand to gain (i.e. from the construction contracts, or from an increase in land values next to station sites), even though society as a whole loses.  Someone else (government) is paying.  But public officials in agencies such as the Maryland Department of Transportation should be looking at what is the best way to ensure quality and affordable transit services for the general public.  Problems develop once the officials see their role as promoters of some specific project.  They then seek to come up with a rationale to justify the project, and see their role as surmounting all the hurdles encountered along the way.  They are not asking whether this is the best use of scarce public resources to address our very real transit needs.

A high-speed magnetically-levitating train (with superconducting magnets, no less) may look attractive.  But officials should not assume such a shiny new toy will address our transit issues.

—————————————————————————————————

May 22, 2021

Comment Submitted on the DEIS for SCMAGLEV

The Ridership Forecasts Are Far Too High

A.  Introduction

I am opposed to the construction of the proposed SCMAGLEV project between Baltimore and Washington, DC.  A key issue for any such system is whether ridership will be high enough to compensate for the environmental damage that is inevitable with such a project.  But the ridership forecasts presented in the DEIS are hugely flawed.  They are far too high and simply do not meet basic conditions of plausibility.  At more plausible ridership levels, the case for such a project collapses.  It will not cover its operating costs, much less pay back any of the investment (of up to $17 billion according to the DEIS, but based on experience likely to be far higher).  Nor will the purported positive economic benefits then follow.  But the damage to the environment will be permanent.

Specifically, there is rail service now between Baltimore and Washington, at three levels of service (the high-speed Acela service of Amtrak, the regular Amtrak Regional service, and MARC).  Ridership on the Acela service, as it is now and with what is expected with upgrades in future years, provides a benchmark that can be used.  While it could be argued that ridership on the proposed SCMAGLEV would be higher than ridership on the Acela trains, the question is how much higher.  I will discuss below in more detail the factors to take into account in making such a comparison, but briefly, the Acela service takes 30 minutes today to go between Baltimore and Washington, while the SCMAGLEV would take 15 minutes.  But given that it also takes time to get to the station and on the train, and then to the ultimate destination at the other end, the time savings would be well less than 50%.  The fare would also be higher on the SCMAGLEV (at an average, according to the DEIS, of $120 for a round-trip ticket but up to $160 at peak hours, versus an average of $104 on the Acela).  In addition, the stations the SCMAGLEV would use for travel between downtown Baltimore and downtown Washington are less conveniently located (with poorer connections to local transit) than the Acela uses.

Thus while it could be argued that the SCMAGLEV would attract more riders than the Acela, even this is not clear.  But being generous, one could allow that it might attract somewhat more riders.  The question is how many.  And this is where it becomes completely implausible.  Based on the ridership forecasts in the DEIS, for both the SCMAGLEV and for the Acela (in a scenario where the SCMAGLEV is not built), the SCMAGLEV in 2045 would carry 133 times what ridership would be on the Acela.  Excluding the BWI ridership on both, it would be 154 times higher.  There is no way to describe this other than that it is just nonsense.  And with other, likely more accurate, forecasts of what Acela ridership would be in the future (discussed below) the ratios become higher still.

Similarly, if the SCMAGLEV will be as attractive to MARC riders as the project sponsors forecast it will be, then most of those MARC riders would now be on the modestly less attractive Acela.  But they aren’t.  The Acela is 30 minutes faster than MARC (the SCMAGLEV would be 45 minutes faster), yet 28 times as many riders choose MARC over Acela between Baltimore and Washington.  I suspect the fare difference ($16 per day on MARC, vs. $104 on the Acela) plays an important role.  The model could have been tested by using it to forecast what Acela ridership would be under current conditions, and then comparing that forecast to the actual figures.  Evidently this was not done.  Had it been, the predicted Acela ridership would likely have been a high multiple of the actual and it would have been clear that their modeling framework has problems.

Why are the forecasts off by orders of magnitude?  Unfortunately, given what has been made available in the DEIS and with the accompanying papers on ridership, one cannot say for sure.  But from what has been made available, there are indications of where the modeling approach taken had issues.  I will discuss these below.

In the rest of this comment I will first discuss the use of Acela service and its ridership (both the actual now and as projected) as a basis for comparison to the ridership forecasts made for the SCMAGLEV.  They would be basically similar services, where a modest time saving on the SCMAGLEV (15 minutes now, but only 5 minutes in the future if further investments are made in the Acela service that would cut its Baltimore to DC time to just 20 minutes) is offset by a higher fare and less convenient station locations.  I will then discuss some reasons that might explain why the SCMAGLEV ridership forecasts are so hugely out-of-line with what plausible numbers might be.

B.  A Comparison of SCMAGLEV Ridership Forecasts to Those for Acela  

The DEIS provides ridership forecasts for the SCMAGLEV for both 2030 (several years after the DEIS says it would be opened, so ridership would then be stable after an initial ramping up) and for a horizon year of 2045.  I will focus here on the 2045 forecasts, and specifically on the alternative where the destination station in Baltimore is Camden Yards.  The DEIS also has forecasts for ridership in an alternative where the SCMAGLEV line would end in the less convenient Cherry Hill neighborhood of Baltimore, which is significantly further from downtown and with poorer connections to local transit options.  The Camden Yards station is more comparable to Penn Station – Baltimore, which the Acela (and Amtrak Regional trains and one of the MARC lines) use.  Penn Station – Baltimore has better local transit connections and would be more convenient for many potential riders, but this will of course depend on the particular circumstances of the rider – where he or she will be starting from and where their particular destination will be.  It will, in particular, be more convenient for riders coming from North and Northeast of Baltimore than Camden Yards would be.  And those from South and Southwest of Baltimore would be more likely to drive directly to the DC region than try to reach Camden Yards, or they would board at BWI.

The DEIS also provides forecasts of what ridership would be on the existing train services between Baltimore and Washington:  the Acela services (operated by Amtrak), the regular Amtrak Regional trains, and the MARC commuter service operated by the State of Maryland.  Note also that the 2045 forecasts for the train services are for both a scenario where the SCMAGLEV is not built and then what they forecast the reduced ridership would be with a SCMAGLEV option.  For the purposes here, what is of interest is the scenario with no SCMAGLEV.

The SCMAGLEV would provide a premium service, requiring 15 minutes to go between downtown Baltimore and downtown Washington, DC.  Acela also provides a premium service and currently takes 30 minutes, while the regular Amtrak Regional trains take 40 to 45 minutes and MARC service takes 60 minutes.  But the fares differ substantially.  Using the DEIS figures (with all prices and fares expressed in base year 2018 dollars), the SCMAGLEV would charge an average fare of $120 for a round trip (Baltimore-Washington), and up to $160 for a round trip at peak times.  The Acela also charges a high fare for its premium service, although not as high as the SCMAGLEV, with an average of $104 for a round trip (using the DEIS figures).  But Amtrak Regional trains charge only $34 for a similar round trip, and MARC only $16.

Acela service thus provides a reasonable basis for comparison to what SCMAGLEV would provide, with the great advantage that we know now what Acela ridership has actually been.  This provides a firm base for a forecast of what Acela ridership would be in a future year in a scenario where the SCMAGLEV is not built.  And while the ridership on the two would not be exactly the same, one should expect them to be in the same ballpark.

But they are far from that:

  DEIS Forecasts of SCMAGLEV vs. Acela Ridership, Annual Trips in 2045

Route | SCMAGLEV Trips | Acela Trips | Ratio
Baltimore – DC only | 19,277,578 | 125,226 | 154 times as much
All, including BWI | 24,938,652 | 187,887 | 133 times as much

Sources:  DEIS, Main Report Table 4.2-3; and Table D.4-48 of Appendix D.4 of the DEIS

Using estimates just from the DEIS, the project sponsor is forecasting that annual (one-way) trips on the SCMAGLEV in 2045 would be 133 times what they would be in that year on the Acela (in a scenario where the SCMAGLEV is not built).  And it would be 154 times as much for the Baltimore – Washington riders only.  This is nonsense.  One could have a reasonable debate if the SCMAGLEV figures were twice as high, and maybe even if they were three times as high.  But it is absurd that they would be 133 or 154 times as high.
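The ratios in the table follow directly from the trip counts quoted from the DEIS.  A minimal Python sketch of that arithmetic is below; the variable and function names are my own, chosen only for illustration, and the figures are those shown in the table above.

```python
# Annual one-way trips forecast for 2045, as quoted above from the DEIS
# (SCMAGLEV with a Camden Yards terminus; Acela in the no-SCMAGLEV scenario).
forecasts_2045 = {
    "Baltimore - DC only": {"scmaglev": 19_277_578, "acela": 125_226},
    "All, including BWI": {"scmaglev": 24_938_652, "acela": 187_887},
}


def ridership_ratio(scmaglev_trips, acela_trips):
    """Return how many times larger the SCMAGLEV forecast is than the Acela forecast."""
    return scmaglev_trips / acela_trips


ratios_2045 = {
    route: ridership_ratio(trips["scmaglev"], trips["acela"])
    for route, trips in forecasts_2045.items()
}

for route, ratio in ratios_2045.items():
    print(f"{route}: {ratio:.0f} times as much")
```

A ratio of two or three would correspond to the range where a reasonable debate could be had; the computed values are far beyond that.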

And it gets worse.  The figures above are all taken from the DEIS.  But the base year Acela ridership figures in the DEIS (Appendix D.4, Table D.4-45) differ substantially from figures Amtrak itself has produced in its recent NEC FUTURE study.  This review of future investment options in Northeast Corridor (Washington to Boston) Amtrak service was concluded in July 2017.  As part of this it provided forecasts of what future Acela ridership would be under various alternatives, including one (its Alternative 3) where Acela trains would be substantially upgraded and require just 20 minutes for the trip between downtown Baltimore and downtown Washington, DC.  This would be quite similar to what SCMAGLEV service would be.

But for reasons that are not clear, the base year figures for Acela ridership between Baltimore and Washington differ substantially between what the SCMAGLEV DEIS has and what NEC FUTURE has.  The figure in the NEC FUTURE study (for a base year of 2013) puts the number of riders (one-way) between Baltimore and Washington (and not counting those who boarded north of Baltimore, at Philadelphia or New York for example, and then rode through to Washington, and similarly for those going from Washington to Baltimore) at just 17,595.  The DEIS for the SCMAGLEV puts the similar Acela ridership (for a base year of 2017) at 147,831 (calculated from Table D.4-45, of Appendix D.4).  While the base years differ (2013 vs. 2017), the disparity cannot be explained by that.  It is far too large.  My guess would be that the DEIS counted all Acela travelers taking up seats between Baltimore and Washington, including those who boarded north of Baltimore (or whose destination from Washington was north of Baltimore), and not just those travelers traveling solely between Washington and Baltimore.  But the SCMAGLEV will be serving only the Baltimore-Washington market, with no interconnections with the train routes coming from north of Baltimore.

What was the source of the Acela ridership figure in the DEIS of 147,831 in 2017?  That is not clear.  Table D.4-45 of Appendix D.4 says that its source is Table 3-10 of the “SCMAGLEV Final Ridership Report”, dated November 8, 2018.  But that report, which is available along with the other DEIS reports (with a direct link at https://bwmaglev.info/index.php/component/jdownloads/?task=download.send&id=71&catid=6&m=0&Itemid=101), does not have a Table 3-10.  Significant portions of that report were redacted, but in its Table of Contents no reference is shown to a Table 3-10 (even though other redacted tables, such as Tables 5-2 and 6-3, are still referenced in the Table of Contents, but labeled as redacted).

One can only speculate on why there is no Table 3-10 in the Final Ridership Report.  Perhaps it was deleted when someone discovered that the figures reported there, which were then later used as part of the database for the ridership forecast models, were grossly out of line with the Amtrak figures.  The Amtrak figure for Acela ridership for Baltimore-Washington passengers of 17,595 (in 2013) is less than one-eighth of the figure on Acela ridership shown in the DEIS of 147,831 (in 2017).

It can be difficult for an outsider to know how many of those riding on the Acela between Washington and Baltimore are passengers going just between those two cities (as well as BWI).  Most of the passengers riding on that segment will be going on to (or coming from) cities further north.  One would need access to ticket sales data.  But it is reasonable to assume that Amtrak itself would know this, and therefore that the figures in the NEC FUTURE study would likely be accurate.  Furthermore, in the forecast horizon years, where Amtrak is trying to show what Acela (and other rail) ridership would grow to with alternative investment programs, it is reasonable to assume that Amtrak would provide relatively optimistic (i.e. higher) estimates, as higher estimates are more likely to convince Congress to provide the funding that would be required for such investments.

The Amtrak figures would in any case provide a suitable comparison to what SCMAGLEV’s future ridership might be.  The Amtrak forecasts are for 2040, so for the SCMAGLEV forecasts I interpolated to produce an estimate for 2040 assuming a constant rate of growth between the forecast SCMAGLEV ridership in 2030 and that for 2045.  Both the NEC FUTURE and SCMAGLEV figures include the stop at BWI.
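The interpolation described above can be sketched as follows.  This is only an illustration of a constant-growth (geometric) interpolation between two forecast years; the function name and its arguments are my own.  The 2030 forecast is not quoted in this comment, so the sketch does not reproduce the 2040 figure itself.  It only shows the formula and the annual growth rate implied by the 2040 and 2045 figures that are quoted here.

```python
def interpolate_constant_growth(value_start, year_start, value_end, year_end, year):
    """Interpolate between two forecasts assuming a constant annual growth rate."""
    annual_growth_factor = (value_end / value_start) ** (1 / (year_end - year_start))
    return value_start * annual_growth_factor ** (year - year_start)


# Figures quoted in this comment (Camden Yards alternative, including BWI):
trips_2040 = 22_761_428  # interpolated value used in the comparison with NEC FUTURE
trips_2045 = 24_938_652  # DEIS forecast for the horizon year

# Annual growth rate implied by those two figures over the five years 2040-2045.
implied_annual_growth = (trips_2045 / trips_2040) ** (1 / 5) - 1
print(f"Implied annual growth, 2040 to 2045: {implied_annual_growth:.2%}")
```

Given the DEIS forecast for 2030, calling `interpolate_constant_growth(trips_2030, 2030, trips_2045, 2045, 2040)` would yield the 2040 estimate, where `trips_2030` stands for the DEIS figure in Table 4.2-3.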

    Forecasts of SCMAGLEV (DEIS) vs. Acela (NEC FUTURE) Ridership between Baltimore and Washington, Annual Trips in 2040

Alternative | SCMAGLEV Trips | Acela Trips | Ratio
No Action | 22,761,428 | 26,177 | 870 times as much
Alternative 1 | 22,761,428 | 26,779 | 850 times as much
Alternative 2 | 22,761,428 | 29,170 | 780 times as much
Alternative 3 | 22,761,428 | 31,291 | 727 times as much

Sources:  SCMAGLEV trips interpolated from figures on forecast ridership in 2030 and 2045 (Camden Yards) in Table 4.2-3 of DEIS.  Acela trips from NEC FUTURE Final EIS, Volume 2, Appendix B.08.

The Acela ridership figures are those estimated under various investment scenarios in the rail service in the Northeast Corridor.  NEC FUTURE examined a “No Action” scenario with just minimal investments, and then various alternative investment levels to produce increasingly capable services.  Alternative 3 (of which there were four sub-variants, but all addressing alternative investments between New York and Boston and thus not affecting directly the Washington-Baltimore route) would upgrade Acela service to the extent that it would go between Baltimore and Washington in just 20 minutes.  This would be very close to the 15 minutes for the SCMAGLEV.  Yet even with such a comparable service, the SCMAGLEV DEIS is forecasting that its service would carry 727 times as many riders as what Amtrak has forecast for its Acela service (in a scenario where there is no SCMAGLEV).  This is complete nonsense.
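To put the size of these gaps in perspective, the ratios can also be expressed in orders of magnitude (the base-10 logarithm of the ratio).  The short Python sketch below uses only the figures from the table above; the names are my own and purely illustrative.

```python
import math

# SCMAGLEV trips in 2040, interpolated from the DEIS forecasts for 2030 and 2045.
scmaglev_2040 = 22_761_428

# Acela Baltimore-Washington trips in 2040 under each NEC FUTURE alternative.
acela_2040 = {
    "No Action": 26_177,
    "Alternative 1": 26_779,
    "Alternative 2": 29_170,
    "Alternative 3": 31_291,
}

ratios_2040 = {alt: scmaglev_2040 / trips for alt, trips in acela_2040.items()}

for alt, ratio in ratios_2040.items():
    # log10 of the ratio is the gap expressed in orders of magnitude.
    print(f"{alt}: {ratio:,.0f} times as much "
          f"({math.log10(ratio):.2f} orders of magnitude)")
```

The table rounds the first three ratios to the nearest ten; the unrounded values are consistent with them.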

To be clear, I would stress again that the forecast future Acela ridership figures are for scenarios under various possible investment programs by Amtrak.  The investment program in Alternative 3 would upgrade Acela service to a degree where the Baltimore – Washington trip (with a stop at BWI) would take just 20 minutes.  The NEC FUTURE study forecasts that in such a scenario the Baltimore-Washington ridership on Acela would total a bit over 31,000 trips in the year 2040.  In contrast, the DEIS for the SCMAGLEV forecasts that there would in that year be close to 23 million trips taken on the similar SCMAGLEV service, requiring 15 minutes to make such a trip.  Such a disparity makes no sense.

C.  How Could the Forecasts be so Wrong?

A well-known consulting firm, Louis Berger, prepared the ridership forecasts, and their “Final Ridership Report” dated November 8, 2018, referenced above, provides an overview on the approach they took.  Unfortunately, while I appreciate that the project sponsor provided a link to this report along with the rest of the DEIS (I had asked for this, having seen references to it in the DEIS), the report that was posted had significant sections redacted.  Due to those redactions, and possibly also limitations in what the full report itself might have included (such as summaries of the underlying data), it is impossible to say for sure why the forecasts of SCMAGLEV ridership were close to three orders of magnitude greater than what ridership has been and is expected to be on comparable Acela service.

Thus I can only speculate.  But there are several indications of what may have led the SCMAGLEV estimates to be so out of line with ridership on a service that is at least broadly comparable.  Specifically:

1)  As noted above, there were apparent problems in assembling existing data on rail ridership for the Baltimore-Washington market, in particular for the Acela.  The ridership numbers for the Acela in the DEIS were more than eight times higher in their base year (2017) than what Amtrak had in an only slightly earlier base year (2013).  The ridership numbers on Amtrak Regional trains (for Baltimore-Washington riders) were closer but still substantially different:  409,671 in Table D.4-45 of the DEIS (for 2017), vs. 172,151 in NEC FUTURE (for 2013).

Table D.4-45 states that its source for this data on rail ridership is a Table 3-10 in the Final Ridership Report of November 8, 2018.  But as noted previously, such a table is not there – it was either never there or it was redacted.  Thus it is impossible to determine why their figures differ so much from those of Amtrak.  But the differences for the Acela figures (more than a factor of eight) are huge, i.e. close to an order of magnitude by itself.  While it is impossible to say for sure, my guess (as noted above) is that the Acela ridership numbers in the DEIS included travelers whose trip began, or would end, in destinations north of Baltimore, who then traveled through Baltimore on their way to, or from, Washington, DC.  But such travelers are not part of the market the SCMAGLEV would serve.

2)  In modeling the choice those traveling between Baltimore and Washington would have between SCMAGLEV and alternatives, the analysts collapsed all the train options (Acela, Amtrak Regional, and MARC) into one.  See page 61 of the Ridership Report.  They create a weighted average for a single “train” alternative, and they note that since (in their figures) MARC ridership makes up almost 90% of the rail market, the weighted averages for travel time and the fare will be essentially that of MARC.

Thus they never looked at Acela as an alternative, with a service level not far from that of SCMAGLEV.  Nor do they even consider the question of why so many MARC riders (67.5% of MARC riders in 2045 if the Camden Yards option is chosen – see page D-56 of Appendix D.4 of the DEIS) are forecast to divert to the SCMAGLEV, but are not doing so now (nor in the future) to Acela.  According to Table D.4-45 of Appendix D.4 of the DEIS, in their data for their 2017 base year, there are 28 times as many MARC riders as Acela riders between downtown Baltimore and downtown Washington, and 20 times as many with those going to and from the BWI stop included.  Evidently, MARC riders do not find the Acela option attractive.  Why should they then find the SCMAGLEV train attractive?

3)  The answer as to why MARC riders have not chosen to ride on the Acela almost certainly has something to do with the difference in the fares.  A round-trip on MARC costs $16 a day.  A round trip on Acela costs, according to the DEIS, an average of $104 a day.  That is not a small difference.  For someone commuting 5 days a week and 50 weeks a year (or 250 days a year), the annual cost on MARC would be $4,000 but $26,000 a year on the Acela.  And it would be an even higher $30,000 a year on the SCMAGLEV (based on an average fare of $120 for a round trip), and $40,000 a year ($160 a day) at peak hours (which would cover the times commuters would normally use).  Even for those moderately well off, $40,000 a year for commuting would be a significant expense, and not an attractive alternative to MARC with its cost of just one-tenth of this.
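The annual commuting costs cited above are simple multiplication, sketched here in Python.  The fares are the round-trip figures quoted from the DEIS; the 250 commuting days a year is the assumption stated in the text, and the names are mine.

```python
COMMUTING_DAYS_PER_YEAR = 5 * 50  # 5 days a week, 50 weeks a year

# Round-trip fares in 2018 dollars, as quoted from the DEIS.
round_trip_fares = {
    "MARC": 16,
    "Acela": 104,
    "SCMAGLEV (average)": 120,
    "SCMAGLEV (peak)": 160,
}

annual_costs = {
    service: fare * COMMUTING_DAYS_PER_YEAR
    for service, fare in round_trip_fares.items()
}

for service, cost in annual_costs.items():
    print(f"{service}: ${cost:,} a year")

# MARC's annual cost as a share of the peak-hour SCMAGLEV cost.
marc_share_of_peak = annual_costs["MARC"] / annual_costs["SCMAGLEV (peak)"]
```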

If such costs were properly taken into account in the forecasting model, why did it nonetheless predict that most MARC riders would switch to the SCMAGLEV?  This is not fully clear as the model details were not presented in the redacted report, but note that the modelers assigned high dollar amounts for the value of time ($31.00 to $46.50 per hour for commuters and other non-business travel, and $50.60 to $75.80 per hour for business travel – see page 53 of the Ridership Report).  However, even at such high values, the numbers do not appear to be consistent.  Taking the SCMAGLEV (a 15 minute trip) rather than MARC (60 minutes) would save 45 minutes each way or 1 1/2 hours a day.  Only at the very high end value of time for business travelers (of $75.80 per hour, or $113.70 for 1 1/2 hours) would this value of time offset the fare difference of $104 (using the average SCMAGLEV fare of $120 minus the MARC fare of $16).  And even that would not suffice for travelers at peak hours (with its SCMAGLEV fare of $160).
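The comparison above can be restated as a break-even calculation: the hourly value of time at which the time saved would just offset the extra fare.  The Python sketch below uses only figures quoted in this comment and, for the sake of argument, accepts the modelers' premise that every minute saved is worth the full hourly value.  The names are my own.

```python
MARC_ROUND_TRIP_FARE = 16                 # dollars, per the DEIS
HOURS_SAVED_PER_DAY = 2 * (60 - 15) / 60  # 45 minutes each way = 1.5 hours


def break_even_value_of_time(scmaglev_round_trip_fare):
    """Hourly value of time at which the time saved just offsets the extra fare."""
    extra_fare = scmaglev_round_trip_fare - MARC_ROUND_TRIP_FARE
    return extra_fare / HOURS_SAVED_PER_DAY


break_even_average = break_even_value_of_time(120)  # average SCMAGLEV fare
break_even_peak = break_even_value_of_time(160)     # peak-hour SCMAGLEV fare

# Hourly value-of-time ranges cited from page 53 of the Ridership Report.
value_of_time_ranges = {
    "commuters and other non-business": (31.00, 46.50),
    "business": (50.60, 75.80),
}

print(f"Break-even at average fare: ${break_even_average:.2f} per hour")
print(f"Break-even at peak fare: ${break_even_peak:.2f} per hour")
for traveler, (low, high) in value_of_time_ranges.items():
    print(f"{traveler}: top of range ${high:.2f}; "
          f"covers average fare: {high >= break_even_average}; "
          f"covers peak fare: {high >= break_even_peak}")
```

On these figures, only the top of the business range exceeds the break-even value at the average fare, and no value in either range reaches it at the peak fare.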

But there is also a more basic problem.  It is wrong to assume that travelers on MARC treat their 60 minutes on the train as all wasted time.  They can read, do some work, check their emails, get some sleep, or plan their day.  The presumption that they would pay amounts similar to what some might on average earn in an hour based on their annual salaries is simply incorrect.  And as noted above, if it were correct, then one would see many more riders on the Acela than one does (and similarly riders on the Amtrak Regional trains, that require about 40 minutes for the Washington to Baltimore trip, with an average fare of $34 for a round trip).

There is a similar issue for those who drive.  Those who drive do not place a value on the time spent in their cars equal to what they would earn in an hourly equivalent of their regular salary.  They may well want to avoid traffic jams, which are stressful and frustrating for other reasons, but numerous studies have found that a simple value-of-time calculation based on annual salaries does not explain why so many commuters choose to drive.

4)  Data for the forecasting model also came in part from two personal surveys.  One was an in-person survey of travelers encountered on MARC, at either the MARC BWI Station or onboard Penn Line trains, or at BWI airport.  The other was an online survey, for which the description of how possible respondents were chosen was unfortunately redacted.

But such surveys are unreliable, with answers that depend critically on how the questions are phrased.  The Final Ridership Report does not include the questionnaire itself (most such reports would), so one cannot know what bias there might have been in how the questions were worded.  As an example (and admittedly an exaggerated example, to make the point), were the MARC riders simply asked whether they would prefer a much faster, 15 minute, trip?  Or were they asked whether they would pay an extra $104 per day ($144 at peak hours) to ride a service that would save them 45 minutes each way on the train?

But even such willingness-to-pay questions are notoriously unreliable.  An appropriate follow-up question to a MARC rider saying they would be willing to pay up to an extra $144 a day to ride a SCMAGLEV would be to ask why they are evidently not now riding the Acela (at an extra $88 a day) for a ride just 15 minutes longer than what it would be on the SCMAGLEV.

One therefore has to be careful in interpreting and using the results from such a survey in forecasting how travelers would behave.  If current choices (e.g. using the MARC rather than the Acela) do not reflect the responses provided, one should be concerned.

5)  Finally, the particular mathematical form used to model the choices the future travelers would make can make a big difference to the findings.  The Final Ridership Report briefly explains (page 53) that it used a multinomial logit model as the basis for its modeling.  Logit functions assign a continuous probability (starting from 0 and rising to 100%) of some event occurring.  In this model, the event is that a traveler going from one travel zone to another will choose to travel via the SCMAGLEV, or not.  The likelihood of choosing to travel via the SCMAGLEV will be depicted as an S-shaped function, starting at zero and then smoothly rising (following the S-shape) until it reaches 100%, depending on, among other factors, what the travel time savings might be.

The results that such a model will predict will depend critically, of course, on the particular parameters chosen.  But the heavily redacted Final Ridership Report does not show what those parameters were nor how they were chosen or possibly estimated, nor even the complete set of variables used in that function.  The report says little (in what remains after the redactions) beyond that they used that functional form.

A feature of such logit models is that while the choices are discrete (one either will ride the SCMAGLEV or will not), they allow for “fuzziness” around the turning points.  This recognizes that individuals differ: even when they confront a similar combination of variables (cost, travel time, and other measured attributes), some will simply prefer to drive while some will prefer to take the train.  That is how people are.  But then, while a higher share might prefer to take a train (or the SCMAGLEV) when travel times fall (by close to 45 minutes with the SCMAGLEV when compared to their single “train” option that is 90% MARC, and by variable amounts for those who drive depending on the travel zone pairs), how much higher that share will be will depend on the parameters they selected for their logit.

With certain parameters, the responses can be sensitive to even small reductions in travel times, and the predicted resulting shifts then large.  But are those parameters reasonable?  As noted previously, a test would have been whether the model, with the parameters chosen, would have predicted accurately the number of riders actually observed on the Acela trains in the base year.  But it does not appear such a test was done.  At least, no results of any such validation test were reported.
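To illustrate how strongly the predicted share can depend on the parameters, here is a minimal Python sketch of a logit choice function.  It is not the model used in the Final Ridership Report, whose specification and parameters were redacted.  It is a simplified two-option (binary) logit rather than the multinomial form the report says it used, and the coefficient values are entirely hypothetical, chosen only to show the sensitivity.

```python
import math


def scmaglev_share(minutes_saved, extra_fare, beta_time, beta_cost):
    """Binary logit: probability of choosing the SCMAGLEV over one alternative.

    The utility difference is beta_time * minutes_saved - beta_cost * extra_fare.
    Both coefficients are hypothetical, not taken from the Ridership Report.
    """
    utility_difference = beta_time * minutes_saved - beta_cost * extra_fare
    return 1.0 / (1.0 + math.exp(-utility_difference))


# Daily figures from the text: 90 minutes saved versus MARC, $104 extra in fares.
MINUTES_SAVED_PER_DAY = 90
EXTRA_FARE_PER_DAY = 104

# Two hypothetical parameter sets differing only in the weight placed on time.
share_low_time_weight = scmaglev_share(
    MINUTES_SAVED_PER_DAY, EXTRA_FARE_PER_DAY, beta_time=0.02, beta_cost=0.03)
share_high_time_weight = scmaglev_share(
    MINUTES_SAVED_PER_DAY, EXTRA_FARE_PER_DAY, beta_time=0.06, beta_cost=0.03)

print(f"Share with low weight on time: {share_low_time_weight:.0%}")
print(f"Share with high weight on time: {share_high_time_weight:.0%}")
```

In a function of this form, the ratio of the time coefficient to the cost coefficient is the implied value of time (in dollars per minute), so the second parameter set implies a value of time three times that of the first.  The same travelers and the same fares then yield a minority share in one case and a large majority in the other, which is why validating the chosen parameters against observed Acela ridership matters.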

Thus there are a number of possible reasons why the forecast ridership on the SCMAGLEV differs so much from what one currently observes for ridership on the Acela, and from what one might reasonably expect Acela ridership to be in the future.  It is not possible to say whether these are indeed the reasons why the SCMAGLEV forecasts are so incredibly out of line with what one observes for the Acela.  There may be, and indeed likely are, other reasons as well.  But due to issues such as those outlined here, one can understand the possible factors behind SCMAGLEV ridership forecasts that deviate so markedly from plausibility.

D.  Conclusion

The ridership forecasts for the SCMAGLEV are vastly over-estimated.  Predicted ridership on the SCMAGLEV is a minimum of two, and up to three, orders of magnitude higher than what has been observed on, and can reasonably be forecast for, the Acela.  One should not be getting predicted ridership that is more than 100 times what one observes on a comparable, existing (and thus knowable), service.

With ridership on the proposed system far less than what the project sponsors have forecast, the case for building the SCMAGLEV collapses.  Operational and maintenance costs would not be covered, much less would there be any possibility of paying back a portion of the billions of dollars spent to build it, nor would the purported economic benefits follow.

However, the harm to the environment will have been done.  Even if the system is then shut down (due to the forecast ridership never materializing), it will not be possible to reverse much of that environmental damage.

The US very much needs to improve its public transit.  It is far too difficult, with resulting harm both to the economy and to the population, to move around in the Baltimore-Washington region.  But fixing this will require a focus on the basic nuts and bolts of operating, maintaining, and investing in the transit systems we have, including the trains and buses.  This might not look as attractive as a magnetically levitating train, but will be of benefit.  And it will be of benefit to the general public – in particular to those who rely on public transit – and not just to a narrow elite that can afford $120 fares.  Money for public transit is scarce.  It should not be wasted on shiny new toys.