What’s the single largest source of CO2 emissions in the Southeast? A 10 million ton data discrepancy!
What? Huh? Why is a data discrepancy a blog? (UPDATE: Please see responses to reader suggestions at the end, as well as in the comments.)
President Obama’s Clean Power Plan will eventually regulate the emission of carbon dioxide from the nation’s power plants – and there are some pretty big power plants out there. Just one of the Tennessee Valley Authority’s (TVA) coal units at its Cumberland coal plant, Unit 2, pumps out around 7 million tons of carbon dioxide each year! Georgia Power’s Scherer coal plant is the power sector’s largest carbon emitter in the entire country, with 4 coal units that emit an average of 5 million tons of carbon dioxide per unit. But dwarfing even these staggering emission amounts is the 10 million ton discrepancy between the carbon emission databases maintained by the US Energy Information Administration (EIA) and the US Environmental Protection Agency (EPA). This large discrepancy between two of our country’s major carbon emissions databases could represent thousands of megawatts of uncertainty in how much new, clean generation is needed to reduce the carbon footprint of the Southeastern power sector.
At SACE, we have been digging into the carbon emission data, and we keep coming across some pretty surprising problems with the raw data that experts are relying on to plan for implementation of the Clean Power Plan. There are a few details that make the data analysis a bit messy. EPA and EIA don’t always use the same numeric codes for power plants, and EIA uses three different data file sources to contain all of the fuel consumption data. But once we sorted through all these twists and turns, we thought the EPA and EIA data would match up pretty well. After all, the utilities report the data. The definitions appear to be the same. This is not even algebra; this is addition and multiplication. As you can see below, the math behind the data analysis is pretty simple:
1. Utilities report data to both the EIA and the EPA (fuel consumption to both, CO2 emissions only to the EPA).
2. Data collected by the EPA includes both CO2 emissions, in tons, and heat input (heat input = how much energy is in the fuel burned, measured in millions of British Thermal Units, or BTUs, the standard measurement for the energy content of fuels, represented as “mmBTU”).
3. Data collected by the EIA includes only fuel consumption, also reported as mmBTU.
4. Utilities do not report CO2 data to EIA, so to determine the equivalent CO2 emissions from the EIA data you must multiply the fuel consumption (mmBTU) by the emission rate for the fuel type. (This is actually what EPA tells the utilities to do in their reporting; see step 2 above.) We use EPA’s emission factors provided in the Clean Power Plan rulemaking.
5. Compare the total carbon emissions from EIA and EPA.
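For readers who want to see the arithmetic, here is a minimal Python sketch of steps 4 and 5. The emission factors are round illustrative numbers and the unit-level figures are made up; they are assumptions for the example, not the Clean Power Plan factors or the data used in this analysis.

```python
# Illustrative emission factors in lb CO2 per mmBTU (assumed round numbers,
# NOT the Clean Power Plan rulemaking values used in the analysis).
EMISSION_FACTOR_LB_PER_MMBTU = {"coal": 205.0, "gas": 117.0}
LB_PER_SHORT_TON = 2000.0


def co2_tons_from_heat_input(heat_input_mmbtu, fuel):
    """Step 4: estimate CO2 (short tons) from fuel consumption in mmBTU."""
    return heat_input_mmbtu * EMISSION_FACTOR_LB_PER_MMBTU[fuel] / LB_PER_SHORT_TON


def percent_difference(epa_tons, eia_tons):
    """Step 5: EPA-reported CO2 relative to the EIA-derived estimate, in percent."""
    return 100.0 * (epa_tons - eia_tons) / eia_tons


# Hypothetical coal unit: 50 million mmBTU reported to EIA,
# 5.25 million short tons of CO2 reported to EPA.
eia_tons = co2_tons_from_heat_input(50_000_000, "coal")
epa_tons = 5_250_000

print(f"EIA-derived CO2: {eia_tons:,.0f} short tons")
print(f"EPA vs. EIA: {percent_difference(epa_tons, eia_tons):+.1f}%")
```

Summing the per-unit results across every matched unit gives regional totals that can be compared year by year.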
Instead of matching, however, EIA and EPA’s data showed some large discrepancies, as represented below:
EPA Reports 2-3% Higher CO2 Emissions than EIA Data
|CO2 Emissions from Southeastern Power Plants (million short tons)|
|EPA: Air Markets Database||451||420||394||422|
|EIA: Form 923||441||410||384||412|
|Percent Difference||+2.2 %||+2.5 %||+2.9 %||+2.4 %|
(Note: These numbers represent power plants currently serving TVA, Southern Balancing Area, the Carolinas (excluding PJM), and Florida, if reporting to the EPA Air Markets Database.)
At this point in the blog, a reader might expect me to explain why there is a difference in the data. But after several weeks of research, and reaching out to both EIA and EPA, I don’t even have a plausible theory. We’ve gone back over the data to ensure it matches the original source (it does). We’ve looked at monthly data to see if there is a consistent pattern in the discrepancies for specific coal units (there isn’t). We’ve checked the underlying definitions of what is being reported in the data (it does appear that both EIA and EPA use the same definition for measuring the energy in fuel). There are actually two ways to measure energy in fuel, but EIA requests reporting in Higher Heating Value (HHV), as does EPA (this was the only good theory I’ve found … as EIA and EPA actually use different definitions of power generation, which affect those data). Although I reached out to EPA about the discrepancy, they brushed off the question, admitting they don’t cross-reference their data with EIA and therefore don’t have any ideas about why the two data sets don’t match. I reached out to EIA, but have yet to hear back. Finally, I reached out to numerous experts, both consultants and staff at major electric utilities and utility organizations. All of them agree this is both puzzling and significant.
CLARIFYING UPDATE: Also, the error is mostly (entirely?) attributable to differences in reported fuel consumption. So FROM HERE ON … all data are reported in fuel consumption (mmBTU) and NOT CO2 emissions …
What are the consequences of a 2-3% error in data reporting?
It can matter a lot under the Clean Power Plan. At a very simple level, a 2-3% error translates into an increase (or decrease) in the new clean energy investment needed. A 2-3% error is equivalent to about 5,000 MW of generation, which is a pretty big deal if you are modeling the future energy mix, or projecting the size of renewable energy markets.
The implications are different depending on whether a power plant is being regulated under the Clean Power Plan’s emission rate standard or the alternative mass emission limit.
Under the emission rate standard, a power plant owner starts with reported emissions, divides by generation, and thus calculates a rate in lbs/MWh. The plant owner then must either purchase additional emission rate credits (ERCs), or may sell excess, depending on whether the plant’s emission rate is above or below the standard. Using the EIA data, most (but not all) plants appear closer to achieving the standard than they do using the EPA data, by an average of roughly 2-3%. So if utilities begin reporting EIA data to EPA, it will look like a 2-3% cut in emissions without achieving any actual emission reductions.
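To make the rate calculation concrete, here is a small Python sketch. The plant, its generation, and both emission totals are hypothetical numbers chosen only to show how a gap between the two data sources carries straight through to the lbs/MWh rate.

```python
LB_PER_SHORT_TON = 2000.0


def emission_rate_lb_per_mwh(co2_short_tons, generation_mwh):
    """Emission rate: reported CO2 divided by generation, in lbs/MWh."""
    return co2_short_tons * LB_PER_SHORT_TON / generation_mwh


# Hypothetical plant: same generation, two different emission totals.
generation_mwh = 5_000_000
co2_tons_epa = 5_250_000  # assumed total based on EPA-reported data
co2_tons_eia = 5_125_000  # assumed total derived from EIA fuel data

rate_epa = emission_rate_lb_per_mwh(co2_tons_epa, generation_mwh)
rate_eia = emission_rate_lb_per_mwh(co2_tons_eia, generation_mwh)

# Apparent "cut" from switching data sources, with no physical change at the plant.
apparent_cut_pct = 100.0 * (rate_epa - rate_eia) / rate_epa

print(f"Rate using EPA data: {rate_epa:.0f} lbs/MWh")
print(f"Rate using EIA data: {rate_eia:.0f} lbs/MWh")
print(f"Apparent reduction: {apparent_cut_pct:.1f}%")
```

Because generation is the same in both cases, the percentage gap in the rate equals the percentage gap in reported emissions.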
There are two issues that arise if the plant is regulated using a mass-based standard, under which a power plant’s reported emissions must fall below a certain standard (tons per year). The first issue is the same as with the emission rate standard – the possibility that utilities could switch from EPA to EIA data, producing a 2-3% cut in reported emissions without achieving any actual reductions.
The second issue is that EPA’s proposed model allocation of emissions allowances is based on historic reported emissions – emissions reported to EPA. So for plants that report more fuel use to EPA than to EIA, the plant might be allocated more allowances using EPA data than EIA data. In the data below, I’m summarizing fuel consumption (not CO2) since that is what is actually reported to EIA (and EPA).
Examples of Power Plant Fuel Reporting Discrepancies, Annual Data
|Reported fuel consumption at selected Southeastern Power Plants|
|Plant Roxboro Unit 2 (first four columns)||Plant Scherer Unit 3 (last four columns)|
|EPA: Air Markets Database||24.0||38.4||34.4||30.0||71.3||58.5||48.1||52.7|
|EIA: Form 923||26.0||42.2||40.4||33.9||64.0||53.3||45.0||50.2|
As these two plants illustrate, the reported discrepancy varies in both absolute and percentage terms from year to year. The pattern varies by plant and by unit. If one assumes that the EIA reported data are “correct” (whatever that means), then Duke Energy’s Plant Roxboro would be under-allocated CO2 allowances under EPA’s proposed model rule – making it more difficult to comply. If EIA data are “correct,” then Plant Roxboro emissions would have to be cut by 9-15% just to match what has been reported to EPA. Under the same assumption, Plant Scherer could simply begin reporting “correctly” to EPA and cut its reported emissions by 5-11%.
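The percentages above can be approximated from the table with a few lines of Python. This is only a sketch using the rounded values shown in the table, so the results may land near, but not exactly on, the ranges quoted in the text.

```python
# Rounded annual fuel consumption values copied from the table above.
roxboro_epa = [24.0, 38.4, 34.4, 30.0]
roxboro_eia = [26.0, 42.2, 40.4, 33.9]
scherer_epa = [71.3, 58.5, 48.1, 52.7]
scherer_eia = [64.0, 53.3, 45.0, 50.2]


def pct_gap(higher, lower):
    """Percent by which each `higher` value would have to fall to equal `lower`."""
    return [round(100.0 * (h - l) / h, 1) for h, l in zip(higher, lower)]


# Roxboro: EIA values exceed EPA values, so if EIA is "correct" the plant
# would need to cut this much to match what was reported to EPA.
roxboro_cut = pct_gap(roxboro_eia, roxboro_epa)

# Scherer: EPA values exceed EIA values, so switching to the EIA figures
# would lower reported consumption by this much.
scherer_cut = pct_gap(scherer_epa, scherer_eia)

print("Roxboro Unit 2:", roxboro_cut)
print("Scherer Unit 3:", scherer_cut)
```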
Example of Power Plant Fuel Reporting Discrepancies, Monthly Data
|Reported fuel consumption, selected months|
|Plant Barry Coal Units (first four columns)||Plant Barry Gas (NGCC) Units (last four columns)|
|Jan 2013||Nov 2013||Jan 2014||Feb 2014||Jan 2013||Nov 2013||Jan 2014||Feb 2014|
|EPA: Air Markets Database||6.6||2.1||5.5||5.2||7.4||6.1||5.0||5.0|
|EIA: Form 923||5.8||1.5||5.6||5.3||6.8||5.7||4.8||4.8|
Even when reviewing data for single plants (or even single units), there doesn’t appear to be a pattern that is consistent within the plant across different fuel types (or even comparing one coal unit at the plant to another coal unit).
If you have read this far, maybe you have a suggested explanation? Please comment!
Updates: Responses to Comments
I will summarize key information here as it comes in, and respond in more detail in the comments as time permits.
- Several people have inquired whether this could be due to the difference between net generation and gross generation (MWh). For example, Daniel Shawhan and Jubo Yan have done extensive work trying to reconcile EPA and EIA databases to address these generation issues. The short answer is no – the underlying issue here is related to fuel consumption, not generation. Generation data is a separate issue, though – don’t use EPA’s CAMD “gross generation” measure to study compliance with the CPP. It isn’t the measure EPA uses!
- I’ve been alerted to a paper by Jeffery C. Quick, which investigated the discrepancy between CEMS-based data in EPA’s CAMD and the fuel consumption / emission factor method using EIA data. Quick found no overall discrepancy between these two data sources, but he only focused on a subset of units – those with CEMS-based data. Oversimplifying greatly, he found that the fuel consumption / emission factor method appears to be better. I plan to follow up here as Quick has evidently researched this question in a direction that I have not.
- Another issue raised by commenters is whether this has to do with the difference between total fuel consumption and fuel consumption for electricity only. EIA uses these two measures to report its calculation of the fuel used to generate electricity at CHP plants. For all non-CHP plants (at least in the data I’ve reviewed), there is no difference between these values. It is also worth noting that the two fuel consumption measures reported by EIA (EPA has only one) are not related to the difference between net and gross generation.
- I have also been asked about whether the units selected are matched up properly. We have spent many, many hours on this. We have mapped EPA and EIA generators and boilers to common units (e.g., CC unit 6). The units reported in this blog appear in both the CAMD and EIA data. There are many units that only report to EIA; we are studying those as well, but they are excluded from this analysis.
- Imputation is an interesting suggestion. A colleague quotes EIA, “Imputation: For select survey data elements collected monthly, regression prediction, or imputation, is done for missing data, including non-sampled units and any non-respondents. For data collected annually, imputation is performed for non-respondents. For gross generation and total fuel consumption, multiple regression is used for imputation … Only approximately 0.02 percent of the national total generation for 2010 is imputed, although this will vary by State and energy source.” This could be a factor, but it seems unlikely to explain much of the difference. Consider the Plant Barry example above: it is extremely unlikely that a major plant like Plant Barry would require routine imputation of data. But it could be a factor here, considering there are likely to be multiple reasons for the issue.
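For anyone attempting a similar comparison, the unit matching mentioned in the responses above can be sketched roughly as follows. The plant names, unit IDs, and values are all hypothetical placeholders, and the crosswalk structure is an assumption for illustration rather than a description of our actual workflow.

```python
# Hypothetical crosswalk from EPA (plant, unit) keys to EIA (plant, boiler) keys.
CROSSWALK = {
    ("plant_a", "unit_1"): ("plant_a_eia", "boiler_1"),
    ("plant_a", "unit_2"): ("plant_a_eia", "boiler_2"),
    ("plant_b", "unit_1"): ("plant_b_eia", "boiler_1"),
}

# Made-up heat input totals, keyed the way each agency identifies the unit.
epa_heat_input = {("plant_a", "unit_1"): 30.0, ("plant_a", "unit_2"): 28.5}
eia_heat_input = {
    ("plant_a_eia", "boiler_1"): 29.1,
    ("plant_a_eia", "boiler_2"): 28.9,
    ("plant_b_eia", "boiler_1"): 12.0,  # reports to EIA only
}


def match_units(crosswalk, epa, eia):
    """Return {epa_key: (epa_value, eia_value)} for units present in both sources."""
    matched = {}
    for epa_key, eia_key in crosswalk.items():
        if epa_key in epa and eia_key in eia:
            matched[epa_key] = (epa[epa_key], eia[eia_key])
    return matched


matched = match_units(CROSSWALK, epa_heat_input, eia_heat_input)
for key, (epa_value, eia_value) in sorted(matched.items()):
    print(key, epa_value, eia_value)
```

Units that appear in only one source drop out of the comparison, which mirrors the exclusion of EIA-only units described above.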
Keep the comments coming! Unfortunately, I can’t override the automatic “comments are closed” deadline, but if you email me at wilson [a+} cleanenergy*org, I will add your thoughts here.