The Audit Trail

The 176% Gap: Ecoinvent vs Local Data on China Solar and Wind

Ecoinvent and HiQLCD disagree on China solar GWP100 by 176%, on wind by 244%. Which gaps are representativeness, which are boundary choices, and what your tooling should do with a spread that wide.

Pull the GWP100 factor for one kWh of Chinese solar PV from Ecoinvent v3.12.0 (cut-off system model, with IPCC GWP100 characterization applied to that inventory) and you get 0.080 kg CO2-eq. Pull the same kWh from HiQLCD v1.4.0, the China LCI database, and you get 0.029. Ecoinvent’s figure is 176% higher than HiQLCD’s, a ratio of about 2.8x. That gap now sits inside your export carbon label, your product carbon footprint, your carbon-market price. It is not a rounding question. It decides whether your product reads as clean or dirty at the border.

Same kWh. Same horizon. Same characterization. The number you ship turns entirely on which database your background data came from. So why do the two disagree, where is the disagreement a representativeness problem and where is it a boundary choice, and what should your tooling do when it sees a spread this wide?

The spread, by source

Five China electricity sources, GWP100, kg CO2-eq per kWh, HiQLCD v1.4.0 against Ecoinvent v3.12.0 cut-off:

Solar PV — 0.029 vs 0.080. A 176% disagreement.
Wind, onshore — 0.009 vs 0.031. A 244% disagreement.
Coal thermal — 1.033 vs 1.210. Within 17%.
Run-of-river hydro — 0.012 vs 0.005. HiQLCD higher.
Nuclear PWR — 0.006 vs 0.008. Close.
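
The percentages above can be reproduced from the listed factors. What follows is a minimal sketch, assuming the lower figure in each row is the baseline (which matches the 176%, 244%, and 17% quoted here). The dictionary layout and the function name are illustrative only.

```python
# GWP100 factors from the list above, kg CO2-eq per kWh: (HiQLCD, Ecoinvent).
FACTORS = {
    "solar_pv": (0.029, 0.080),
    "wind_onshore": (0.009, 0.031),
    "coal_thermal": (1.033, 1.210),
    "hydro_run_of_river": (0.012, 0.005),
    "nuclear_pwr": (0.006, 0.008),
}


def spread(a: float, b: float) -> tuple[float, float]:
    """Return (ratio, percent) of the higher value relative to the lower."""
    low, high = min(a, b), max(a, b)
    ratio = high / low
    return ratio, (ratio - 1.0) * 100.0


for source, (hiqlcd, ecoinvent) in FACTORS.items():
    ratio, percent = spread(hiqlcd, ecoinvent)
    higher = "Ecoinvent" if ecoinvent > hiqlcd else "HiQLCD"
    print(f"{source:20s} {ratio:4.2f}x  {percent:5.0f}%  ({higher} higher)")
```

Stating both the ratio and the baseline keeps a figure like "176%" unambiguous: it is the higher value measured against the lower one, not a difference against the mean.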

Note where the databases agree and where they don’t. They land almost on top of each other for nuclear. They sit inside a coal plant’s own noise on thermal. They invert on hydro, where HiQLCD reads 2.4x higher. And they split by roughly 2.8x and 3.4x on the two sources your low-carbon story leans on hardest: solar and wind. (The HiQLCD solar and wind figures assume a utility-scale capacity factor with infrastructure fully amortized over generation; those low anchors follow directly from the boundary choices, which a verifier is entitled to ask you to state.)

Where the gap comes from

For several key manufacturing sub-processes — wafer and cell energy, certain turbine components — specific Ecoinvent records still lean on European or global-average inventories rather than China-built data, so the modeled energy intensity doesn’t reflect China’s actual cell-line efficiency. The point is not that China data does not exist — it is that the particular records in play are not geographically or technologically representative. China makes most of the world’s PV cells and a large share of its turbines, at an efficiency a transplanted inventory simply can’t see. On the two technologies where China’s industrial scale matters most, the databases part company on the manufacturing burden.

The grid underneath compounds it. Ecoinvent represents China electricity through regional market mixes dominated by hard-coal and lignite, with several technologies modeled at global or rest-of-world average wherever China-specific inventories are missing. HiQLCD resolves the same picture from a China-built energy inventory at provincial precision, across all 31 provinces. When the grid powering the factory leans on a coal-weighted proxy, the manufacturing footprint inherits that proxy’s carbon — and the background-grid gap stacks on top of the manufacturing-inventory gap.

Two credible databases, looking at the same Chinese kWh, hand you two different numbers. The spread is the finding, not the footnote.

Not every gap is a representativeness problem. On run-of-river hydro, HiQLCD reads higher — 0.012 against 0.005 — because it amortizes the civil works (diversion weir, intake, penstock, powerhouse: concrete and steel) over generation at a low capacity factor, while Ecoinvent draws the boundary elsewhere. That’s an infrastructure-amortization boundary disagreement, not an accuracy defect — a question about what counts as part of generating the electricity. Nuclear (0.006 vs 0.008) and coal (1.033 vs 1.210) sit close; the residual is consistent with known boundary differences in the uranium fuel cycle on one, and plant-to-plant variability on the other. The whole job is telling the two apart. One you correct. One you document.
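
The amortization effect can be made concrete. What follows is a minimal sketch, assuming a simple linear model in which embodied infrastructure emissions are spread evenly over lifetime generation. Every input is a made-up placeholder, not a value from HiQLCD or Ecoinvent, and the function name is hypothetical.

```python
HOURS_PER_YEAR = 8760


def infrastructure_gwp_per_kwh(
    embodied_kg_co2e: float,
    capacity_kw: float,
    capacity_factor: float,
    lifetime_years: float,
) -> float:
    """Embodied emissions divided by lifetime generation (kg CO2-eq per kWh)."""
    if not 0.0 < capacity_factor <= 1.0:
        raise ValueError("capacity_factor must be in (0, 1]")
    lifetime_kwh = capacity_kw * capacity_factor * HOURS_PER_YEAR * lifetime_years
    return embodied_kg_co2e / lifetime_kwh


# Placeholder plant: identical civil works under two capacity-factor assumptions.
plant = dict(embodied_kg_co2e=1.0e7, capacity_kw=10_000, lifetime_years=50)
for cf in (0.30, 0.60):
    print(cf, round(infrastructure_gwp_per_kwh(capacity_factor=cf, **plant), 5))
```

In this model, halving the assumed capacity factor doubles the per-kWh infrastructure burden. That is why the capacity factor and the amortization period belong in the documented boundary.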

The averaging trap

Faced with a 0.029-versus-0.080 spread, most automated factor-matching takes the convenient route: it picks one, or splits the difference. Either move buries the most important fact about your solar row, that two credible databases disagree about it by 176%, under one tidy number your auditor can’t reconstruct. A naive arithmetic average lands on 0.0545, a figure no database holds and no one chose. Pick silently and you’ve made a methodology decision your reasoning chain never recorded.
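
To see how far a split-the-difference value drifts from both records, here is a short sketch using the two solar factors quoted above. The variable names are illustrative.

```python
hiqlcd, ecoinvent = 0.029, 0.080  # kg CO2-eq per kWh, China solar PV

mean = (hiqlcd + ecoinvent) / 2
over_hiqlcd = (mean - hiqlcd) / hiqlcd * 100            # % above the HiQLCD record
under_ecoinvent = (ecoinvent - mean) / ecoinvent * 100  # % below the Ecoinvent record

print(f"mean = {mean:.4f}")
print(f"{over_hiqlcd:.0f}% above HiQLCD, {under_ecoinvent:.0f}% below Ecoinvent")
```

The averaged figure sits about 88% above one source record and about 32% below the other, so it cannot be traced back to either.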

This is the failure mode the Audit Trail exists to name. The spread is the signal. Hiding it is the error.

What Cortex does instead

Cortex searches fourteen LCA databases — HiQLCD, Ecoinvent, EF, CarbonMinds, and others — and returns top-k candidates side by side, not top-1. For your China solar row, that puts HiQLCD’s 0.029 and Ecoinvent’s 0.080 next to each other on screen, each carrying its GWP100 value, its region, its system model, its source record, and a Pedigree-Matrix-style DQI score across five dimensions: Temporal, Geographic, Technology, Completeness, Reliability. On the Technology and Geographic axes, the China-built dataset and the transplanted one separate visibly. You see which candidate scores higher on representativeness where the process and the region match your study. Representativeness isn’t accuracy, so the call remains yours.

When the cross-database GWP spread for the same material runs past 2x, as it does for solar and wind here (and, in the opposite direction, for run-of-river hydro at 2.4x), Cortex neither averages nor picks. It pauses and hands the decision back. Choosing between a China-built factor and a transplanted one is a methodology call: it turns on your goal and scope, your system boundary, whether your verifier accepts HiQLCD as a regional source. Cortex pauses where automation would break an audit (a spread past 2x, a proxy that isn’t an exact fit, coverage too thin to defend, restricted data it can’t substitute silently) and the practitioner decides. The decision, and the reason for it, goes into the reasoning chain.
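
The spread rule described here can be sketched as a small gate. This is an illustration under stated assumptions, not Cortex’s implementation: the `Candidate` class, the `review_gate` function, and the returned dictionary shape are all hypothetical, and only the 2x threshold comes from the text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    database: str
    gwp100: float  # kg CO2-eq per functional unit


SPREAD_LIMIT = 2.0  # the "past 2x" threshold described above


def review_gate(candidates: list[Candidate], limit: float = SPREAD_LIMIT) -> dict:
    """Report the spread and whether a practitioner must decide; never pick or average."""
    values = [c.gwp100 for c in candidates]
    if len(values) < 2 or min(values) <= 0:
        # Too few candidates, or a non-positive factor: no meaningful ratio.
        return {"action": "pause", "reason": "spread not computable", "candidates": candidates}
    ratio = max(values) / min(values)
    action = "pause" if ratio > limit else "proceed"
    return {"action": action, "ratio": ratio, "candidates": candidates}


solar = [Candidate("HiQLCD v1.4.0", 0.029), Candidate("Ecoinvent v3.12.0 cut-off", 0.080)]
coal = [Candidate("HiQLCD v1.4.0", 1.033), Candidate("Ecoinvent v3.12.0 cut-off", 1.210)]
print(review_gate(solar)["action"], review_gate(coal)["action"])
```

The gate returns every candidate in both outcomes, so whatever happens next still has the full set of source records to point at.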

Cortex doesn’t pick a system model for you, and it files nothing. It returns the candidates with their provenance, holds the system models apart so cut-off never quietly mixes with APOS, and produces a reasoning chain you can hand to a verifier. You make the call; Cortex records that you made it, and on what basis.
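
One way to keep system models apart is to bucket candidates by system model before any comparison is made. The sketch below assumes a plain list of dictionaries. The row layout, the database names, and the values are placeholders, and `group_by_system_model` is a hypothetical helper.

```python
from collections import defaultdict


def group_by_system_model(rows):
    """Bucket candidate rows by system model so spreads are only computed within one model."""
    groups = defaultdict(list)
    for row in rows:
        groups[row["system_model"]].append(row)
    return dict(groups)


# Hypothetical rows; database names and values are placeholders.
rows = [
    {"database": "db_a", "system_model": "cut-off", "gwp100": 0.10},
    {"database": "db_b", "system_model": "cut-off", "gwp100": 0.12},
    {"database": "db_b", "system_model": "APOS", "gwp100": 0.15},
]

for model, members in group_by_system_model(rows).items():
    values = [m["gwp100"] for m in members]
    print(model, len(members), "candidate(s)", min(values), "to", max(values))
```

Grouping first means a cut-off value and an APOS value never land in the same spread calculation by accident.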

One kWh, defended

A 176% disagreement on solar isn’t a database curiosity. It’s the difference between a product carbon footprint your verifier signs and one they send back. The honest answer to “which factor” is rarely a number — it’s the spread, scored, plus a recorded decision about which source fits your scope. Cortex puts both numbers in front of you, scores them on Geographic and Technology representativeness, and stops where a tidy average would have quietly broken the audit.

Ask Cortex about a China electricity factor and walk the two numbers back to where they part: cortex.hiq.earth/chat.

— HiQ Cortex Team