Will COVID-19 fiscal recovery packages accelerate or retard progress on climate change? — DoOperator Research

Authors	Cameron Hepburn, Brian O’Callaghan, Nicholas Stern, Joseph E. Stiglitz, Dimitri Zenghelis
Journal	Oxford Review of Economic Policy
Year	2020
DOI	10.1093/oxrep/graa015
Citations	756

TL;DR

A survey of 231 economic experts from G20 countries identified five fiscal recovery policies—clean physical infrastructure, building efficiency retrofits, education/training investment, natural capital investment, and clean R&D—that simultaneously deliver high economic multipliers and strong climate benefits, offering a roadmap for governments designing post-COVID stimulus packages.

What they tested

The researchers tested the perceived performance of 25 distinct fiscal recovery policy archetypes across four dimensions: speed of implementation (how quickly the policy can be deployed), economic multiplier (the total economic output generated per unit of government spending), climate impact potential (the net effect on greenhouse gas emissions, positive or negative), and overall desirability (a composite judgment of the policy's attractiveness given current conditions). The 25 archetypes ranged from traditional stimulus measures (e.g., road building, airline bailouts) to climate-focused interventions (e.g., renewable energy subsidies, reforestation, clean R&D tax credits). The study also compared results between high-income countries and lower- and middle-income countries (LMICs) to identify context-specific priorities.

Who was studied

The sample consisted of 231 central bank officials, finance ministry officials, and other economic experts from G20 countries. The authors do not provide a detailed demographic breakdown (age, gender, years of experience), but the respondents were selected for their professional expertise in fiscal policy and economic recovery. The survey was conducted in May–June 2020, during the early phase of the COVID-19 pandemic. The response rate is not reported, and the authors acknowledge that the sample is a convenience sample of experts willing to participate, not a random or representative sample of all G20 economic officials.

How they measured it

The study used a structured online survey. For each of the 25 policy archetypes, respondents rated the policy on four dimensions using a 5-point Likert scale (1 = very low, 5 = very high). The survey also included open-ended questions allowing respondents to add comments or suggest additional policies. The authors then calculated mean scores for each policy on each dimension, and ranked policies by their combined score on economic multiplier and climate impact potential. For the climate impact dimension, respondents were asked to consider the net effect on greenhouse gas emissions, accounting for both direct and indirect effects (e.g., a road-building project might increase emissions directly through construction and indirectly through induced driving). The authors also conducted a separate analysis for LMICs by asking respondents to rate policies specifically in the context of lower- and middle-income countries.
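
The scoring step described above can be sketched in a few lines. This is a minimal illustration, not the authors' code: the ratings are invented, the policy list is truncated to three archetypes, and combining the two dimensions by summing their means is an assumption, since the summary does not say how the combined score was formed.

```python
from statistics import mean

# Hypothetical ratings: policy -> dimension -> list of 1-5 Likert responses.
# These numbers are invented for illustration; they are not the survey data.
ratings = {
    "clean physical infrastructure": {"multiplier": [5, 4, 5, 4], "climate": [5, 5, 4, 5]},
    "building efficiency retrofits": {"multiplier": [4, 4, 5, 3], "climate": [4, 4, 4, 5]},
    "airline bailouts": {"multiplier": [3, 2, 3, 3], "climate": [1, 2, 1, 1]},
}


def mean_scores(ratings):
    """Average each policy's responses on each dimension."""
    return {
        policy: {dim: mean(values) for dim, values in dims.items()}
        for policy, dims in ratings.items()
    }


def rank_by_combined(scores, dims=("multiplier", "climate")):
    """Rank policies by the sum of their mean scores (assumed combination rule)."""
    combined = {policy: sum(s[d] for d in dims) for policy, s in scores.items()}
    return sorted(combined, key=combined.get, reverse=True)


scores = mean_scores(ratings)
ranking = rank_by_combined(scores)
print(ranking)
```

A weighted sum or a rank-based combination would be equally plausible choices and could reorder policies whose combined scores are close.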

Methodology

Study design: This is a cross-sectional expert elicitation survey. It is not an experiment, a randomized controlled trial, or a meta-analysis. The researchers did not manipulate any variables or assign participants to conditions. Instead, they collected subjective judgments from a panel of experts at a single point in time.

Why this design matters: Expert elicitation is a well-established method in economics and policy analysis when direct empirical data is unavailable or when the question involves future-oriented, counterfactual scenarios (e.g., "What would be the economic multiplier of a hypothetical green infrastructure program?"). The COVID-19 crisis was unfolding in real time, and no historical data existed on the effects of pandemic-era fiscal recovery packages on climate outcomes. The survey allowed the researchers to aggregate the collective judgment of experts who had relevant domain knowledge.

What this design can and cannot prove: This design can reveal the perceived performance of different policies among a group of experts. It can identify policies that are broadly viewed as having high potential on multiple dimensions. It can also highlight areas of consensus and disagreement. However, this design cannot prove that any particular policy actually produces a given economic multiplier or climate impact. The results are opinions, not empirical measurements. The study does not test any causal hypothesis—it does not show that clean infrastructure causes higher economic growth or lower emissions. It shows that experts believe it would. The design also cannot account for implementation failures, political constraints, or unintended consequences that might differ from expert expectations.

Duration: The survey was administered at a single time point (May–June 2020). There is no follow-up or longitudinal component. The results reflect expert views during a specific moment of the pandemic, when uncertainty was high and many governments were still designing their initial recovery packages.

Statistical approach: The authors report mean scores and rankings. They do not report standard deviations, confidence intervals, or formal statistical tests (e.g., t-tests, ANOVA) comparing policies. This is a descriptive analysis, not an inferential one. The lack of inferential statistics means we cannot assess whether the differences between policies are statistically significant or could be due to random variation in expert opinions.
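
To show what the missing inferential step might look like, here is a minimal sketch of a percentile bootstrap confidence interval for the difference in mean ratings between two policies. It assumes access to respondent-level ratings, which the summary does not provide; the data, the function name `bootstrap_mean_diff_ci`, and the choice of a paired bootstrap are all illustrative assumptions.

```python
import random
from statistics import mean


def bootstrap_mean_diff_ci(a, b, n_resamples=5000, alpha=0.05, seed=0):
    """Percentile bootstrap CI for mean(a) - mean(b) with paired ratings.

    a[i] and b[i] are the same respondent's ratings of two policies, so
    respondents are resampled as pairs (via their differences).
    """
    if len(a) != len(b):
        raise ValueError("paired ratings must have the same length")
    rng = random.Random(seed)
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    # Resample respondents with replacement and record each resample's mean.
    boot = sorted(mean(rng.choices(diffs, k=n)) for _ in range(n_resamples))
    lo = boot[int((alpha / 2) * n_resamples)]
    hi = boot[int((1 - alpha / 2) * n_resamples) - 1]
    return mean(diffs), lo, hi


# Invented ratings from ten hypothetical respondents (not survey data).
policy_a = [5, 4, 5, 4, 4, 5, 3, 4, 5, 4]
policy_b = [4, 4, 3, 4, 3, 4, 3, 3, 4, 4]
diff, lo, hi = bootstrap_mean_diff_ci(policy_a, policy_b)
print(f"mean difference = {diff:.2f}, 95% CI = [{lo:.2f}, {hi:.2f}]")
```

If the interval for two adjacent policies included zero, their relative ranking would be hard to distinguish from noise in expert opinion.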

Major methodological weaknesses:

No random sampling: The sample is a convenience sample of experts willing to participate. This introduces selection bias—experts who chose to respond may have systematically different views from those who did not.
No blinding: Respondents knew the purpose of the survey (to evaluate climate and economic impacts of recovery policies). This could introduce social desirability bias, where experts overstate the climate benefits of "green" policies to appear aligned with scientific consensus.
No objective validation: The survey relies entirely on subjective ratings. There is no attempt to validate these ratings against actual economic or emissions data.
Single time point: Expert views may have shifted as the pandemic evolved and new data emerged. The results are a snapshot, not a stable prediction.
Vague definitions: The policy archetypes are broad (e.g., "clean physical infrastructure" could include anything from solar farms to electric vehicle charging stations to smart grids). Different experts may have interpreted the same archetype differently, reducing comparability.

Key findings

Top five policies for combined economic multiplier and climate impact (high-income countries):
1. Clean physical infrastructure (mean score not reported as a single number; ranked #1 on both multiplier and climate impact)
2. Building efficiency retrofits (ranked #2 on multiplier, #3 on climate impact)
3. Investment in education and training (ranked #3 on multiplier, #5 on climate impact)
4. Natural capital investment (e.g., reforestation, wetland restoration; ranked #4 on multiplier, #2 on climate impact)
5. Clean R&D (e.g., renewable energy research, carbon capture; ranked #5 on multiplier, #4 on climate impact)
Policies rated moderate to high on economic multiplier but neutral or negative on climate impact:
- Road building (high multiplier, but rated as having negative climate impact)
- Airline bailouts (moderate multiplier, negative climate impact)
- General tax cuts (high multiplier, neutral to negative climate impact)
Policies rated low to moderate on economic multiplier but high on climate impact:
- Renewable energy subsidies (moderate multiplier, high climate impact)
- Reforestation (low multiplier, high climate impact)
Differences for lower- and middle-income countries (LMICs):
- Rural support spending (e.g., agricultural extension, rural infrastructure) was rated as particularly valuable in LMICs, with high multiplier and moderate climate impact
- Clean R&D was rated as less important in LMICs, likely because these countries have less existing research infrastructure and more immediate needs for basic services
- Building efficiency retrofits were rated as less feasible in LMICs due to lack of skilled labor and supply chains
Overall desirability ranking (combining all four dimensions):
- Clean physical infrastructure ranked #1
- Building efficiency retrofits ranked #2
- Natural capital investment ranked #3
- Education and training ranked #4
- Clean R&D ranked #5
Policies ranked lowest on overall desirability:
- Airline bailouts
- Fossil fuel subsidies
- New fossil fuel extraction infrastructure
- General tax cuts (without green conditionality)
Short-run impacts of COVID-19 on emissions: The authors note that daily global CO2 emissions fell by an estimated 17% at the peak of lockdowns in April 2020, but the drop was temporary and emissions rebounded quickly as restrictions eased. They estimate that the lockdown-driven reduction was equivalent to roughly 2–3 years of the annual emissions reductions needed to meet the Paris Agreement targets, but warn that without structural changes, emissions will return to pre-pandemic levels or higher.
Medium-run behavioral shifts: The authors discuss plausible but uncertain shifts in human behavior post-pandemic, including increased remote work (reducing commuting emissions), reduced air travel (especially business travel), and increased demand for local food and outdoor recreation. They caution that these shifts are not guaranteed and depend on policy choices (e.g., whether governments invest in broadband infrastructure to support remote work).

Effect magnitude

The study does not report effect sizes in the traditional sense (e.g., Cohen's d, risk ratios). Instead, the key "effect" is the ranking of policies by expert consensus. The gap between the top-ranked policy (clean physical infrastructure) and the bottom-ranked policy (fossil fuel subsidies) is not quantified in standard deviation units or percentage points. The authors do provide qualitative context: clean physical infrastructure was rated "substantially higher" than fossil fuel subsidies on both economic multiplier and climate impact, and the gap is large enough that they recommend governments prioritize the former and avoid the latter. For the emissions reduction from lockdowns, the magnitude is clearer: a 17% drop in daily global CO2 emissions at the April 2020 peak, which was temporary. For perspective, the annual emissions reduction needed to meet the Paris Agreement targets is roughly 7.6% per year from 2020 to 2030. The peak daily drop was a little more than twice that annual rate (17 / 7.6 ≈ 2.2), but it lasted only a few weeks, whereas the 7.6% target must be met every year for a decade.
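
The arithmetic behind that comparison can be checked directly. The sketch below uses only the two figures cited above; treating the 7.6% target as ten equal compounding cuts is a simplifying assumption made here for illustration, not a calculation from the paper.

```python
peak_daily_drop = 0.17     # peak daily CO2 drop cited above (April 2020)
annual_cut_needed = 0.076  # annual reduction cited above, 2020 to 2030

# How many "annual targets" the brief peak daily drop corresponds to.
ratio = peak_daily_drop / annual_cut_needed

# Assumption: the same percentage cut is applied to each year's emissions,
# so the cuts compound over the decade.
years = 10
remaining_fraction = (1 - annual_cut_needed) ** years
cumulative_cut = 1 - remaining_fraction

print(f"peak drop / annual target = {ratio:.2f}")
print(f"emissions remaining after {years} years = {remaining_fraction:.1%}")
print(f"cumulative cut over {years} years = {cumulative_cut:.1%}")
```

Under that assumption, a sustained 7.6% annual cut removes more than half of annual emissions over the decade, which is why a drop lasting a few weeks, however deep, contributes little on its own.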

Limitations

Acknowledged by authors:

The sample is a convenience sample of experts, not a random sample, and may not be representative of all G20 economic officials.
The survey was conducted during a period of extreme uncertainty (May–June 2020), and expert views may have changed as the pandemic evolved.
The policy archetypes are broad and may be interpreted differently by different respondents.
The study does not account for political feasibility, implementation capacity, or distributional effects (who bears the costs and who reaps the benefits).
The analysis for LMICs is based on a smaller subsample of experts with LMIC experience, and the results may be less reliable.

Additional limitations a critical reader would note:

No inferential statistics: The authors do not report confidence intervals, p-values, or measures of inter-rater reliability. We cannot assess whether the differences between policies are statistically significant or could be due to random variation.
Social desirability bias and lack of blinding: Respondents knew the survey was about climate change, so they may have rated "green" policies higher, or overstated their climate benefits, for that reason rather than because they genuinely believe those policies are superior on economic grounds.
No objective validation: The study does not compare expert ratings to actual economic or emissions data from past stimulus packages. The ratings are purely subjective.
Single time point: The results reflect a snapshot of expert opinion during a specific crisis. They may not generalize to other economic contexts or future crises.
Potential conflict of interest: Several authors (Hepburn, Stern, Stiglitz, Zenghelis) are prominent advocates for climate action and green growth. While this does not invalidate the results, it could influence the framing of the research question and interpretation of findings.
No consideration of unintended consequences: The study does not explore potential negative side effects of the recommended policies (e.g., land use conflicts from natural capital investment, job displacement from clean R&D, or increased inequality from building efficiency retrofits that benefit wealthier homeowners).

Practical takeaways

For someone running their own n=1 experiment (e.g., a policymaker, activist, or investor testing the effectiveness of different advocacy strategies for green recovery):

What to test:

Test the effectiveness of advocating for one specific policy archetype (e.g., "clean physical infrastructure" or "building efficiency retrofits") versus a more general "green stimulus" message. The study does not test advocacy strategies, but because its experts rated named archetypes rather than a generic "green recovery," a reasonable hypothesis is that specific, well-defined policies gain expert and political support more readily than vague calls.

Minimum meaningful duration:

For policy advocacy, a meaningful test period would be 6–12 months, which is the typical timeframe for fiscal recovery package design and legislative approval. For measuring actual economic or emissions outcomes, a minimum of 2–3 years would be needed to observe the effects of implemented policies.

What to measure (specific metrics):

Process metrics: Number of policymakers or experts who express support for the specific policy; number of mentions in policy documents or media coverage; speed of policy adoption (days from proposal to legislation).
Outcome metrics (if policy is implemented): Economic output generated per dollar spent (economic multiplier); jobs created per $1 million spent; tons of CO2 avoided per year (climate impact); speed of implementation (months from funding approval to project completion).
For self-experimenters (e.g., activists or investors): Track your own influence by measuring the number of meetings secured with policymakers, the number of op-eds or articles published, or the change in public opinion polls before and after your advocacy campaign.
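
The outcome metrics above reduce to simple ratios once a program's results are known. The sketch below is one possible way to compute them; the `ProgramOutcome` fields, the example figures, and the spending-per-ton ratio are hypothetical additions for illustration, and the output multiplier follows the definition given earlier (output generated per unit of government spending).

```python
from dataclasses import dataclass


@dataclass
class ProgramOutcome:
    """Observed results for one implemented program (all fields hypothetical)."""
    spending_usd: float
    jobs_created: float
    output_generated_usd: float
    co2_avoided_tons_per_year: float


def jobs_per_million(o: ProgramOutcome) -> float:
    """Jobs created per $1 million of public spending."""
    return o.jobs_created / (o.spending_usd / 1_000_000)


def output_multiplier(o: ProgramOutcome) -> float:
    """Economic output generated per dollar of public spending."""
    return o.output_generated_usd / o.spending_usd


def spending_per_annual_ton(o: ProgramOutcome) -> float:
    """Dollars spent per ton of CO2 avoided each year (an added, illustrative ratio)."""
    return o.spending_usd / o.co2_avoided_tons_per_year


# Invented example: a $50M retrofit program.
retrofit = ProgramOutcome(
    spending_usd=50_000_000,
    jobs_created=400,
    output_generated_usd=75_000_000,
    co2_avoided_tons_per_year=20_000,
)
print(jobs_per_million(retrofit), output_multiplier(retrofit), spending_per_annual_ton(retrofit))
```

In practice the hard part is estimating the inputs, especially the counterfactual output and emissions without the program; the ratios themselves are trivial.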

Key confounds to control for:

Political context: The same policy may be more or less feasible depending on the ruling party's ideology, the state of the economy, and the level of public support for climate action. Control for this by comparing advocacy efforts in similar political contexts (e.g., same country, same government, same economic conditions).
Timing: The study was conducted during a unique crisis. Advocacy during a recession may be more or less effective than during normal times. Control for timing by running your experiment during a comparable economic shock or by comparing advocacy during crisis vs. non-crisis periods.
Messenger effects: The same policy message may be more persuasive coming from a scientist, a business leader, or a community organizer. Control for this by varying the messenger while keeping the policy message constant.
Framing effects: The study ranks policies on economic multiplier and climate impact together, which suggests (but does not show) that emphasizing both may be more persuasive than focusing on climate alone. Test different framings (e.g., "jobs and growth" vs. "climate emergency") to see which resonates more with your target audience.

What a positive result would look like:

For a policymaker: Your proposed policy (e.g., building efficiency retrofits) is included in the final fiscal recovery package, with funding at or above a threshold you set in advance (e.g., at least 0.5% of GDP, an illustrative figure; the survey summarized here reports ratings, not recommended funding levels).
For an activist: Your advocacy campaign leads to a measurable increase in public support for the specific policy (e.g., a 10-percentage-point increase in polling support) or a shift in media coverage from general "green recovery" to specific policy proposals.
For an investor: Companies in sectors aligned with the top five policies (clean infrastructure, building efficiency, education/training, natural capital, clean R&D) outperform the market over a 2-year period, as measured by stock price or revenue growth relative to a benchmark index.
For a self-experimenter: You successfully persuade at least one key decision-maker (e.g., a member of parliament, a finance ministry official, or a corporate CEO) to publicly endorse one of the top five policies, and you can attribute this change to your specific advocacy efforts (e.g., through a pre-post survey or a controlled comparison).
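
For the polling example above, a pre-post change can be checked against sampling noise with a two-proportion z-test. The sketch below is a minimal version under stated assumptions: the two polls are independent random samples, the sample sizes and counts are invented, and the function name is hypothetical. A significant difference would not by itself attribute the change to the campaign.

```python
from math import sqrt
from statistics import NormalDist


def two_proportion_z_test(success_pre, n_pre, success_post, n_post):
    """Two-sided z-test for a change in proportion between two independent polls."""
    p_pre = success_pre / n_pre
    p_post = success_post / n_post
    # Pooled proportion under the null hypothesis of no change.
    pooled = (success_pre + success_post) / (n_pre + n_post)
    se = sqrt(pooled * (1 - pooled) * (1 / n_pre + 1 / n_post))
    z = (p_post - p_pre) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_post - p_pre, z, p_value


# Invented polls: 40% support before (200 of 500), 50% after (250 of 500).
change, z, p_value = two_proportion_z_test(200, 500, 250, 500)
print(f"change = {change:.1%}, z = {z:.2f}, p = {p_value:.4f}")
```

With smaller polls the same 10-percentage-point shift could easily fall within sampling error, so the sample size should be fixed before the campaign starts.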
