Overview

This is a fully integrated replication pipeline whose final output is the paper manuscript in .pdf format.

You can download the replication package here (about 46GB).¹

General structure

The main (root) folder contains the following folders:

programs/: contains all R and Python code that replicates the results of the paper.
inputs/: contains the data inputs used by the replication codes.
output/: where all processed data and simulation results are stored.
results/: where all results (figures, tables, and other tex results) are stored.

It also contains the JMP_README.html readme (this document).
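As a quick sanity check before running anything, the folder layout described above can be verified with a few lines of Python. This is only a minimal sketch and is not part of the package: the names EXPECTED_FOLDERS and missing_folders() are hypothetical, and the folder list is simply copied from the description above.

```python
from pathlib import Path

# Folder names as listed in this README (assumed to sit directly under the root).
EXPECTED_FOLDERS = ["programs", "inputs", "output", "results"]


def missing_folders(root):
    """Return the expected folders that do not exist under `root`."""
    root = Path(root)
    return [name for name in EXPECTED_FOLDERS if not (root / name).is_dir()]
```

Calling missing_folders(".") from the root folder should return an empty list if the package was unpacked completely.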

Replicating the results

The paper is reproduced by executing, in this specific order, the following R and python codes:

1. data.R: pre-processes all the raw data.
2. quant.py: performs the quantification of the model as in Section 5.
3. cfac.py: runs all the counterfactual simulations of Sections 4.2, 5.5, and 6.
4. results.R: uses the previous inputs to reproduce all figures and tables of the paper. It also renders the .tex source file of the paper, producing the manuscript in .pdf format.
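The execution order above could be automated with a small driver script. The sketch below is an illustration only and is not shipped with the package. It assumes that Rscript and python are available on the PATH, that the four codes sit in programs/, and that they can run non-interactively once their working directories are set. The names PIPELINE, build_commands(), and run_pipeline() are hypothetical.

```python
import subprocess
from pathlib import Path

# Execution order as stated in this README.
PIPELINE = ["data.R", "quant.py", "cfac.py", "results.R"]


def build_commands(programs_dir="programs"):
    """Return one command (as a list of arguments) per code, in order."""
    commands = []
    for script in PIPELINE:
        # Assumption: R codes run through Rscript, Python codes through python.
        interpreter = "Rscript" if script.endswith(".R") else "python"
        commands.append([interpreter, str(Path(programs_dir) / script)])
    return commands


def run_pipeline(programs_dir="programs"):
    """Run every code in order, stopping at the first one that fails."""
    for command in build_commands(programs_dir):
        subprocess.run(command, check=True)
```

Calling run_pipeline() from the root folder would then launch the four steps one after the other. Given the running times reported below, running each step separately and checking its output first may be more practical.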

Inputs

The inputs (data and code/subroutines) required by the four codes above (1-4) are:

Data:

inputs/conte-etal-2021/: global temperature data from Conte et al. (2021).
inputs/faostat-agric-prod/: country-crop-level agricultural production in US$ from FAOSTAT.
inputs/gaez/: raw GAEZ data (IIASA and FAO, 2012) on attainable/potential yields (for several crops, time periods, and technology assumptions), effective production, and harvested land by the early 21st century.
inputs/gam-friction/: friction surface of the Global Access Map/“Accessibility to Cities” project (Weiss et al., 2018).
inputs/g-econ-v4/: raw G-Econ data (Nordhaus et al., 2006).
inputs/ghsp/: raw gridded population data from the Global Human Settlement Project (Florczyk et al., 2019).
inputs/groads/: rasterized African and EU road infrastructure from the gROADS database (CIESIN, 2013).
inputs/ipums/: raw internal migration data and time-consistent shapefiles of all countries and subnational regions from IPUMS (2020).
inputs/itpd-e-trade/: country-pair-industry-level data on trade flows from the ITPD-E database (Borchert et al., 2021).
inputs/mig-data-abel-cohen/: country-pair-level data on migration stocks and gross flows from Abel and Cohen (2019).
inputs/ne_50m_admin_0_countries/: shapefile of the world (ADM0) from Natural Earth Data.
inputs/ne_50m_lakes/: shapefile of all lakes in the world from Natural Earth Data.
inputs/ne_10m_populated_places_simple/: shapefiles of all populated places in the world from Natural Earth Data.
inputs/ne_10m_ports/: shapefile of all ports and harbors in the world from Natural Earth Data.
inputs/other/aoi2.rdata: SSA’s shape in .rdata format (used to render maps).
inputs/other/exchange_rates.rdata: exchange rates for SSA country currencies and USD.
inputs/other/otherlists.rdata: inputs for the whichCountry() function (in loadfunctions.R).
inputs/un-pop-data/: cleaned .xls with estimates of future national populations from United Nations and Social Affairs (2019).
inputs/wb-va-agric/: country-level agricultural value added from the World Bank Development Indicators.
inputs/wb-gdp-pc/: country-level GDP per capita growth from the World Bank Development Indicators.
inputs/wfp-vam/: geocoded crop price data from the WFP-VAM program.

Paper:

inputs/tex-paper/JMP_brunoconte.tex: source of the manuscript in .tex format.
inputs/tex-paper/00preamb.tex: .tex preamble of manuscript (packages, commands, etc).
inputs/tex-paper/00appendix.tex: source of the appendix in .tex format.
inputs/tex-paper/00bib.tex: .tex file that calls the bibliography of the paper.
inputs/tex-paper/00bib.bib: bibtex-format of all cited references in the paper.
results/figures/rcp_sres_comparison*.png: figures from IPCC (2012) that are used in the Appendix.

Auxiliary programs and files:

programs/loadfunctions.R: loads several functions used to process data (e.g. distance-calculating algorithms).
programs/model_2024.py: contains all functions of the model (Section 4.1). It is used by quant.py and cfac.py.
programs/model_line.py: contains the functions and the simulation codes of the economy represented in a line (in Section 4.2).

Processing data and preparing model inputs (`data.R`)

The code data.R runs all data-processing tasks. Make sure to set the right working and temporary directories in lines 16-17.² It consists of a sequence of code slices (numbered 0 to 16 below) that run for different cty geographical regions in an outer loop.³

Slice 0: creates the folders where the output will be stored.
Slices 1-2: create the spatial grid based on the G-Econ data.⁴
Slices 3-6: process and export each cell’s coordinates, areas (in km$^2$), and the GHSP population.
Slices 7-10: process and export the GAEZ/FAOSTAT agricultural data (i.e., potential yields, effective production, and harvested land).
Slice 11: processes and exports the international and internal raw migration datasets (from Abel and Cohen, 2019, and IPUMS, respectively). It also exports migration transition matrices used by the model.⁵
Slice 12: processes and exports the ITPD-E trade data. Analogously to slice 11, it also exports an $N \times N \times K$ dummy matrix of the sector-cell pairs for which trade data is observed.
Slice 13: processes the raw WFP-VAM crop price data, estimates the location-crop FE $c_j^k$ (Appendix B.7), and exports the dispersion of prices $\{ P_j^k \}$ and an $N \times K$ dummy matrix for the location-crop combinations for which price data is observed.
Slice 14: produces additional inputs for the robustness exercises, such as the $\Lambda^b(T_i)$ function for climate damages in amenities from Cruz and Rossi-Hansberg (2023), the $g^K(T_i)$ function for climate damages in the non-agricultural sectors (Conte et al., 2021), and others.
Slice 15: computes the bilateral travel distances between all $i,j$ location pairs. It is the most time-consuming data-processing task.⁶
Slice 16: computes all the outcomes from slices 0-15 for SSA with the rest of the World.

The outcome is exported to data/outputs/*/ to be subsequently used as inputs in the model (note that * stands for region cty; e.g., SSA is defined as ssa_2023).
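To make the dummy matrices exported in slices 11-13 more concrete, the sketch below builds an $N \times N \times K$ indicator of observed flows from a list of observations. It is a toy illustration in plain Python and not the code in data.R: the function observed_dummy(), the (origin, destination, sector) index order, and the example flows are all assumptions.

```python
def observed_dummy(flows, n_cells, n_sectors):
    """Return an N x N x K nested list with 1 where a flow is observed, else 0.

    `flows` maps zero-indexed (origin, destination, sector) triples to values.
    """
    dummy = [[[0] * n_sectors for _ in range(n_cells)] for _ in range(n_cells)]
    for origin, destination, sector in flows:
        dummy[origin][destination][sector] = 1
    return dummy


# Hypothetical observed flows for N = 3 cells and K = 2 sectors.
flows = {(0, 1, 0): 12.5, (1, 0, 0): 3.0, (2, 1, 1): 7.1}
mask = observed_dummy(flows, n_cells=3, n_sectors=2)
```

The same idea would apply to the $N \times K$ price dummy of slice 13, with (location, crop) pairs in place of triples.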

Quantifying the model (`quant.py`)

The quant.py code links the model to the data as in Section 5. It requires a number of libraries (e.g., numpy) installed in the local Python environment.⁷ As with data.R, it loops over several regions (called, in this case, country), executing:

Slice 1: loads several region-specific inputs (produced with data.R) and sets parameter values as in Table 1.
Slice 2.1: quantifies technology/consumption-related parameters/fundamentals with a two-stage procedure (as in Section 5.3 in the paper). The first stage (inner loop, 2.1.1. in the code) calibrates $\mathbf{T} \equiv \{ \{A_i^K\}_i, \{ b_i^k \}_{i,k}, \{ \Omega_a, \Omega_K \} \}$ conditional on $\mathbf{t} \equiv \{ \tau^F, \delta \}$. The second stage (outer loop, 2.1.2 in the code) performs the GMM that pins down $\mathbf{t}$ and bootstraps its standard errors. The exported output (in either fundamentals/ or parameters/ folders) is the set of bk_*.csv, omega_*.csv, tau_F_*.csv, and delta_*.csv elements, their standard errors, and some plots with the solution of optimization problems.
Slice 2.2: analogously quantifies the location choice-related fundamentals and parameters (as in Section 5.4 in the paper), where $\{ u_i, m_c \}_{i,c}$ are pinned down in an inner loop and $\phi$ in an outer loop. The exported output is the u_*.csv and mc_*.csv fundamentals, the phi_*.csv parameter, and plots related to the solution of the optimization problem.
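The nested structure of slices 2.1 and 2.2 (fundamentals recovered in an inner loop conditional on the parameters, parameters chosen in an outer loop) can be illustrated with a deliberately simple toy problem. The sketch below is not the model of the paper and not the procedure in quant.py: the toy equation $y_i = A_i e^{-t d_i}$, the single targeted moment, the grid search in place of GMM, and all function names are assumptions made for illustration.

```python
import math


def calibrate_fundamentals(t, observed_output, distances):
    """Inner step: given t, invert the toy model y_i = A_i * exp(-t * d_i) for A_i."""
    return [y * math.exp(t * d) for y, d in zip(observed_output, distances)]


def moment_gap(t, observed_output, distances, target_moment):
    """Squared distance between a model moment (mean of A_i) and its target."""
    fundamentals = calibrate_fundamentals(t, observed_output, distances)
    model_moment = sum(fundamentals) / len(fundamentals)
    return (model_moment - target_moment) ** 2


def estimate_t(observed_output, distances, target_moment, grid):
    """Outer step: pick the t on the grid that minimizes the moment gap."""
    return min(
        grid,
        key=lambda t: moment_gap(t, observed_output, distances, target_moment),
    )


# Synthetic data generated with t = 0.5 and fundamentals A = [1, 2, 3].
true_fundamentals = [1.0, 2.0, 3.0]
distances = [0.0, 1.0, 2.0]
observed_output = [a * math.exp(-0.5 * d) for a, d in zip(true_fundamentals, distances)]
target_moment = sum(true_fundamentals) / len(true_fundamentals)

grid = [i / 100 for i in range(101)]
t_hat = estimate_t(observed_output, distances, target_moment, grid)
# t_hat recovers the value used to generate the synthetic data (0.5).
```

The point of the example is only the ordering of the two steps: every candidate value of the outer parameter triggers a full inner calibration before the moment is evaluated.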

Counterfactuals (`cfac.py`)

The cfac.py code uses the inputs from the previous tasks to run all counterfactual simulations and export their outputs. It loops over several cty regions for different scenarios; it takes about 6-8 hours on an 8-core computer.

Slice 1: loads several region-specific inputs (from data.R) and parameter values as in Tables 1 and 2 (produced in quant.py).
Slice 2.1: simulates the counterfactuals from Sections 5.5 and 6.
Slices 2.2-2.3: simulate the counterfactuals for the robustness exercises from Section 6.4.
Slice 3: executes model_line.py, which simulates the economy represented as a line as in Section 4.2.

Note that, after solving the model for each scenario, the model.export_output() function exports all outcomes in .csv format, in a standardized way and in each simulation’s folder.
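For readers who want a feel for what a standardized export of this kind can look like, here is a minimal sketch using only the Python standard library. It is not the actual model.export_output() function: the name export_outcomes(), the one-file-per-outcome layout, and the column names are hypothetical.

```python
import csv
from pathlib import Path


def export_outcomes(outcomes, folder):
    """Write each outcome (name -> list of per-location values) to <folder>/<name>.csv."""
    folder = Path(folder)
    folder.mkdir(parents=True, exist_ok=True)
    written = []
    for name, values in outcomes.items():
        path = folder / f"{name}.csv"
        with path.open("w", newline="") as handle:
            writer = csv.writer(handle)
            writer.writerow(["location", name])  # header row
            for location, value in enumerate(values):
                writer.writerow([location, value])
        written.append(path)
    return written
```

Writing every scenario with the same file names into its own folder is what allows a later step to loop over simulation folders without scenario-specific code.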

Reproducing results and the paper (`results.R`)

The code results.R processes all of the results and generates all tables, plots, and maps of the paper. Make sure to set the right working and temporary directories in lines 24-25.

Importantly, it makes extensive use of the functions genAggregateStats() and addToGrid(), which I developed to process and/or aggregate the simulation results (in terms of climate change effects) at the SSA and/or country levels.

Finally, results.R compiles the jmp_brunoconte.tex source file. Thus, once all of the previous steps are complete, the replication package will produce the paper in .pdf format in the root folder.

References

Abel, G.J. and Cohen, J.E., 2019. Bilateral international migration flow estimates for 200 countries. Scientific data, 6(1), pp.1-13.
Borchert, I., Larch, M., Shikher, S. and Yotov, Y.V., 2021. The international trade and production database for estimation (ITPD-E). International Economics, 166, pp.140-166.
Conte, B., Desmet, K., Nagy, D.K. and Rossi-Hansberg, E., 2021. Local sectoral specialization in a warming world. Journal of Economic Geography, 21(4), pp.493-530.
Cruz, J.L. and Rossi-Hansberg, E., 2023. The economic geography of global warming. The Review of Economic Studies (forthcoming).
Florczyk, A.J., Corbane, C., Ehrlich, D., Freire, S., Kemper, T., Maffenini, L., Melchiorri, M., Pesaresi, M., Politis, P., Schiavina, M. and Sabo, F., 2019. GHSL data package 2019. Luxembourg, EUR, 29788(10.2760), p.290498.
IIASA and FAO, 2012. Global Agro-Ecological Zones (GAEZ v3.0).
IPUMS, 2020. Integrated Public Use Microdata Series, International: Version 7.3 [dataset]. Technical Report, Minnesota Population Center, Minneapolis, MN.
Nordhaus, W., Azam, Q., Corderi, D., Hood, K., Victor, N.M., Mohammed, M., Miltner, A. and Weiss, J., 2006. The G-Econ database on gridded output: methods and data. Yale University, New Haven, 6.
United Nations and Social Affairs, 2019. World Population Prospects 2019: Highlights.

¹ Most of the inputs for the replication are not publicly available because of editorial/copyright issues, but are available upon request.
² Lines 16-21 load the main required R libraries, and line 13 sets a TRUE/FALSE flag for installing all the required libraries (in lines 25-46). Make sure to have them all installed from CRAN.
³ These “regions” are SSA, the EU, SSA with the rest of the World, or SSA for different models (e.g., with a single crop).
⁴ The grid is created based on G-Econ gridcells with non-zero population. Some coastal/bordering cells are manually removed.
⁵ Specifically, it exports an $N \times N$ dummy matrix for all $i,j$ combinations that stand for internal migration choices. Analogously, it exports a $C \times N \times N$ dummy matrix where every slice along the $C$ (country) dimension contains dummies for the $i,j$ combinations that stand for migration into country $C$.
⁶ For that, the task is divided into subsets and computed in parallel. Executed with 6 cores, each cty region requires 4-10 hours.
⁷ These are numpy, timeit, os, pandas, matplotlib, and scipy.

Climate Change and Migration: the case of Africa - Replication Package

Bruno Conte

2025-06-16

