Lecture 01 - Regression Discontinuity Designs in Development Economics

class: center, middle, inverse, title-slide

.title[
# Lecture 01 - Regression Discontinuity Designs in Development Economics
]
.subtitle[
## Theory and Practice
]
.author[
### Bruno Conte
]
.institute[
### Barcelona School of Economics
]
.date[
### 08/Jul/2024
]

---

# RDD in Development Economics: This Course

- Introduce conceptual and practical aspects of <o>**regression discontinuity designs**</o>
 
 - What are regression discontinuity designs (RDD)?
 
 - How are they used in development economics research?
 
 - What is the best practice, and which tools do we need to work with it?

- Main goal: concepts + tools = practice with real-world applications
 - Concepts: theory behind causal inference and RDD assumptions
 - Tools: practical applications with `Stata` and/or `R`/`RStudio`
 
---

# RDD in Development Economics: This Course
 
- Main philosophy: a course by a development economist interested in <g>policy evaluation</g>

- Rather than a course by an econometrician/statistician!

- This course benefited from the input of many people:

- Joan Llull, Vincenzo Scrutinio, Diogo Britto, Leandro Magalhães, **Matias Cattaneo et al. (most importantly)**, ...

---

# RDD in Development Economics: This Course

This course: how to use RDD to empirically answer <o>**research questions of our interest**</o>.

.pull-left[
**You will learn**
- Concepts and theory of RDD

- Basic `R`/`Stata` programming

- Practitioner-oriented RDD best practices

- Introductory data visualization (possibly)
]

.pull-right[
**You will not learn**
- All state-of-the-art RDD methods

- Inference-related RD aspects (e.g., SE)

- To write efficient `R`/`Stata` code*

- To solve every possible RDD/data problem*
]

.footnote[
[*] This is up to you.
]

---

# RDD in Development Economics: This Course

## Main References

1. Cattaneo, M.D., Idrobo, N. and Titiunik, R., 2019. A practical introduction to regression discontinuity designs: Foundations. *Cambridge University Press*.

2. Cattaneo, M.D., Idrobo, N. and Titiunik, R., 2023. A practical introduction to regression discontinuity designs: Extensions. arXiv preprint arXiv:2301.08958.

## Useful reading

3\. Lee, D.S. and Lemieux, T., 2010. Regression discontinuity designs in economics. *Journal of economic literature*, 48(2), pp.281-355.

4\. Angrist, J.D. and Pischke, J.S., 2009. Mostly harmless econometrics: An empiricist's companion. Princeton university press.

---

# RDD in Development Economics: About us

## Instructor: Bruno Conte

- Assistant Professor of Economics, Universitat Pompeu Fabra since Jan 2024

- Previous positions: Università di Bologna, World Bank, IGC/LSE

- Research interests: Development and environmental economics

- Tools: Causal inference, international trade, and spatial economics

- Approach: <o>theory + empirics</o> = answers to policy-relevant research questions

## What about you?

---

# RDD in Development Economics: Schedule

1. Introduction to treatment effects and RDD &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**[08/Jul/2024]**
 - Potential outcomes, treatment outcomes, and selection bias
 - Introduction to RDD and survey of applications

2. Sharp RD Design (Basics) &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; **[09/Jul/2024]**

3. Sharp RD Design (Extensions) + Fuzzy RD Design &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; **[10/Jul/2024]**

4. Spatial Regression Discontinuities &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**[11/Jul/2024]**

5. Additional RDD Methods &nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**[12/Jul/2024]**

---

# RDD in Development Economics: Logistics

- Lectures: Monday to Friday, 11:30-13:30
    - 1 hour + (optional) 10-minute break + 45 minutes (potentially mixed backgrounds: **be patient**!)

- Course material: [webpage](https://brunoconteleite.github.io/07-rdd-bse/) and [syllabus](https://www.dropbox.com/scl/fi/pckgr26fquyk85irt8we4/24DE05_RDD_Development_Syllabus.pdf?rlkey=pf9s2uov6dmpg9803960ob1y8&st=sdf04mgu&raw=1)

## Practical sessions:

- Monday to Friday, 16:15-17:45 (bring your own computer with `Stata` and/or `RStudio`)

## End-of-course evaluation (ECTS-equivalent):

- Final "project" with conceptual and replication exercises (more about it next)

- Any <o>**questions**</o>?

---

# RDD in Development Economics: Logistics

Final project/end-of-course evaluation: <o>**not mandatory!!**</o> Who is interested in it?

- Students that need ECTS-equivalency (i.e., credits for the course)

- Important: it is OK if you requested equivalence but do not need it in practice!
  
- Students who want feedback on the skills acquired during the course

## Structure

At the end of practical classes, I will provide (theoretical and practical) exercises

- Final project: a compilation of the answers to all exercises in a single document!

- Ideal setting: done progressively, handed in on Friday July 12

---

.center[
# Getting started: why causal inference for

# development economics?
]

---

# Intro: Why Causal Inference for Development?

Economists use **causal inference** to give empirical content to economic relations

> *<g>Causal inference</g> consists of establishing a cause-effect relationship between two economic outcomes. It aims to answer "what if" questions*

It is an important tool for <o>**policy evaluation**</o>, especially in low-income, developing contexts

- Do micro-credit programs foster development in rural economies?

- What are the consequences of cash transfer policies in developing settings?

- Can family-targeted policies attenuate gender inequality and/or intimate violence?

Governments and institutions invest billions every year in such (and other) policies

- Causal inference tools can quantify their effectiveness (cost-benefit)

---

# Causal Inference: Structural versus Treatment Effects

> *The <o>structural approach</o> fits theory-based (economic) models of individual behaviour to data for policy evaluation (through simulations)*

- **Main advantage**: versatility

- Use structural models as a "mini-lab" for counterfactual simulations (ex ante and ex post)

- Main limitations: <g>stylized assumptions</g>, complexity, and computational cost

- Examples:

  - Income transfers and intimate violence (Ramos, 2016)

  - School construction and rural-urban market outcomes (Hsiao, 2023)

  - Rural insurance and agricultural productivity (Pietrobon, 2024)

---

# Causal Inference: Structural versus Treatment Effects

This course focuses on the <o>**treatment effects**</o> approach

> *Treatment effects evaluate the ex-post impact of an existing policy by comparing targeted (treatment) individuals with untargeted (control) ones*

- **Main advantage**: transparency (it is a comparison exercise that needs few assumptions)

- It revolutionized evidence-based economics (Angrist and Pischke, 2010)

- Main limitations: limited counterfactual answers, external validity and policy-invariance

- Examples (see Michael Kremer, Esther Duflo, Abhijit Banerjee, ...)

  - Minimum wage policies and employment (Card and Krueger, 1993)

  - School construction and labor market outcomes (Duflo, 2001)
  
  - Income transfers and development (Angelucci and De Giorgi, 2009)

---

# Treatment Effects and Causal Inference: Crucial Questions

Again, treatment effects = comparison of outcome distributions (treated versus control)

> *Main challenge for causal interpretation: ensure that the <g>control group is a reasonable counterfactual</g> for the outcomes of the treatment group in the absence of treatment*

Causal inference tools overcome this by addressing the following questions:

1. What is the causal relation of interest?

2. Which <o>research design</o> (or identification strategy) captures the causal effect of interest?

- Randomized control trials, instrumental variables, regression discontinuities, ...

3. What is the ideal inference method?

- Linear regressions, GMM, ...
 
---
 
.center[
# Basics of Treatment Effects:

# Potential Outcomes and Selection Bias
]
---

# Treatment Effects, Potential Outcomes, and Selection Bias

Consider a population of individuals, indexed by `$i$`, potentially treated by a policy. Then, define:

- `$Y_{i}(1) \equiv$` individual `$i$`'s outcome if treated, and `$Y_{i}(0)$` otherwise

- `$T_i \equiv$` treatment indicator (i.e., equal to one if `$i$` is treated)

`$Y_{i}(1), Y_{i}(0)$` are **potential outcomes** (never jointly observed) that relate to the observed outcome `$Y_i$` as
`$$Y_i = T_i Y_{i}(1) + (1- T_i) Y_{i}(0)$$`
Since only one of the two potential outcomes is observed for each `$i$`, it is impossible to infer *individual treatment effects*
`$$\tau_i = Y_{i}(1) - Y_{i}(0)$$`

---

# Treatment Effects, Potential Outcomes, and Selection Bias

Hence, we focus on inferring different **features of the distribution** of `$\tau_i$`

- Average treatment effect (ATE):
`$$\tau_{ATE} = \mathbb{E}\left[ Y_{i}(1) - Y_{i}(0) \right]$$`
- Average treatment effect on the treated (ATT):
`$$\tau_{ATT} = \mathbb{E}\left[ Y_{i}(1) - Y_{i}(0) | T_i = 1 \right]$$`

These (theoretical) causal effects have different (policy) implications

- Does the treated group `$(T_i = 1)$` <g>differ systematically</g> from the control?

- Does that affect the policy implications of the `$\tau_{ATT}$` effects?

---

# Treatment Effects, Potential Outcomes, and Selection Bias

In practice, sample comparisons are made with <g>observed outcomes</g> `$Y_i$`. Define
`$$\tau^S = \bar{Y}_T - \bar{Y}_C = \frac{1}{N_1} \sum_{i = 1}^N T_i Y_i - \frac{1}{N_0} \sum_{i = 1}^N (1-T_i) Y_i$$`
as the (sample) comparison between treated and control averages (with `$N_1 = \sum_{i} T_i$` and `$N_0 = N - N_1$`), and the *population counterpart*
`$$\tau = \mathbb{E}\left[ Y_i | T_i = 1 \right] - \mathbb{E}\left[ Y_i | T_i = 0 \right]$$`
--
One can show that `$\tau$` can be rewritten as
`$$\tau=\underbrace{\mathbb{E}\left[ Y_{i}(1) - Y_{i}(0) | T_i = 1 \right]}_{\tau_{ATT}} + \underbrace{\mathbb{E}\left[ Y_{i}(0) | T_i = 1 \right] - \mathbb{E}\left[ Y_{i}(0) | T_i = 0 \right]}_{\text{selection bias}}$$`
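
---

# Treatment Effects, Potential Outcomes, and Selection Bias

The decomposition of `$\tau$` follows from the definition of observed outcomes. A sketch of the steps, which only uses `$Y_i = T_i Y_{i}(1) + (1- T_i) Y_{i}(0)$` and adds and subtracts the (unobserved) term `$\mathbb{E}\left[ Y_{i}(0) | T_i = 1 \right]$`:
`\begin{align}
\tau &= \mathbb{E}\left[ Y_i | T_i = 1 \right] - \mathbb{E}\left[ Y_i | T_i = 0 \right] \\
&= \mathbb{E}\left[ Y_{i}(1) | T_i = 1 \right] - \mathbb{E}\left[ Y_{i}(0) | T_i = 0 \right] \\
&= \mathbb{E}\left[ Y_{i}(1) | T_i = 1 \right] - \mathbb{E}\left[ Y_{i}(0) | T_i = 1 \right] + \mathbb{E}\left[ Y_{i}(0) | T_i = 1 \right] - \mathbb{E}\left[ Y_{i}(0) | T_i = 0 \right] \\
&= \tau_{ATT} + \text{selection bias}
\end{align}`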

---

# Treatment Effects, Potential Outcomes, and Selection Bias

> *<g>Selection bias</g> is the systematic difference between treatment and control groups in terms of potential outcomes in the absence of treatment*

Motivating examples:

- Does access to schools improve labor market outcomes (Duflo, 2001)?

  - Which types of individuals enroll in university?

- Does rural microfinance affect household-level consumption smoothing, asset holdings, or occupational mobility (Kaboski and Townsend, 2012; Banerjee et al., 2015)?

  - Which villages are covered by (micro) financial institutions?
  
- ...

How can we identify `$\tau_{ATT}$` with `$\tau$` (i.e., with a comparison of sample averages)?
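
---

# Selection Bias: A Simulated Example

A minimal simulation sketch of the decomposition `$\tau = \tau_{ATT} + \text{selection bias}$`. All numbers are illustrative assumptions (a constant effect of 2, and individuals with a low `$Y_{i}(0)$` selecting into treatment). It is written in plain Python rather than `R`/`Stata` only to keep it self-contained.

```python
import random
from statistics import fmean

random.seed(0)
n = 50_000

# Potential outcomes: Y(0) differs across individuals; the effect is constant (= 2).
y0 = [random.gauss(10.0, 2.0) for _ in range(n)]
y1 = [v + 2.0 for v in y0]

# Self-selection: individuals with a low Y(0) are more likely to be treated.
t = [1 if v + random.gauss(0.0, 1.0) < 10.0 else 0 for v in y0]

# Observed outcome via the switching equation Y = T*Y(1) + (1-T)*Y(0).
y = [ti * a + (1 - ti) * b for ti, a, b in zip(t, y1, y0)]

# Naive comparison of observed means (feasible with real data).
tau_s = (fmean(v for v, ti in zip(y, t) if ti == 1)
         - fmean(v for v, ti in zip(y, t) if ti == 0))

# ATT and selection bias need both potential outcomes (only feasible in a simulation).
att = fmean(a - b for a, b, ti in zip(y1, y0, t) if ti == 1)
selection_bias = (fmean(b for b, ti in zip(y0, t) if ti == 1)
                  - fmean(b for b, ti in zip(y0, t) if ti == 0))

print(f"tau_S = {tau_s:.3f} | ATT = {att:.3f} | selection bias = {selection_bias:.3f}")
```

Under these assumptions the naive comparison even has the wrong sign: without treatment, the treated would have been worse off than the controls.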

---
 
.center[
# Identification of Treatment Effects:

# Basic Assumptions
]
---

# Assumptions for Identification: Independence
  
In the absence of selection bias, `$\tau = \tau_{ATT} \text{ (and possibly} = \tau_{ATE} \text{)}$`

- Which identification assumptions are needed?

Simplest case: <g>**independence of potential outcomes**</g>
`$$Y_{i}(1), Y_{i}(0) \quad \bot \quad T_i \quad \forall i$$`
Formally, it is represented by the equivalence of cumulative distributions `$F$`
`$$F(Y_{i}(j) | T_i = 1) = F(Y_{i}(j) | T_i = 0) = F(Y_{i}(j)) \quad \text{for } j = 0,1$$`
Under this assumption, `$\tau_{ATE} = \tau_{ATT} = \tau$`, so `$\tau^S$` is an unbiased estimate of `$\tau_{ATE}$`
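
---

# Independence: A Simulated Example

A small sketch of why independence matters, assuming heterogeneous effects and a coin-flip assignment that ignores potential outcomes. The data-generating process is hypothetical. With random assignment, the sample comparison `$\tau^S$`, the ATT, and the ATE should all be close to each other.

```python
import random
from statistics import fmean

random.seed(1)
n = 50_000

# Heterogeneous effects: tau_i is larger for individuals with a high Y(0).
y0 = [random.gauss(10.0, 2.0) for _ in range(n)]
tau_i = [1.0 + 0.5 * (v - 10.0) for v in y0]
y1 = [b + e for b, e in zip(y0, tau_i)]

# Random assignment: a coin flip that is independent of the potential outcomes.
t = [random.randint(0, 1) for _ in range(n)]
y = [ti * a + (1 - ti) * b for ti, a, b in zip(t, y1, y0)]

ate = fmean(tau_i)                                       # infeasible with real data
att = fmean(e for e, ti in zip(tau_i, t) if ti == 1)     # infeasible with real data
tau_s = (fmean(v for v, ti in zip(y, t) if ti == 1)
         - fmean(v for v, ti in zip(y, t) if ti == 0))   # feasible

print(f"ATE = {ate:.3f} | ATT = {att:.3f} | tau_S = {tau_s:.3f}")
```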

---

# Assumptions for Identification: Independence

<o>Randomized Control Trials</o> ensure unconditional independence (see Banerjee, Duflo, Kremer)

Illustration: Angelucci and De Giorgi (2009) "*How Do Cash Transfers Affect [..] Consumption?*"

.center[<img src="figs/angelucci1.png" style="width: 75%" />]

---

# Assumptions for Identification: Conditional Independence

Another case is the <g>**conditional independence assumption**</g>
`$$Y_{i}(1), Y_{i}(0) \quad \bot \quad T_i | X_i \quad \forall i,$$`
where `$X_i$` is a vector of covariates. Formally, conditional independence is equivalent to
`$$F(Y_{i}(j) | X_i) = F(Y_{i}(j) | T_i = j, X_i) = F(Y_{i} | T_i = j, X_i) \quad \text{for } j = 0,1$$`
Here, "controlling for `$X_i$`" identifies `$\tau_{ATE}$`:
`$$\tau_{ATE} = \mathbb{E} \left[ Y_{i}(1) - Y_{i}(0) \right] = \int \left( \mathbb{E} \left[ Y_{i} | T_i = 1, X_i \right] - \mathbb{E} \left[ Y_{i} | T_i = 0, X_i \right] \right) dF(X_i)$$`
This setting is also known as matching: contrasting treated and control units at each value of `$X_i$`

- Usual approach in <o>natural experiments</o> (quasi-random variation in `$T_i$`)
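
---

# Conditional Independence: A Simulated Example

A sketch of "controlling for `$X_i$`" with a single binary covariate, under assumed numbers: treatment take-up and `$Y_{i}(0)$` both depend on `$X_i$`, and the true effect is a constant 2. The stratified estimate is the sample analogue of the integral above, with cell shares playing the role of `$dF(X_i)$`.

```python
import random
from statistics import fmean

random.seed(2)
n = 60_000
true_effect = 2.0  # assumed constant treatment effect

data = []
for _ in range(n):
    x = random.randint(0, 1)                              # binary covariate X
    t = 1 if random.random() < (0.8 if x else 0.2) else 0  # P(T=1|X) depends on X only
    y0 = 10.0 + 5.0 * x + random.gauss(0.0, 1.0)          # Y(0) also depends on X
    data.append((x, t, y0 + true_effect * t))             # observed outcome


def diff_in_means(rows):
    """Treated-minus-control difference in mean outcomes within `rows`."""
    treated = fmean(y for _, t, y in rows if t == 1)
    control = fmean(y for _, t, y in rows if t == 0)
    return treated - control


# Naive comparison ignores X and is contaminated by selection on X.
naive = diff_in_means(data)

# Matching on X: compare within each value of X, then weight by the share of X.
matched = 0.0
for x_val in (0, 1):
    cell = [row for row in data if row[0] == x_val]
    matched += (len(cell) / n) * diff_in_means(cell)

print(f"naive = {naive:.3f} | matched on X = {matched:.3f} | truth = {true_effect}")
```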

---

# Assumptions for Identification: Lack of Independence and IV

Finally, we might face a situation where our policy `$T_i$` is not (conditionally) exogenous
`$$Y_{i}(1), Y_{i}(0) \quad \not\perp \quad T_i | X_i \quad \forall i$$`
In this case, we need a variable `$D_i$` that does satisfy the <g>independence assumption</g>
`$$Y_{i}(1), Y_{i}(0) \quad \perp \quad D_i | X_i \quad \forall i$$`
and also satisfies the <g>relevance condition</g>
`$$D_i \quad \not\perp \quad T_i | X_i \quad \forall i$$`
---

# Assumptions for Identification: Lack of Independence and IV

`$D_i$` is an <o>instrumental variable</o> (it identifies treatment effects for a subset of the population)
`\begin{align}
Y_i &= \tau T_i + U_i \quad \quad \text{ } \text{ [structural model]} \\
Y_i &= \rho D_i + V_i \quad \quad \text{ [reduced form]} \\
T_i &= \alpha D_i + O_i \quad \quad \text{[first stage]}
\end{align}`
One can show that, for a binary instrument `$D_i$`,
`$$\tau_{IV} = \frac{\rho}{\alpha} = \frac{\mathbb{E}\left[Y_i | D_i = 1\right] - \mathbb{E}\left[Y_i | D_i = 0\right]}{\mathbb{E}\left[T_i | D_i = 1\right] - \mathbb{E}\left[T_i | D_i = 0\right]} \equiv \text{ Wald Estimand}$$`

- This will be <g>important for fuzzy regression</g> discontinuity designs later on
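
---

# Instrumental Variables: A Simulated Example

A sketch of the Wald estimand as a ratio of two differences in means. The setup is hypothetical: a randomly assigned binary instrument, an unobserved confounder that drives both take-up and outcomes, and a constant effect of 2 (so the Wald ratio should be close to 2 while the naive comparison is not).

```python
import random
from statistics import fmean

random.seed(3)
n = 100_000
true_effect = 2.0  # assumed constant treatment effect

d, t, y = [], [], []
for _ in range(n):
    a = random.gauss(0.0, 1.0)         # unobserved confounder (e.g., ability)
    d_i = random.randint(0, 1)         # instrument: binary and randomly assigned
    t_i = 1 if a + d_i > 0.5 else 0    # take-up depends on the instrument and on a
    d.append(d_i)
    t.append(t_i)
    y.append(true_effect * t_i + a + random.gauss(0.0, 0.5))


def mean_by(values, group, g):
    """Mean of `values` over observations whose `group` entry equals g."""
    return fmean(v for v, k in zip(values, group) if k == g)


naive = mean_by(y, t, 1) - mean_by(y, t, 0)          # biased: T is correlated with a
reduced_form = mean_by(y, d, 1) - mean_by(y, d, 0)   # rho
first_stage = mean_by(t, d, 1) - mean_by(t, d, 0)    # alpha
wald = reduced_form / first_stage                    # rho / alpha

print(f"naive = {naive:.3f} | first stage = {first_stage:.3f} | Wald = {wald:.3f}")
```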

---
 
.center[
# Treatment Effects and Linear Regressions
]
---

# Treatment Effects and Linear Regressions

The potential outcomes notation has a close link with <g>linear regression models</g>
`\begin{align}
Y_i &= T_i Y_{i}(1) + (1-T_i) Y_{i}(0) \equiv \underbrace{\mathbb{E} \left[ Y_{i}(0) \right]}_{\beta_0} + \underbrace{\left[ Y_{i}(1) - Y_{i}(0) \right]}_{\beta_i} T_i + \underbrace{Y_{i}(0) - \mathbb{E} \left[ Y_{i}(0) \right]}_{U_i}
\end{align}`
An OLS regression estimates `$\mathbb{E} \left[ \bar{Y}_T - \bar{Y}_C\right] = \mathbb{E} \left[ Y_{i}(1) - Y_{i}(0) \right] = \tau_{ATE}$` if `$T_i \perp (\beta_i, U_i)$`

When this condition fails, the resulting bias is a close analogue of selection bias. To see that, suppose
`$$Y_i := \beta_0 + \beta T_i + \gamma C_i + U_i \equiv \beta_0 + \beta T_i + \bar{U}_i$$`
One can show that the OLS estimate `$\hat{\beta}$` from a regression that omits `$C_i$` has an <o>omitted variable bias</o> whenever `$\gamma \neq 0$` and `$C_i$` is correlated with `$T_i$`
`$$\hat{\beta} \overset{p}{\to} \beta + \gamma \frac{cov(C_i,T_i)}{var(T_i)} \neq \beta$$`
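
---

# Treatment Effects and Linear Regressions

A sketch of where the omitted variable bias comes from, under the additional assumption (not stated above) that `$cov(U_i, T_i) = 0$`. The OLS slope of `$Y_i$` on `$T_i$` alone converges to `$cov(Y_i,T_i)/var(T_i)$`, so
`\begin{align}
\hat{\beta} \overset{p}{\to} \frac{cov(Y_i,T_i)}{var(T_i)} &= \frac{cov(\beta_0 + \beta T_i + \gamma C_i + U_i, T_i)}{var(T_i)} \\
&= \beta + \gamma \frac{cov(C_i,T_i)}{var(T_i)} + \frac{cov(U_i,T_i)}{var(T_i)} \\
&= \beta + \gamma \frac{cov(C_i,T_i)}{var(T_i)}
\end{align}`
The bias vanishes only if `$\gamma = 0$` or if `$C_i$` is uncorrelated with `$T_i$`.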

---
 
.center[
# Taking stock: why Treatment Effects

# for RD Designs in Development Economics?
]
---

# Taking Stock: Treatment Effects and RDD

This course is about the <o>potential and applications of RDD</o> in development economics

- How does it relate to treatment effects and all the assumptions above?

In developing settings, policies or experiments such that `$Y_{i}(1), Y_{i}(0) \perp T_i | X_i$` <g>are rare, costly, and/or ethically unfeasible</g>

- It is also hard to find the right `$D_i$` instrument (again, more on that later)

---

# RDD and Treatment Effects

**Regression discontinuities** can be an alternative in these settings

- They exploit the fact that policy eligibility is usually <g>threshold-based</g>

- This creates a (quasi-) natural experiment around the threshold

- Individuals just above/below the threshold should be similar

From the 1990s onwards, RDD became part of the <o>credibility revolution</o> (Angrist and Pischke, 2010) as a versatile tool for causal inference

"Explosion" of RD studies (causally) evaluating policies in economics

> *Lee and Lemieux (2010) "<g>Regression discontinuity designs in economics</g>", Journal of Economic Literature*
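
---

# RDD and Treatment Effects: A Simulated Example

A rough sketch of the threshold logic with simulated data. Everything here is an assumption for illustration: a running variable uniform on [-1, 1], a cutoff at 0, a true jump of 1.5, and a hand-picked bandwidth with a uniform kernel. It is not the data-driven bandwidth selection or robust inference covered later in the course.

```python
import random
from statistics import fmean

random.seed(4)
n = 20_000
cutoff, jump = 0.0, 1.5    # assumed threshold and true discontinuity
bandwidth = 0.25           # chosen by hand for illustration

x = [random.uniform(-1.0, 1.0) for _ in range(n)]    # running variable
t = [1 if v >= cutoff else 0 for v in x]             # sharp eligibility rule
y = [1.0 + 0.8 * v + jump * ti + random.gauss(0.0, 0.3) for v, ti in zip(x, t)]


def fit_line(xs, ys):
    """OLS fit of ys on xs; returns (intercept, slope)."""
    mx, my = fmean(xs), fmean(ys)
    slope = (sum((a - mx) * (b - my) for a, b in zip(xs, ys))
             / sum((a - mx) ** 2 for a in xs))
    return my - slope * mx, slope


def side(lo, hi):
    """Observations with lo <= x - cutoff < hi, with x centred at the cutoff."""
    pairs = [(a - cutoff, b) for a, b in zip(x, y) if lo <= a - cutoff < hi]
    return [p[0] for p in pairs], [p[1] for p in pairs]


# Naive comparison of all treated and all control individuals.
naive = (fmean(b for b, ti in zip(y, t) if ti == 1)
         - fmean(b for b, ti in zip(y, t) if ti == 0))

# Local linear fit on each side of the cutoff; the RD estimate is the gap in intercepts.
left_intercept, _ = fit_line(*side(-bandwidth, 0.0))
right_intercept, _ = fit_line(*side(0.0, bandwidth))
rd_estimate = right_intercept - left_intercept

print(f"naive = {naive:.3f} | RD estimate = {rd_estimate:.3f} | truth = {jump}")
```

The naive comparison also picks up the slope of the outcome in the running variable, while the comparison at the threshold does not.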

---

# RDD and Treatment Effects

Comprehensive surveys of state-of-the-art (then and still now) RDD methods and applications
 
> *Cattaneo, M.D., Idrobo, N. and Titiunik, R., (2019) "A practical introduction to regression discontinuity designs: <o>Foundations</o>", Cambridge University Press*

- State-of-the-art techniques and tools ([`rd` packages](https://rdpackages.github.io/))

> *Cattaneo, M.D., Idrobo, N. and Titiunik, R., (2023) "A practical introduction to regression discontinuity designs: <o>Extensions</o>"*

- Extensions on theory and tools (local randomization, fuzzy RD, ...)

---

# Lee and Lemieux (2010): RDD Examples

.center[<img src="figs/lee1.png" style="width: 75%" />]

---

.center[<img src="figs/lee7.png" style="width: 75%" />]
.center[<img src="figs/lee2.png" style="width: 75%" />]

---

.center[<img src="figs/lee7.png" style="width: 75%" />]
.center[<img src="figs/lee3.png" style="width: 75%" />]

---

.center[<img src="figs/lee7.png" style="width: 75%" />]
.center[<img src="figs/lee4.png" style="width: 75%" />]

---

.center[<img src="figs/lee7.png" style="width: 75%" />]
.center[<img src="figs/lee5.png" style="width: 75%" />]
.center[<img src="figs/lee6.png" style="width: 75%" />]

---
 
.center[
# References
]
---

# References

- Angelucci, M. and De Giorgi, G., 2009. Indirect effects of an aid program: how do cash transfers affect ineligibles' consumption?. *American economic review*, 99(1), pp.486-508.

- Angrist, J.D. and Pischke, J.S., 2010. The credibility revolution in empirical economics: How better research design is taking the con out of econometrics. *Journal of economic perspectives*, 24(2), pp.3-30.

- Banerjee, A., Duflo, E., Glennerster, R. and Kinnan, C., 2015. The miracle of microfinance? Evidence from a randomized evaluation. *American Economic Journal: Applied Economics*, 7(1), pp.22-53.

- Card, D. and Krueger, A.B., 1993. Minimum wages and employment: A case study of the fast food industry in New Jersey and Pennsylvania.

- Campbell, D.T. and Cook, T.D., 1979. Quasi-experimentation. Chicago, IL: *Rand Mc-Nally*, 1(1), pp.1-384.

---

# References

- Hirano, K., Imbens, G.W. and Ridder, G., 2003. Efficient estimation of average treatment effects using the estimated propensity score. *Econometrica*, 71(4), pp.1161-1189.

- Hsiao, A., 2023. Educational Investment in Spatial Equilibrium: Evidence from Indonesia. Working Paper.

- Kaboski, J.P. and Townsend, R.M., 2012. The impact of credit on village economies. *American Economic Journal: Applied Economics*, 4(2), pp.98-133.

- Lee, D.S. and Lemieux, T., 2010. Regression discontinuity designs in economics. *Journal of economic literature*, 48(2), pp.281-355.

- Pitt, M.M. and Khandker, S.R., 1998. The impact of group-based credit programs on poor households in Bangladesh: Does the gender of participants matter?. *Journal of political economy*, 106(5), pp.958-996.

- Pietrobon, D., 2024. The dual role of insurance in input use: Mitigating risk versus curtailing incentives. *Journal of Development Economics*, 166, p.103203.

---

# References

- Ramos, A., 2016. Household decision making with violence: Implications for conditional cash transfer programs. *Working Paper*.