class: center, middle, inverse, title-slide

.title[
# Lecture 02 - Regression Discontinuity Designs
in Development Economics
]
.subtitle[
## Theory and Practice
]
.author[
### Bruno Conte
]
.institute[
### Barcelona School of Economics
]
.date[
### 09/Jul/2024
]

---

<style>
g { color: rgb(0,130,155) }
o { color: rgb(240,138,33) }
</style>

# RDD in Development Economics: Schedule

1. ~~Introduction to treatment effects and RDD~~ **[08/Jul/2024]**

2. Sharp RD Design (Basics) **[09/Jul/2024]**
  - RD Basics: continuity-based, sharp RD approach
  - RD plots and (local polynomial) estimation
<br>
<br>

3. Sharp RD Design (Extensions) + Fuzzy RD Design **[10/Jul/2024]**

4. Spatial Regression Discontinuities **[11/Jul/2024]**<br>
<br>

5. Additional RDD Methods **[12/Jul/2024]**

---

# Main References for this Class

1. Cattaneo, M.D., Idrobo, N. and Titiunik, R., 2019. <g>A practical introduction to regression discontinuity designs: Foundations</g>. *Cambridge University Press*.
  - Chapters 1 to 4

2. Lee, D.S. and Lemieux, T., 2010. Regression discontinuity designs in economics. *Journal of Economic Literature*, 48(2), pp.281-355.
---

# Treatment Effects and RD Designs

<u>**This course**</u>: <o>potentials and applications of RDD</o> in development economics

- Policies/experiments `\(T_i\)` such that `\(Y_{i}(1), Y_{i}(0) \perp T_i | X_i\)` are rare, costly, and/or unethical
- It is also hard to find a valid instrument `\(D_i\)`

<br>

<u>**Regression discontinuities**</u>: alternative in these settings

- Policy eligibility is usually <g>threshold-based</g>
- This creates a (quasi-) natural experiment around the threshold
- Just above/below the threshold, individuals should be similar

---

# Sharp RD Basics: Introduction

Consider a treatment `\(T_i\)` such that `\(Y_i(1), Y_i(0) \not\perp T_i | X_i\)` (selection bias)

However, if <g>threshold-based</g>, the policy creates a quasi-experiment

`$$\begin{equation}T_i=\begin{cases}1 \text{ if } X_i \geq c \\ 0 \text{ if } X_i < c\end{cases}\end{equation}$$`

Intuition: natural experiment around the threshold `\(c\)` (comparable individuals above/below)

- Social benefits conditional on family income
- Test-based public recruiting (universities, public employment, etc.)
- Poverty index-based development policies (e.g., microcredit)

<u>**Sharp RD**</u>: `\(T_i\)` is a binary, deterministic function of `\(X_i\)`

---

# Sharp RD Basics: Introduction

Canonical example:

> *Lee, D.S., 2008. Randomized experiments from non-random selection in US House elections. <g>Journal of Econometrics</g>, 142(2), pp.675-697*

How does (US) party incumbency affect performance in future elections?

- What is the effect of winning a seat on subsequent elections?
- US House of Representatives: <o>majority rule</o> between Democratic and Republican candidates

**Intuition**: in close elections, the winning candidate is (quasi-) randomly assigned

---

.center[<img src="figs/lee2008_01.png" style="width: 70%" />]

---

# Sharp RD Basics: Introduction

Example 2:

> *Britto, D.G., 2022. The employment effects of lump-sum and contingent job insurance policies: Evidence from Brazil.
<g>Review of Economics and Statistics</g>, 104(3), pp.465-482.*

How do unemployed workers react to cash-on-hand?

- What is the effect of unemployment benefits on job search behaviour?
- In Brazil, workers with income `\(< 2\)` minimum wages are entitled to a cash grant

**Intuition**: compare unemployment outcomes for barely eligible/ineligible individuals

---

.center[<img src="figs/britto2022.png" style="width: 70%" />]

---

# Sharp RD Basics: Introduction

Example 3:

> *Mejía, D., Restrepo, P. and Rozo, S.V., 2017. On the effects of enforcement on illegal markets: evidence from a quasi-experiment in Colombia. <g>The World Bank Economic Review</g>, 31(2), pp.570-594.*

Do anti-drug enforcement policies reduce cocaine production?

- What is the effect of aerial spraying targeting cultivation regions?
- A major policy in Colombia, but not applied close to the Ecuadorian border (diplomatic frictions)

**Intuition**: compare coca cultivation in regions inside/outside the diplomatic border

---

`$$\quad$$`

.pull-left[
<img src="figs/mejia1.png" style="width: 100%" />
]

.pull-right[
<img src="figs/mejia2.png" style="width: 100%" />
]

--

.center[This is an example of a <g>fuzzy RD design</g> (more on that later)!]

---

`$$\quad$$`

.pull-left[
<img src="figs/mejia1.png" style="width: 100%" />
]

.pull-right[
<img src="figs/mejia3.png" style="width: 100%" />
]

--

.center[How do we <g>formalize and estimate</g> the effect of such discontinuities?]

---

<br><br><br><br><br><br>

.center[
# Getting started: Formalizing
# Regression Discontinuity Designs
]

---

# Formalizing RD Designs

An RD design has **three fundamental components**

1. A continuous <g>score variable</g> `\(X_i\)` (also known as forcing/running variable)

2. A score cutoff `\(c \in X_i\)` (*normalized to zero*)

3. 
   A treatment assignment `\(T_i\)` (based on `\(i\)`'s score `\(X_i\)` relative to the cutoff)

<br>

<o>Key defining feature:</o> probability of treatment changes discontinuously at the cutoff `\(c\)`

- Advantage: this condition can be tested in the data!

---

# Formalizing RD Designs: Sharp versus Fuzzy RD Designs

.pull-left[
In RD designs, the assignment rule <g>is known</g>

<u>Sharp RD</u>: `\(T_i = 1 \left( X_i \geq c \right)\)`
]

.pull-right[
<img src="figs/cattaneo1.png" style="width: 100%" />
]

---

# Formalizing RD Designs: Sharp versus Fuzzy RD Designs

.pull-left[
In RD designs, the assignment rule <g>is known</g>

<u>Sharp RD</u>: `\(T_i = 1 \left( X_i \geq c \right)\)`

<u>Fuzzy RD</u>: `\(\mathbb{P}(\text{Treatment} | X_i = x)\)` changes discontinuously at `\(x = c\)`
]

.pull-right[
<img src="figs/cattaneo2.png" style="width: 100%" />
]

---

# Formalizing RD Designs: Sharp versus Fuzzy RD Designs

.pull-left[
In RD designs, the assignment rule <g>is known</g>

<u>Sharp RD</u>: `\(T_i = 1 \left( X_i \geq c \right)\)`

<u>Fuzzy RD</u>: `\(\mathbb{P}(\text{Treatment} | X_i = x)\)` changes discontinuously at `\(x = c\)`

<br>

For now, <g>we focus on Sharp RD</g>

- How to exploit the discontinuity to estimate treatment effects?
]

.pull-right[
<img src="figs/cattaneo1.png" style="width: 100%" />
]

---

# Formalizing Sharp RD Designs

<u>**Recall:**</u> our aim is to estimate the effects of `\(T_i\)` on `\(Y_i = T_i Y_i(1) + (1 - T_i) Y_i(0)\)`, but

`$$Y_i(1), Y_i(0) \not\perp T_i | X_i$$`

--

To overcome that, RD assumes treatment discontinuity

`$$\lim_{x \downarrow c} \mathbb{P}(T_i = 1 | X_i = x) \neq \lim_{x \uparrow c} \mathbb{P}(T_i = 1 | X_i = x)$$`

and <g>continuity of potential outcomes</g>

`$$\lim_{x \downarrow c} \mathbb{E}\left[Y_i(j) | X_i = x\right] = \lim_{x \uparrow c} \mathbb{E}\left[Y_i(j) | X_i = x\right] \quad \forall j = 0,1$$`

These are the assumptions of the <o>continuity-based</o> RD approach

- Last class: alternative *local randomization approach*

---

.pull-left[
We know that comparing treat/control

`$$\tau = \mathbb{E}\left[ Y_i | T_i = 1 \right] - \mathbb{E}\left[ Y_i | T_i = 0 \right]$$`

leads to selection bias (why?)
]

.pull-right[
<br>
<img src="figs/cattaneo3.png" style="width: 100%" />
]

---

.pull-left[
We know that comparing treat/control

`$$\tau = \mathbb{E}\left[ Y_i | T_i = 1 \right] - \mathbb{E}\left[ Y_i | T_i = 0 \right]$$`

leads to selection bias (why?)

<u>**RD approach**</u>: compare units just close to `\(c\)`

`\begin{align} \tau_{SRD} &= \mathbb{E}\left[ Y_i(1) - Y_i(0) | X_i = c \right] \\ \quad \\ &= \mathbb{E}\left[ Y_i(1) | X_i = c \right] - \mathbb{E}\left[ Y_i(0) | X_i = c \right] \\ \quad \\ &= \mu_{+} - \mu_{-} \end{align}`
]

.pull-right[
<br>
<img src="figs/cattaneo3.png" style="width: 100%" />
]

---

.pull-left[
We know that comparing treat/control

`$$\tau = \mathbb{E}\left[ Y_i | T_i = 1 \right] - \mathbb{E}\left[ Y_i | T_i = 0 \right]$$`

leads to selection bias (why?)

<u>**RD approach**</u>: compare units just close to `\(c\)`

`\begin{align} \tau_{SRD} &= \mathbb{E}\left[ Y_i(1) - Y_i(0) | X_i = c \right] \\ \quad \\ &= \mathbb{E}\left[ Y_i(1) | X_i = c \right] - \mathbb{E}\left[ Y_i(0) | X_i = c \right] \\ \quad \\ &= \mu_{+} - \mu_{-} \end{align}`

<o>RD is all about</o> estimating `\(\mu_{+}\)` and `\(\mu_{-}\)`

- Issue: we never observe both `\(Y_i(1)\)` and `\(Y_i(0)\)` for the same unit
]

.pull-right[
<br>
<img src="figs/cattaneo3.png" style="width: 100%" />
]

---

.pull-left[
RD relies on <g>continuity assumptions</g>

`\begin{align} \tau_{SRD} &= \mathbb{E}\left[ Y_i(1) - Y_i(0) | X_i = c \right] \\ \quad \\ &= \underbrace{\lim_{x \downarrow c}\mathbb{E}\left[ Y_i | X_i = x \right]}_{\text{Estimable}} - \underbrace{\lim_{x \uparrow c}\mathbb{E}\left[ Y_i | X_i = x \right]}_{\text{Estimable}} \\ &= \mu_{+} - \mu_{-} \end{align}`

to approximate `\(\mu_{+}\)` and `\(\mu_{-}\)` with observed `\(Y_i\)`

<u>How to approximate</u> `\(\lim\)` in practice?
]

.pull-right[
<br>
<img src="figs/cattaneo3.png" style="width: 100%" />
]

---

.pull-left[
RD relies on <g>continuity assumptions</g>

`\begin{align} \tau_{SRD} &= \mathbb{E}\left[ Y_i(1) - Y_i(0) | X_i = c \right] \\ \quad \\ &= \underbrace{\lim_{x \downarrow c}\mathbb{E}\left[ Y_i | X_i = x \right]}_{\text{Estimable}} - \underbrace{\lim_{x \uparrow c}\mathbb{E}\left[ Y_i | X_i = x \right]}_{\text{Estimable}} \\ &= \mu_{+} - \mu_{-} \end{align}`

to approximate `\(\mu_{+}\)` and `\(\mu_{-}\)` with observed `\(Y_i\)`

<u>How to approximate</u> `\(\lim\)` in practice?

- Subset within a <o>narrow bandwidth</o> `\(h\)`

<u>Important</u>: RD estimates are <g>local in nature</g>!

- Who is in the sample `\(X_i \in [c - h, c + h]\)`?
]

.pull-right[
<br>
<img src="figs/cattaneo3.png" style="width: 100%" />
]

---

.left-column[
We want

- `\(\tau_{SRD} = \mu_{+} - \mu_{-}\)`
]

.right-column[
<img src="figs/cattaneo4.png" style="width: 90%" />
]

---

.left-column[
We want

- `\(\tau_{SRD} = \mu_{+} - \mu_{-}\)`

We observe data
]

.right-column[
<img src="figs/cattaneo5.png" style="width: 90%" />
]

---

.left-column[
We want

- `\(\tau_{SRD} = \mu_{+} - \mu_{-}\)`

We observe data

And do not know `\(\mathbb{E}\left[Y_i(j) | X_i = x\right]\)`
]

.right-column[
<img src="figs/cattaneo6.png" style="width: 90%" />
]

---

.left-column[
We want

- `\(\tau_{SRD} = \mu_{+} - \mu_{-}\)`

We observe data

And do not know `\(\mathbb{E}\left[Y_i(j) | X_i = x\right]\)`

Approximate `\(\lim\)`:

- Within `\(h\)`
]

.right-column[
<img src="figs/cattaneo7.png" style="width: 90%" />
]

---

.left-column[
We want

- `\(\tau_{SRD} = \mu_{+} - \mu_{-}\)`

We observe data

And do not know `\(\mathbb{E}\left[Y_i(j) | X_i = x\right]\)`

Approximate `\(\lim\)`:

- Within `\(h\)`

`\(\hat{\mu}_{+} = \hat{\mathbb{E}}\left[Y_i(1) | X_i = c\right]\)`
]

.right-column[
<img src="figs/cattaneo8.png" style="width: 90%" />
]

---

.left-column[
We want

- `\(\tau_{SRD} = \mu_{+} - \mu_{-}\)`

We observe data

And do not know `\(\mathbb{E}\left[Y_i(j) | X_i = x\right]\)`

Approximate `\(\lim\)`:

- Within `\(h\)`

`\(\hat{\mu}_{+} = \hat{\mathbb{E}}\left[Y_i(1) | X_i = c\right]\)`

`\(\hat{\mu}_{-} = \hat{\mathbb{E}}\left[Y_i(0) | X_i = c\right]\)`
]

.right-column[
<img src="figs/cattaneo9.png" style="width: 90%" />
]

---

.left-column[
We want

- `\(\tau_{SRD} = \mu_{+} - \mu_{-}\)`

We observe data

And do not know `\(\mathbb{E}\left[Y_i(j) | X_i = x\right]\)`

Approximate `\(\lim\)`:

- Within `\(h\)`

`\(\hat{\mu}_{+} = \hat{\mathbb{E}}\left[Y_i(1) | X_i = c\right]\)`

`\(\hat{\mu}_{-} = \hat{\mathbb{E}}\left[Y_i(0) | X_i = c\right]\)`

`\(\hat{\tau}_{SRD} = \hat{\mu}_{+} - \hat{\mu}_{-}\)`
]

.right-column[
<img src="figs/cattaneo10.png" style="width: 90%" />
]

---
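
The build-up on the previous slides can be mimicked on simulated data. This is a minimal sketch and not part of the lecture material: the data-generating process, the true effect `tau = 2`, and the two bandwidths are assumptions chosen only for illustration. It contrasts the naive treated/control comparison with the same comparison restricted to a narrow window around `c`.

```python
import numpy as np

rng = np.random.default_rng(0)
n, c, tau = 5000, 0.0, 2.0            # assumed sample size, cutoff, true effect

x = rng.uniform(-1, 1, n)             # score X_i
t = (x >= c).astype(float)            # sharp rule: T_i = 1(X_i >= c)
# Assumed outcome: the score itself raises Y, so treated and control units
# differ even without treatment (selection bias in the naive comparison).
y = 1.0 + 3.0 * x + tau * t + rng.normal(0, 0.5, n)

def diff_in_means(h):
    """Mean of Y just above c minus mean of Y just below c, within bandwidth h."""
    above = (x >= c) & (x <= c + h)
    below = (x < c) & (x >= c - h)
    return y[above].mean() - y[below].mean()

naive = diff_in_means(1.0)    # uses all observations
local = diff_in_means(0.05)   # uses only units close to the cutoff

print(f"true tau: {tau:.2f}  naive: {naive:.2f}  local (h = 0.05): {local:.2f}")
```

Shrinking `h` moves the comparison closer to `tau`, but it also leaves fewer observations on each side of the cutoff.

---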
.left-column[
<br>
<br>
<br>
Any <o>**questions?**</o>
]

.right-column[
<img src="figs/cattaneo10.png" style="width: 90%" />
]

---

.left-column[
<br>
<br>
<br>
Is `\(\hat{\tau}_{SRD}\)` sensitive to

- Bandwidth choice `\(h\)`?
- Functional form?
]

.right-column[
<img src="figs/cattaneo11.png" style="width: 90%" />
]

---

.left-column[
<br>
<br>
<br>
Is `\(\hat{\tau}_{SRD}\)` sensitive to

- Bandwidth choice `\(h\)`?
- Functional form?
]

.right-column[
<img src="figs/cattaneo12.png" style="width: 90%" />
]

---

<br><br><br><br><br><br>

.center[
# How to Estimate RD Effects?
# Theory and Best Practice
]

---

# RD Theory and Best Practice

Best practice in RD work consists of <u>3 fundamental steps</u>

1\. <g>RD plots</g>: illustrate the discontinuity visually

- Transparency vis-à-vis other methods (e.g., IV)

2\. <o>RD Point Estimation</o>

- Continuity-based RD: local polynomial approximation

3\. **Validation** of RD Design

- Placebo, density tests, ... (tomorrow!)

---

# RD Theory and Best Practice: RD Plots

.pull-left[
<u>**RD plots**</u>: uncover "hidden" data patterns

<g>Meyersson (2014):</g> Islamic representation `\(\rightarrow\)` effects on female education?

- Strategy: RD based on close elections
- Units: municipalities in Turkey
- Score: Islamic party's margin of victory
- Raw data: <o>slight negative relationship</o>
]

.pull-right[
<img src="figs/meyersson1.png" style="width: 100%" />
]

---

# RD Theory and Best Practice: RD Plots

.pull-left[
<u>**RD plots**</u>: combination of

1. Binned scatter plot

2. 
   Global polynomial fits `\((\neq \text{ local?})\)`
   - Both sides of `\(c\)` (using raw data)

<g>Meyersson (2014):</g> close victories `\(\rightarrow\)` higher female education

- Technical aspects in the practical class

RD plots add <o>transparency</o> to studies
]

.pull-right[
<img src="figs/meyersson2.png" style="width: 100%" />
]

---

# RD Theory and Best Practice: RD Point Estimation

.pull-left[
<u>Illustration</u>: we estimate `\(\mu_{+}\)` with

`$$Y_i = \beta_0 + \beta_1 X_i \quad \forall X_i \in [c, c+ h]$$`

where `\(\hat{\beta}_0 \equiv \hat{\mu}_{+}\)` (normalized `\(X_i\)`)

- This regression is the estimated <g>local polynomial</g> (of order 1)
- What about `\(\neq h\)` or functional forms?
- Regression weights, ...?
]

.pull-right[
<img src="figs/cattaneo8.png" style="width: 100%" />
]

---

# RD Theory and Best Practice: RD Point Estimation

Local polynomial point estimation consists of <u>3 fundamental steps</u>

1. Choose a polynomial order `\(p\)`, kernel function `\(K(.)\)` (weights), and bandwidth `\(h\)`

2. For treated/control samples, fit weighted least squares regressions of `\(Y_i\)`

`$$Y_i = \mu_{+} + \sum\limits_{r = 1}^p \mu_{+}^r (X_i - c)^r + \varepsilon_i \quad \forall X_i \in [c, c+h] \\ Y_i = \mu_{-} + \sum\limits_{r = 1}^p \mu_{-}^r (X_i - c)^r + \varepsilon_i \quad \forall X_i \in [c-h, c]$$`

3. Calculate the Sharp RD estimate `\(\hat{\tau}_{SRD} = \hat{\mu}_{+} - \hat{\mu}_{-} \equiv \hat{\mathbb{E}}\left[ Y_i(1) - Y_i(0) | X_i = c \right]\)`

---

# RD Theory and Best Practice: RD Point Estimation

<u>Final remarks</u> on polynomial order `\(p\)`, kernel function `\(K(.)\)` (weights), and bandwidth `\(h\)`:

- Polynomial order `\(p\)` and kernel function `\(K(.)\)`: "arbitrary" choices
  - Results must be robust to `\(\neq\)` choices (validation)

- Bandwidth `\(h\)` can be optimally retrieved with a <g>data-driven procedure</g>
  - Practical session today!
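
---

# RD Theory and Best Practice: RD Point Estimation

The three estimation steps can be sketched in a few lines of NumPy. This is an illustrative sketch under stated assumptions, not the procedure used in the practical session: the simulated data, the triangular kernel, `p = 1`, and the hand-picked `h = 0.3` are all assumptions (in particular, `h` is not chosen by a data-driven procedure here), and the helper names `local_poly_intercept` and `sharp_rd` are hypothetical.

```python
import numpy as np

def local_poly_intercept(xc, y, h, p):
    """Kernel-weighted least squares of y on 1, xc, ..., xc^p; returns the intercept."""
    k = np.clip(1.0 - np.abs(xc) / h, 0.0, None)   # triangular kernel weights K(.)
    sw = np.sqrt(k)                                # WLS = OLS on sqrt-weighted data
    X = np.vander(xc, p + 1, increasing=True)      # columns: 1, xc, ..., xc^p
    coef, *_ = np.linalg.lstsq(X * sw[:, None], y * sw, rcond=None)
    return coef[0]

def sharp_rd(x, y, c, h, p=1):
    """Steps 2 and 3: fit each side of the cutoff, then difference the intercepts."""
    xc = x - c                                     # normalize the score so the cutoff is 0
    above = (xc >= 0) & (xc <= h)                  # treated side
    below = (xc < 0) & (xc >= -h)                  # control side
    mu_plus = local_poly_intercept(xc[above], y[above], h, p)
    mu_minus = local_poly_intercept(xc[below], y[below], h, p)
    return mu_plus - mu_minus

# Assumed data-generating process with a true jump of tau = 2 at c = 0
rng = np.random.default_rng(1)
n, tau = 5000, 2.0
x = rng.uniform(-1, 1, n)
t = (x >= 0).astype(float)
y = 1.0 + 3.0 * x - 2.0 * x**2 + tau * t + rng.normal(0, 0.5, n)

# Step 1: choose p, K(.), and h (here by hand)
tau_hat = sharp_rd(x, y, c=0.0, h=0.3, p=1)
print(f"true tau: {tau:.2f}  estimate: {tau_hat:.2f}")
```

Re-running `sharp_rd` with other values of `h` and `p` gives a quick feel for how sensitive the estimate is to these choices.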
---

# RD Theory and Best Practice: RD Point Estimation

On RD point estimation, one can <u>simplify points 2 and 3</u> and fit

`$$Y_i = \mu_{-} + \sum\limits_{r = 1}^p \mu_{-}^r (X_i - c)^r + \beta T_i + \sum\limits_{r = 1}^p \gamma^r (X_i - c)^r T_i + \varepsilon_i \quad \forall X_i \in [c-h, c+h]$$`

where `\(\hat{\beta} \equiv \hat{\tau}_{SRD}\)` (why is that so?)

This is the <o>common specification</o> used in applied research

--

## This afternoon

- Practical aspects of RD plots and RD estimation
- Tomorrow: theory and practice of RD validation (and more)

---

<br><br><br><br><br><br>

.center[
# Thank you and
# see you later!
]

---

# References

- Britto, D.G., 2022. The employment effects of lump-sum and contingent job insurance policies: Evidence from Brazil. *Review of Economics and Statistics*, 104(3), pp.465-482.

- Meyersson, E., 2014. Islamic Rule and the Empowerment of the Poor and Pious. *Econometrica*, 82(1), pp.229-269.
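
---

# RD Theory and Best Practice: RD Point Estimation

One way to build intuition for why the pooled specification returns `\(\hat{\tau}_{SRD}\)` is a numerical check. The sketch below is an illustration under assumptions, not part of the lecture material: the simulated data, `p = 1`, `h = 0.4`, and a uniform kernel (plain OLS within the bandwidth) are all made up for the example. Because every polynomial term is interacted with `\(T_i\)`, the pooled fit is equivalent to two separate fits, and the coefficient on `\(T_i\)` equals the difference of the two intercepts.

```python
import numpy as np

rng = np.random.default_rng(2)
n, c, h, p, tau = 4000, 0.0, 0.4, 1, 1.5       # assumed design parameters
x = rng.uniform(-1, 1, n)
t = (x >= c).astype(float)
y = 0.5 + 2.0 * x + tau * t + 1.0 * x * t + rng.normal(0, 0.3, n)

# Keep observations with X_i in [c - h, c + h] (uniform kernel)
keep = np.abs(x - c) <= h
xc, tk, yk = x[keep] - c, t[keep], y[keep]

def ols(X, y):
    """Ordinary least squares coefficients."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

P = np.vander(xc, p + 1, increasing=True)      # columns: 1, xc, ..., xc^p

# Pooled regression: columns [1, xc^r, T, T * xc^r]; column p + 1 is T itself
X_pooled = np.hstack([P, P * tk[:, None]])
beta_pooled = ols(X_pooled, yk)[p + 1]

# Separate regressions on each side of the cutoff
mu_plus = ols(P[tk == 1], yk[tk == 1])[0]
mu_minus = ols(P[tk == 0], yk[tk == 0])[0]

print(f"pooled beta: {beta_pooled:.4f}  mu_plus - mu_minus: {mu_plus - mu_minus:.4f}")
```

The two numbers agree up to floating-point error because the interacted model imposes no restriction across the two sides of the cutoff.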