class: center, middle, inverse, title-slide .title[ # Practice 02 - Regression Discontinuity Designs
in Development Economics ] .subtitle[ ## Theory and Practice ] .author[ ### Bruno Conte ] .institute[ ### Barcelona School of Economics ] .date[ ### 09/Jul/2024 ] --- <style> g { color: rgb(0,130,155) } o { color: rgb(240,138,33) } </style> # RDD in Development Economics: Practical Class 02 <u>Today's goal</u>: put into practice the basic Sharp RD concepts with `rdrobust` - Recall: Best-practice RD work consists of <g>3 fundamental steps</g> 1. <o>RD Plots:</o> visual inspection of discontinuities in observed outcomes 2. <g>RD Point Estimation:</g> done with local polynomial approximation (i.e., continuity-based approach) 3. ~~RD Validation~~ (tomorrow!) **Material** for this class [<u>here</u>](https://www.dropbox.com/scl/fi/7llotj2ke1shw6p5qpgej/00_practice02.zip?rlkey=md2li6ergf7ko74x2tulz7fx2&st=go5xv2sb&dl=1). Installation: ```r install.packages('rdrobust') ``` ```stata net install rdrobust, from(https://raw.githubusercontent.com/rdpackages/rdrobust/master/stata) replace ``` --- <br><br><br><br><br><br> .center[ # RD Plots ] --- # Practical Class 02: RD Plots .pull-left[ <u>**RD plots**</u>: uncover "hidden" data patterns <g>Meyersson (2014):</g> Islamic representation `\(\rightarrow\)` effects on female education? - Strategy: RD disc. on close elections - Units: municipalities in Turkey - Score: Islamic party's margin of victory - Raw data: <o>slight negative relationship</o> - RD Plot: discontinuous, <g>positive effect</g>! ] .pull-right[ <img src="figs/meyersson1.png" style="width: 100%" /> ] --- # Practical Class 02: RD Plots .pull-left[ <u>**RD plots**</u>: uncover "hidden" data patterns <g>Meyersson (2014):</g> Islamic representation `\(\rightarrow\)` effects on female education? - Strategy: RD disc. on close elections - Units: municipalities in Turkey - Score: Islamic party's margin of victory - Raw data: <o>slight negative relationship</o> - RD Plot: discontinuous, <g>positive effect</g>! 
] .pull-right[ <img src="figs/meyersson2.png" style="width: 100%" /> ] --- # Practical Class 02: RD Plots .pull-left[ Raw plots, loading `data_meyersson.csv` ```r dset <- read.csv('data_meyersson.csv') plot(dset$X, dset$Y) ``` ] .pull-right[ <img src="figs/class02/unnamed-chunk-4-1.png" width="90%" /> ] --- # Practical Class 02: RD Plots .pull-left[ Raw plots, loading `data_meyersson.csv` ```r dset <- read.csv('data_meyersson.csv') plot( dset$X, dset$Y, xlab = "Score", # label x axis ylab = "Outcome", # label y axis col = 1, # black color pch = 20 # circle dots ) abline(v=0) # add vertical line X=0 ``` ] .pull-right[ <img src="figs/class02/unnamed-chunk-6-1.png" width="90%" /> ] --- # Practical Class 02: RD Plots .pull-left[ `rdplot`: part of the `rdrobust` package ```r install.packages('rdrobust') library(rdrobust) dset <- read.csv('data_meyersson.csv') rdplot(dset$Y, dset$X) ``` ] .pull-right[ <img src="figs/class02/unnamed-chunk-8-1.png" width="90%" /> ] --- # Practical Class 02: RD Plots .pull-left[ `rdplot`: part of the `rdrobust` package ```r install.packages('rdrobust') library(rdrobust) dset <- read.csv('data_meyersson.csv') rdplot( dset$Y, dset$X, nbins = c(20, 20), # number of bins binselect = "es", # bin type y.lim = c(0,25) # limits y axis ) ``` ] .pull-right[ <img src="figs/class02/unnamed-chunk-10-1.png" width="90%" /> ] --- <g>Important:</g> RD plots store important information (in `Stata`, printed in the console)! ```r output <- rdplot(dset$Y,dset$X,nbins = c(20, 20),binselect = "es",y.lim = c(0,25)) summary(output) ``` ``` ## Call: rdplot ## ## Number of Obs. 2629 ## Kernel Uniform ## ## Number of Obs. 2314 315 ## Eff. Number of Obs. 2314 315 ## Order poly. fit (p) 4 4 ## BW poly. 
fit (h) 100.000 99.051 ## Number of bins scale 1 1 ## ## Bins Selected 20 20 ## Average Bin Length 5.000 4.953 ## Median Bin Length 5.000 4.953 ## ## IMSE-optimal bins 11 7 ## Mimicking Variance bins 40 75 ## ## Relative to IMSE-optimal: ## Implied scale 1.818 2.857 ## WIMSE variance weight 0.143 0.041 ## WIMSE bias weight 0.857 0.959 ``` --- # Practical Class 02: RD Plots (R versus Stata) ## R command ```r rdplot(dset$Y,dset$X,nbins = c(20, 20),binselect = "es") ``` ## Stata equivalent ```stata rdplot Y X, nbins(20 20) binselect(es) ``` --- # Practical Class 02: RD Plots <g>Important:</g> method of bin choice `\(\rightarrow \neq\)` results (`binselect = "es"` vs `binselect = "qs"`) .center[ <img src="figs/cattaneo22.png" style="width: 80%" /> ] --- # Practical Class 02: RD Plots .pull-left[ <g>Number of bins:</g> can be optimally retrieved - Method: minimize integrated MSE (IMSE) - How to? Omit `nbins` ```r rdplot( dset$Y, dset$X, nbins = c(20, 20), # number of bins binselect = "es", # bin type y.lim = c(0,25) # limits y axis ) ``` ] .pull-right[ <img src="figs/class02/unnamed-chunk-16-1.png" width="90%" /> ] --- # Practical Class 02: RD Plots .pull-left[ <g>Number of bins:</g> can be optimally retrieved - Method: minimize integrated MSE (IMSE) - How to? Omit `nbins` ```r rdplot( dset$Y, dset$X, # nbins = c(20, 20), binselect = "es", # bin type y.lim = c(0,25) # limits y axis ) ``` ] .pull-right[ <img src="figs/class02/unnamed-chunk-18-1.png" width="90%" /> ] --- ```r summary(output) ``` ``` ## Call: rdplot ## ## Number of Obs. 2629 ## Kernel Uniform ## ## Number of Obs. 2314 315 ## Eff. Number of Obs. 2314 315 ## Order poly. fit (p) 4 4 ## BW poly. 
fit (h) 100.000 99.051 ## Number of bins scale 1 1 ## ## Bins Selected 11 7 ## Average Bin Length 9.091 14.150 ## Median Bin Length 9.091 14.150 ## ## IMSE-optimal bins 11 7 ## Mimicking Variance bins 40 75 ## ## Relative to IMSE-optimal: ## Implied scale 1.000 1.000 ## WIMSE variance weight 0.500 0.500 ## WIMSE bias weight 0.500 0.500 ``` --- # Practical Class 02: RD Plots .pull-left[ <g>Polynomial order:</g> default is `p = 4` - What if quadratic? ```r rdplot( dset$Y, dset$X, nbins = c(20, 20), # number of bins * p = 2, # polynomial order binselect = "es", # bin type y.lim = c(0,25) # limits y axis ) ``` ] .pull-right[ <img src="figs/class02/unnamed-chunk-21-1.png" width="90%" /> ] --- # Practical Class 02: RD Plots .pull-left[ <g>Polynomial order:</g> default is `p = 4` - What if linear? ```r rdplot( dset$Y, dset$X, nbins = c(20, 20), # number of bins * p = 1, # polynomial order binselect = "es", # bin type y.lim = c(0,25) # limits y axis ) ``` ] .pull-right[ <img src="figs/class02/unnamed-chunk-23-1.png" width="90%" /> ] --- # Wrapping Up: RD Plots <g>RD Plots:</g> crucial ingredient in RD work - Motivates and illustrates, visually, the experiment at hand - Flexible and credible implementation with `rdplot` - More details: Cattaneo et al. 
(2019), chapter 3 --- <br><br><br><br><br><br> .center[ # RD Point Estimation of `\(\tau_{SRD}\)` ] --- # Practical Class 02: RD Point Estimation RD point estimation of `\(\tau_{SRD}\)` with `rdrobust` - Local polynomial approximation on both sides of `\(c \rightarrow \hat{\tau}_{SRD} = \hat{\mu}_{+} - \hat{\mu}_{-}\)` ## R version ```r rdrobust(dset$Y, dset$X, kernel = "uniform", p = 1, h = 20) ``` ## Stata version ```stata rdrobust Y X, kernel(uniform) p(1) h(20) ``` <u>Important inputs</u>: polynomial `\(p\)`, bandwidth `\(h\)`, and weights `\(K()\)` --- # Practical Class 02: RD Point Estimation .pull-left[ <u>Polynomial `\(p\)`</u>: local approach `\(\rightarrow p \leq 2\)` - Robustness to `\(p>2\)` ] .pull-right[ <img src="figs/cattaneo10.png" style="width: 100%" /> ] --- # Practical Class 02: RD Point Estimation .pull-left[ <u>Polynomial `\(p\)`</u>: local approach `\(\rightarrow p \leq 2\)` - Robustness to `\(p>2\)` <u>Kernel function `\(K()\)`</u>: weights on regression - Usually triangular (no weights = uniform) ] .pull-right[ <img src="figs/cattaneo10.png" style="width: 100%" /> ] --- # Practical Class 02: RD Point Estimation .pull-left[ <u>Polynomial `\(p\)`</u>: local approach `\(\rightarrow p \leq 2\)` - Robustness to `\(p>2\)` <u>Kernel function `\(K()\)`</u>: weights on regression - Usually triangular (no weights = uniform) ] .pull-right[ <img src="figs/cattaneo23.png" style="width: 100%" /> ] --- # Practical Class 02: RD Point Estimation .pull-left[ <u>Polynomial `\(p\)`</u>: local approach `\(\rightarrow p \leq 2\)` - Robustness to `\(p>2\)` <u>Kernel function `\(K()\)`</u>: weights on regression - Usually triangular (no weights = uniform) <u>Bandwidth `\(h\)`</u>: bias-variance trade-off - Data-driven optimal choice: `$$h^* = \arg\min_{h} \left[ \text{bias}(\hat{\tau}_{SRD}(h))^2 + \text{variance}(\hat{\tau}_{SRD}(h)) \right]$$` ] .pull-right[ <img src="figs/cattaneo10.png" style="width: 100%" /> ] --- # Practical Class 02: RD Point Estimation 
<g>Illustration</g> with Meyersson (2014) - Linear and quadratic polynomials - Uniform kernel (no weights) and triangular - Bandwidth `\(h = 20\)` (only elections with victory margins between -20% and 20%) ```r reg <- rdrobust(dset$Y, dset$X, kernel = "uniform", p = 1, h = 20) summary(reg) ``` --- ``` ## Sharp RD estimates using local polynomial regression. ## ## Number of Obs. 2629 ## BW type Manual ## Kernel Uniform ## VCE method NN ## ## Number of Obs. 2314 315 ## Eff. Number of Obs. 608 280 ## Order est. (p) 1 1 ## Order bias (q) 2 2 ## BW est. (h) 20.000 20.000 ## BW bias (b) 20.000 20.000 ## rho (h/b) 1.000 1.000 ## Unique Obs. 2314 315 ## ## ============================================================================= ## Method Coef. Std. Err. z P>|z| [ 95% C.I. ] ## ============================================================================= ## Conventional 2.927 1.235 2.371 0.018 [0.507 , 5.347] ## Robust - - 1.636 0.102 [-0.582 , 6.471] ## ============================================================================= ``` --- ``` ## Sharp RD estimates using local polynomial regression. ## ## Number of Obs. 2629 ## BW type Manual ## Kernel Triangular ## VCE method NN ## ## Number of Obs. 2314 315 ## Eff. Number of Obs. 608 280 ## Order est. (p) 1 1 ## Order bias (q) 2 2 ## BW est. (h) 20.000 20.000 ## BW bias (b) 20.000 20.000 ## rho (h/b) 1.000 1.000 ## Unique Obs. 2314 315 ## ## ============================================================================= ## Method Coef. Std. Err. z P>|z| [ 95% C.I. ] ## ============================================================================= ## Conventional 2.937 1.343 2.187 0.029 [0.305 , 5.569] ## Robust - - 1.379 0.168 [-1.117 , 6.414] ## ============================================================================= ``` --- ``` ## Sharp RD estimates using local polynomial regression. ## ## Number of Obs. 2629 ## BW type Manual ## Kernel Triangular ## VCE method NN ## ## Number of Obs. 2314 315 ## Eff. Number of Obs. 
608 280 ## Order est. (p) 2 2 ## Order bias (q) 3 3 ## BW est. (h) 20.000 20.000 ## BW bias (b) 20.000 20.000 ## rho (h/b) 1.000 1.000 ## Unique Obs. 2314 315 ## ## ============================================================================= ## Method Coef. Std. Err. z P>|z| [ 95% C.I. ] ## ============================================================================= ## Conventional 2.649 1.921 1.379 0.168 [-1.117 , 6.414] ## Robust - - 0.420 0.674 [-3.969 , 6.135] ## ============================================================================= ``` --- # Practical Class 02: RD Point Estimation <g>Illustration</g> with Meyersson (2014) - Linear polynomial - Triangular kernel - ~~Bandwidth `\(h = 20\)` (only elections with victory margins between -20% and 20%)~~ - Data-driven bandwidth `\(h\)` choice: omit `h=20`, use `bwselect =` - `mserd`: common `\(h\)` on both sides that minimizes MSE - `msetwo`: different `\(h\)` on each side ```r reg <- rdrobust(dset$Y, dset$X, kernel = "triangular", p = 1, bwselect = "mserd") summary(reg) ``` --- ``` ## Sharp RD estimates using local polynomial regression. ## ## Number of Obs. 2629 ## BW type mserd ## Kernel Triangular ## VCE method NN ## ## Number of Obs. 2314 315 ## Eff. Number of Obs. 529 266 ## Order est. (p) 1 1 ## Order bias (q) 2 2 ## BW est. (h) 17.240 17.240 ## BW bias (b) 28.576 28.576 ## rho (h/b) 0.603 0.603 ## Unique Obs. 2311 315 ## ## ============================================================================= ## Method Coef. Std. Err. z P>|z| [ 95% C.I. ] ## ============================================================================= ## Conventional 3.020 1.427 2.116 0.034 [0.223 , 5.816] ## Robust - - 1.776 0.076 [-0.309 , 6.276] ## ============================================================================= ``` --- ``` ## Sharp RD estimates using local polynomial regression. ## ## Number of Obs. 2629 ## BW type msetwo ## Kernel Triangular ## VCE method NN ## ## Number of Obs. 2314 315 ## Eff. Number of Obs. 607 267 ## Order est. 
(p) 1 1 ## Order bias (q) 2 2 ## BW est. (h) 19.967 17.360 ## BW bias (b) 32.279 29.729 ## rho (h/b) 0.619 0.584 ## Unique Obs. 2311 315 ## ## ============================================================================= ## Method Coef. Std. Err. z P>|z| [ 95% C.I. ] ## ============================================================================= ## Conventional 2.969 1.391 2.134 0.033 [0.243 , 5.695] ## Robust - - 1.810 0.070 [-0.245 , 6.152] ## ============================================================================= ``` --- # Wrapping Up: RD Point Estimation <g>RD Estimation:</g> easily implemented with `rdrobust` - Powerful toolbox for RD estimation under different parameter choices - Parameter choices matter: results can be sensitive - Yet to be seen: <o>**how to validate**</o> an RD design? - Tomorrow! --- <br><br><br><br><br><br> .center[ # Take-Home/Assignment Exercises ] --- # Take-Home/Assignment Questions (2/4) .pull-left[ <g>Part 1:</g> use `data_meyersson.csv` Replicate the following: - Standard RD Plot, equally sized bins ] .pull-right[ <img src="figs/class02/unnamed-chunk-33-1.png" width="90%" /> ] --- # Take-Home/Assignment Questions (2/4) .pull-left[ <g>Part 1:</g> use `data_meyersson.csv` Replicate the following: - Standard RD Plot, equally sized bins - Then, zoom in to `\(h=25\)` with a linear polynomial - And estimate `\(\hat{\tau}_{SRD}\)` (uniform `\(K()\)`) ] .pull-right[ <img src="figs/class02/unnamed-chunk-34-1.png" width="90%" /> ] --- # Take-Home/Assignment Questions (2/4) .pull-left[ <g>Part 1:</g> use `data_meyersson.csv` Replicate the following: - Standard RD Plot, equally sized bins - Then, zoom in to `\(h=25\)` with a linear polynomial - And estimate `\(\hat{\tau}_{SRD}\)` (uniform `\(K()\)`) <u>**Final challenge**</u>: retrieve `\(\hat{\mu}_{+} - \hat{\mu}_{-} = \hat{\tau}_{SRD}\)` - With separate (local) linear regressions (`lm`) ] .pull-right[ <img src="figs/class02/unnamed-chunk-35-1.png" width="90%" /> ] --- # 
Take-Home/Assignment Questions (2/4) .pull-left[ <g>Part 2:</g> Alix-Garcia et al. (2013) "*The Ecological Footprint of Poverty Alleviation: Evidence from Mexico's Oportunidades Program*" - Did these cash transfers lead to deforestation? - Program: <o>threshold based</o> on a marginality index - Eligible municipalities: those with index `\(> -1.2\)` ] .pull-right[ <img src="figs/alix1.png" style="width: 100%" /> ] --- # Take-Home/Assignment Questions (2/4) .pull-left[ With `data_alixgarcia.csv` - Outcome: `pctdefor` - Score: `indice95` <g>RD Plot:</g> - Optimal number of bins ] .pull-right[ <img src="figs/class02/unnamed-chunk-36-1.png" width="90%" /> ] --- # Take-Home/Assignment Questions (2/4) .pull-left[ With `data_alixgarcia.csv` - Outcome: `pctdefor` - Score: `indice95` <g>RD Plot:</g> - Unrestricted (optimal number of bins) - With 20 bins on each side Then, <o>RD Point Estimate</o> - Use `\(h=1\)` - Multiply `pctdefor * 1e6` ] .pull-right[ <img src="figs/class02/unnamed-chunk-37-1.png" width="90%" /> ] --- ``` ## Sharp RD estimates using local polynomial regression. ## ## Number of Obs. 58587 ## BW type Manual ## Kernel Triangular ## VCE method NN ## ## Number of Obs. 3639 54948 ## Eff. Number of Obs. 3571 12279 ## Order est. (p) 1 1 ## Order bias (q) 2 2 ## BW est. (h) 1.000 1.000 ## BW bias (b) 1.000 1.000 ## rho (h/b) 1.000 1.000 ## Unique Obs. 3639 54948 ## ## ============================================================================= ## Method Coef. Std. Err. z P>|z| [ 95% C.I. ] ## ============================================================================= ## Conventional 399.298 151.453 2.636 0.008 [102.455 , 696.141] ## Robust - - 0.980 0.327 [-197.815 , 593.287] ## ============================================================================= ``` --- # Take-Home/Assignment Questions (2/4) .pull-left[ <g>Part 3:</g> Lalive (2008) "*How do extended benefits affect unemployment duration? 
A regression discontinuity approach*" - Similar to Britto (2022); benefits `\(\leftrightarrow\)` unemployment duration - Austria: extension of benefits in specific regions - <u>Stronger effects for women</u> ] .pull-right[ <img src="figs/lalive1.png" style="width: 100%" /> ] --- # Take-Home/Assignment Questions (2/4) .pull-left[ With `data_lalive.csv` - Outcome: `unemployment_duration` - What is the score `\(X_i\)` and cutoff `\(c\)`? Produce 1\. RD Plot that resembles the paper's - Warning message: discrete score! 2\. What is `\(\hat{\tau}_{SRD}\)`, and how would you interpret it? ] .pull-right[ ``` ## [1] "Mass points detected in the running variable." ``` <img src="figs/class02/unnamed-chunk-39-1.png" width="90%" /> ] --- <br><br><br><br><br><br> .center[ # Thank you and # see you tomorrow! ] --- # References - Alix-Garcia, J., McIntosh, C., Sims, K.R. and Welch, J.R., 2013. The ecological footprint of poverty alleviation: evidence from Mexico's Oportunidades program. *Review of Economics and Statistics*, 95(2), pp.417-435. - Lalive, R., 2008. How do extended benefits affect unemployment duration? A regression discontinuity approach. *Journal of Econometrics*, 142(2), pp.785-806. - Meyersson, E., 2014. Islamic Rule and the Empowerment of the Poor and Pious. *Econometrica*, 82(1), pp.229-269.
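---

# Appendix: Local Linear Regressions by Hand

The final challenge in Part 1 asks for the difference in intercepts from separate local linear regressions. A minimal base-R sketch of that idea follows, assuming a uniform kernel, a cutoff at 0, and simulated data. The data frame `sim`, its variable names, and its true jump of 3 are hypothetical and unrelated to `data_meyersson.csv`.

```r
# Simulated sharp RD data (hypothetical; NOT the Meyersson data)
set.seed(1)
n   <- 2000
X   <- runif(n, -100, 100)                 # score, cutoff c = 0
D   <- as.numeric(X >= 0)                  # treatment indicator
Y   <- 10 + 0.05 * X + 3 * D + rnorm(n)    # true jump at the cutoff = 3
sim <- data.frame(Y, X, D)

h      <- 25                               # manual bandwidth
window <- sim[abs(sim$X) <= h, ]           # uniform kernel: equal weights inside h

# Separate linear fits on each side of the cutoff
fit_minus <- lm(Y ~ X, data = window[window$D == 0, ])
fit_plus  <- lm(Y ~ X, data = window[window$D == 1, ])

# Intercepts are the fitted values at X = 0, i.e. mu_minus and mu_plus
mu_minus <- unname(coef(fit_minus)["(Intercept)"])
mu_plus  <- unname(coef(fit_plus)["(Intercept)"])
tau_hat  <- mu_plus - mu_minus

# Same number from one fully interacted regression: coefficient on D
pooled     <- lm(Y ~ D * X, data = window)
tau_pooled <- unname(coef(pooled)["D"])

print(c(tau_hat = tau_hat, tau_pooled = tau_pooled))
```

With a uniform kernel and `p = 1`, this difference in intercepts should match the conventional coefficient that `rdrobust` reports for the same data and bandwidth. The standard errors will differ, since `lm` does not use the nearest-neighbor (`NN`) variance estimator shown in the `rdrobust` output.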