Horseshoes and Hand Grenades

In a week full of annoyances (or worse) at seemingly every turn, I was again confronted by a big pet peeve of mine: the unilateral dismissal by researchers of whichever estimator they happen to find unpalatable, for whatever reason.



There are only two good reasons to completely dismiss an estimator. First, if the assumptions it requires are strictly stronger than those of another estimator. Second, if the estimator performs very poorly in finite samples even when its assumptions hold.

The object of this week's disdain was Instrumental Variables (IV). While I, myself, have warned several times of the temperamental nature of IV on this blog (see here and here), IV is a worthy tool in the empiricist's toolkit.


If you need to be convinced, simply simulate some data that correspond to the following data structure and estimate the model.


Here, Y is the outcome, X is the endogenous regressor, U are unobserved determinants of Y, and Z is the instrument. Miraculous!
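If you would rather see it than take my word for it, here is a minimal Python sketch of that exercise (the parameter values and variable names are mine, purely for illustration): U confounds both X and Y, Z shifts X but enters Y only through X, OLS is noticeably biased, and IV comes very close to the true coefficient.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Data-generating process: U confounds X and Y; Z affects Y only through X
U = rng.normal(size=n)                          # unobserved determinant of Y
Z = rng.normal(size=n)                          # instrument, independent of U
X = 0.5 * Z + 0.8 * U + rng.normal(size=n)      # endogenous regressor (first stage)
Y = 1.0 + 2.0 * X + U + rng.normal(size=n)      # true effect of X on Y is 2.0

# OLS of Y on X: biased upward because Cov(X, U) > 0
b_ols = np.cov(X, Y)[0, 1] / np.var(X, ddof=1)

# Just-identified IV (equivalent to 2SLS here): Cov(Z, Y) / Cov(Z, X)
b_iv = np.cov(Z, Y)[0, 1] / np.cov(Z, X)[0, 1]

print(f"OLS estimate: {b_ols:.2f}")   # roughly 2.4 (biased)
print(f"IV estimate:  {b_iv:.2f}")    # roughly 2.0 (consistent)
```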

Formally, consistency of IV requires three assumptions (although some people condense them into two):

A1. First-Stage. Cov(Z,X) ≠ 0. The instrument must be correlated with the endogenous regressor. In the Directed Acyclic Graph (DAG), this is represented by an arrow from Z to X.

A2. Exogeneity. Cov(Z,U) = 0. The instrument must be uncorrelated with unobserved determinants of the outcome. In the DAG, this is represented by the absence of an arrow from Z to U.

A3. Excludability. Cov(Z,Y|X) = 0. The instrument must be correlated with the outcome only through the endogenous regressor. Equivalently, there is no direct effect of the instrument on the outcome. In the DAG, this is represented by the absence of an arrow from Z to Y.

Sometimes A2 and A3 are combined; together, they amount to the assumption that the only path from Z to Y goes through X (no back-door paths and no direct arrow). [Paul, are you impressed?]


Under A1-A3, IV produces a consistent estimate of the parameter of interest, β. Given this, there is no reason to unilaterally reject the IV estimator as a viable option unless its finite-sample properties are poor. And, well, they can be. But ... in ways that are transparent to the researcher. As stated in my previous posts, IV is biased in finite samples. And, this bias is worse when the instrument is weak (i.e., Cov(Z,X) ≈ 0). However, the sample size is known and tests for first-stage strength are readily available.
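As a quick illustration of that last point, here is a small Python sketch of the usual first-stage diagnostic, the F-statistic on the instrument in the regression of X on Z (a homoskedastic version, for illustration only; the conventional F > 10 rule of thumb is just that, a rule of thumb).

```python
import numpy as np

def first_stage_F(X, Z):
    """Homoskedastic first-stage F-statistic from the regression of X on a constant
    and a single instrument Z (with one instrument, it equals the squared t-statistic on Z)."""
    n = len(X)
    W = np.column_stack([np.ones(n), Z])
    coef, *_ = np.linalg.lstsq(W, X, rcond=None)
    resid = X - W @ coef
    sigma2 = resid @ resid / (n - 2)
    se = np.sqrt(sigma2 * np.linalg.inv(W.T @ W)[1, 1])
    return (coef[1] / se) ** 2
```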

So, why then might some researchers be so bold as to summarily dismiss this wonderful piece of econometric machinery? I think the answer lies in the plausibility of all of the assumptions necessary for identification holding simultaneously, where we can replace Assumption A1 with a stronger, more practical version:

A1'. First-Stage. Cov(Z,X) >> 0. The instrument must be strongly correlated with the endogenous regressor. 

Finding an instrument that is strongly correlated with X, but uncorrelated with U and has no direct effect on Y itself is, to say the least, a challenge. Instruments that are arguably excludable and exogenous are often weakly correlated with X. Instruments that are strongly correlated with X are often difficult to justify as exogenous and/or excludable. This is quite a conundrum we find ourselves in. 


[I imagine another reason one might dismiss IV is that with heterogeneous treatment effects, the parameter being estimated -- the Local Average Treatment Effect (LATE) -- may be deemed to be uninteresting. We'll leave that for another day. Or never.]

Anyway, I don't dispute any of this. But, I think what many empirical researchers are unaware of are the fancy shmansy new toys that econometricians have come up with for us. All we have to do is put them to use. If we do, not only might we learn something interesting, but we also find yet another way to distinguish the methods we use from the run-of-the-mill applied paper.

In particular, a plethora of papers from the past decade enables us to learn something even if the assumptions required for IV to be consistent do not hold exactly. Sometimes things do not have to be spot on. Sometimes close is good enough. As in ... horseshoes and hand grenades.


There are too many papers in this area to bore you with all the details, but suffice it to say, it is a very interesting literature that is (mostly) easily accessible to applied researchers. 

In one branch of this literature, Conley et al. (2012) and van Kippersluis & Rietveld (2018) relax the excludability assumption, A3. In the DAG, they allow for an arrow going directly from Z to Y. In a linear regression framework, the data-generating process is assumed to be

Y = a + bX + cZ + U
X = d + eZ + V

Whereas A3 requires c = 0, these papers relax this assumption. As an aside, one of the reasons I love the Conley et al. (2012) paper is because instead of simply giving up and moving on (likely to a difference-in-differences paper) when c ≠ 0, they dig in, follow the math, and develop a solution. A good lesson for young (and old) researchers. 


Anyway, Conley et al. (2012) begin by noting that if c were known, a simple solution would be immediately available: move cZ to the left-hand side of the equation and apply the traditional IV estimator to the new estimating equation

Y-tilde = a + bX + U,

where Y-tilde = Y - cZ. Since in practice c is unknown, the authors suggest conducting a grid search over plausible values of c and taking the union of the confidence intervals for b-hat(c), where b-hat(c) is the IV estimate of b for a given value of c. This is available in Stata with the command -plausexog-.
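For the curious, here is a rough Python sketch of that union-of-confidence-intervals idea. To be clear, this is not the -plausexog- implementation: the homoskedastic standard errors, the 95% level, and the choice of grid are all simplifications on my part.

```python
import numpy as np

def iv_fit(Y, X, Z):
    """Just-identified 2SLS of Y on (1, X) using (1, Z) as instruments.
    Returns the slope and its standard error (homoskedastic, for illustration only)."""
    n = len(Y)
    W = np.column_stack([np.ones(n), X])                 # regressors
    V = np.column_stack([np.ones(n), Z])                 # instruments
    W_hat = V @ np.linalg.lstsq(V, W, rcond=None)[0]     # first-stage fitted values
    beta = np.linalg.lstsq(W_hat, Y, rcond=None)[0]
    resid = Y - W @ beta                                 # residuals use the original X
    sigma2 = resid @ resid / (n - 2)
    cov = sigma2 * np.linalg.inv(W_hat.T @ W_hat)
    return beta[1], np.sqrt(cov[1, 1])

def plausibly_exogenous_bounds(Y, X, Z, c_grid):
    """Union of 95% IV confidence intervals for b over a grid of direct effects c,
    in the spirit of Conley et al. (2012)."""
    lo, hi = np.inf, -np.inf
    for c in c_grid:
        b, se = iv_fit(Y - c * Z, X, Z)                  # net out the assumed direct effect
        lo, hi = min(lo, b - 1.96 * se), max(hi, b + 1.96 * se)
    return lo, hi
```

Feeding in the simulated data from the first sketch with, say, c_grid = np.linspace(0, 0.2, 21) returns an interval that still covers the true b = 2, just wider than the standard IV confidence interval.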

Again, aside from the specifics of this case, this method illustrates another useful trick for researchers. If your life would be easy if you knew something, but you don't know this something, then do a grid search. Grid searches are highly useful! This is a version of partial identification (see my previous post here). 



van Kippersluis & Rietveld (2018) extend the Conley et al. (2012) approach to situations where Z is uncorrelated with X in a particular sub-sample of one's data. For example, among the sub-sample of always-takers, Z and X are independent. In this sub-sample, Ordinary Least Squares (OLS) can be used to estimate the direct effect of Z on Y, since the endogeneity of X does not bias the coefficient on Z there. This allows one to pin down c and apply the Conley et al. (2012) IV estimator on the full sample without needing to resort to a grid search (unless one wishes to do a small grid search to account for the estimation uncertainty associated with the OLS estimate, c-hat).
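A minimal sketch of that two-step logic is below. It is not their exact procedure, and the boolean mask no_first_stage (flagging the sub-sample in which Z does not move X) is hypothetical; identifying such a sub-sample is the application-specific part.

```python
import numpy as np

def vkr_two_step(Y, X, Z, no_first_stage):
    """Rough sketch of the van Kippersluis & Rietveld (2018) idea: estimate the
    direct effect c of Z on Y in a sub-sample where Z does not affect X, then
    apply the Conley-style correction Y - c_hat*Z in the full sample."""
    # Step 1: OLS of Y on (1, X, Z) in the sub-sample; the coefficient on Z is
    # not contaminated by X's endogeneity there because Z and X are independent
    # in that sub-sample (and Z is exogenous by A2).
    s = no_first_stage
    W = np.column_stack([np.ones(s.sum()), X[s], Z[s]])
    c_hat = np.linalg.lstsq(W, Y[s], rcond=None)[0][2]

    # Step 2: net out the estimated direct effect and run plain just-identified IV
    Y_tilde = Y - c_hat * Z
    b_hat = np.cov(Z, Y_tilde)[0, 1] / np.cov(Z, X)[0, 1]
    return c_hat, b_hat
```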


In another branch of this literature, Ashley (2009), Nevo & Rosen (2012), Kiviet and Niemczyk (2014), Ashley and Parmeter (2015), and  Liu et al. (2020) relax the exogeneity assumption, A2. In the DAG, they allow for an arrow going from Z to U. Relaxing this assumption implies that Cov(Z,X) ≠ 0 and Cov(Z,U) ≠ 0.

Nevo & Rosen (2012) begin by placing two restrictions on Cov(X,U) and Cov(Z,U). First, they assume that these covariances have the same sign. Second, they assume that X is 'more endogenous' than Z. That is, they assume the correlation between X and U is greater (in absolute value) than the correlation between Z and U. The authors show that if the ratio of the correlation between Z and U to the correlation between X and U is known, a valid IV for X can be constructed. However, this ratio is typically unknown. But, under their assumptions, it must lie in the unit interval. By conducting a grid search over the unit interval, one can derive bounds on the coefficient of interest. This is available in Stata with the command -imperfectiv-.
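Here is a rough Python sketch of that grid search, using the paper's compound instrument V(lambda) = sd(X)*Z - lambda*sd(Z)*X. It ignores inference and the sharper one-sided results in the paper, so treat it as an illustration of the logic rather than a substitute for -imperfectiv-.

```python
import numpy as np

def nevo_rosen_bounds(Y, X, Z, n_grid=101):
    """Rough sketch of Nevo & Rosen (2012)-style bounds: for each lambda in [0, 1],
    build the compound instrument V(lambda) = sd(X)*Z - lambda*sd(Z)*X and collect
    the implied IV point estimates. Illustration only: it ignores sampling
    uncertainty, and the estimate diverges when V(lambda) is nearly uncorrelated
    with X (a case the paper handles with one-sided bounds)."""
    estimates = []
    for lam in np.linspace(0.0, 1.0, n_grid):
        V = np.std(X, ddof=1) * Z - lam * np.std(Z, ddof=1) * X
        b = np.cov(V, Y)[0, 1] / np.cov(V, X)[0, 1]   # just-identified IV slope
        estimates.append(b)
    return min(estimates), max(estimates)
```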

Another grid search. Another use of partial identification.


Ashley (2009), Kiviet and Niemczyk (2014), and Ashley and Parmeter (2015) focus on the asymptotic distribution rather than estimation per se. These papers are concerned with the impact that a failure of exogeneity has on the asymptotic distribution of the IV estimator. Kiviet and Niemczyk (2014) take it a step further and compare the distribution of the IV estimator with strong, but invalid, instruments to that with weak, but valid, instruments. They often find the former is preferable.


You and me, both. Let's move on. 

Liu et al. (2020), in a very interesting twist, show that identification and estimation are possible under two-way exclusion restrictions. Specifically, the setup requires one variable that affects the endogenous regressor, X, but not the outcome, Y; this variable may be correlated with the error term, U. It also requires a second variable that affects the outcome, Y, but not the endogenous regressor, X. This is a very innovative setup, and it would be nice to see empirical researchers put it into practice.


In yet another branch of the literature, Small (2007) and Kang et al. (2016) focus on over-identified models (where the number of instruments, Z, exceeds the number of endogenous regressors, X). Small (2007) focuses on the sensitivity of the IV estimate to failures of the IV assumptions. Kang et al. (2016) show that identification and estimation are feasible if no more than half of the instruments are invalid, even when researchers do not know which instruments are valid and which are not.

Finally, in a recent working paper, Chen et al. (2020) explicitly consider partial identification of the Average Treatment Effect (ATE) and Average Treatment Effect on the Treated (ATT) when the instrument is invalid. 

Wow! Who's exhausted?


Yes, we need to be careful, but IV offers a world of potential to identify parameters that may otherwise prove elusive. And, while the assumptions required may be difficult to satisfy in practice, the solution is not to dismiss the estimator. Sensitivity analyses, partial identification, and two-way exclusion restrictions offer numerous opportunities for empirical researchers. Econometricians seem to know a thing or two!

UPDATE (8.2.20)

I neglected to mention two other recent working papers I had come across that build on this literature. See Fan & Wu (2020) and Hartford et al. (2020), both of which build on the Kang et al. (2016) paper mentioned above.

And a third recent working paper, by Masten & Poirier (2020), that has an abstract that sounds really, really cool! 

UPDATE (8.23.20)

Thanks to Martin Huber for pointing out a highly relevant paper of his (Huber 2014) of which I was sadly unaware. The paper focuses on the use of IV to estimate the local average treatment effect of a binary treatment when either the excludability of the instrument or the monotonicity assumption (i.e., no defiers) may fail.

UPDATE (3.11.21)

Thanks to Rusty Tchernis for pointing out a highly relevant Bayesian paper (Kraay 2012) that addresses the same issues discussed here.

References

Ashley, R. (2009), "Assessing the Credibility of Instrumental Variables Inference with Imperfect Instruments via Sensitivity Analysis," Journal of Applied Econometrics, 24, 325-337

Ashley, R.A. and C.F. Parmeter (2015), "Sensitivity Analysis for Inference in 2SLS/GMM Estimation with Possibly Flawed Instruments," Empirical Economics, 49, 1153-1171

Chen, X., C.A. Flores, and A. Flores-Lagunes (2020), "Bounds on Average Treatment Effects with an Invalid Instrument: An Application to the Oregon Health Insurance Experiment," unpublished manuscript.

Conley, T.G., C.B. Hansen, and P.E. Rossi (2012), "Plausibly Exogenous," Review of Economics and Statistics, 94, 260-272


Hartford, J., V. Veitch, D. Sridhar, and K. Leyton-Brown (2020), "Valid Causal Inference with (Some) Invalid Instruments," unpublished manuscript

Huber, M. (2014), "Sensitivity Checks for the Local Average Treatment Effect," Economics Letters, 123, 220-223

Kang, H., A. Zhang, T.T. Cai, and D.S. Small (2016), "Instrumental Variables Estimation With Some Invalid Instruments and its Application to Mendelian Randomization," Journal of the American Statistical Association, 111, 132-144

Kiviet, J.F. and J. Niemczyk (2014), "On the Limiting and Empirical Distributions of IV Estimators When Some of the Instruments are Actually Endogenous," Advances in Econometrics (Essays in Honor of Peter C. B. Phillips), 33, 425-490

Kraay, A. (2012), "Instrumental Variables Regressions with Uncertain Exclusion Restrictions: A Bayesian Approach," Journal of Applied Econometrics, 27, 108-128

Liu, S., I. Mourifié, and Y. Wan (2020), "Two-way Exclusion Restrictions in Models with Heterogeneous Treatment Effects," Econometrics Journal, forthcoming

Masten, M. and A. Poirier (2020), "Salvaging Falsified Instrumental Variable Models," unpublished manuscript

Nevo, A. and A.M. Rosen (2012), "Identification with Imperfect Instruments," Review of Economics and Statistics, 94, 659-671

Small, D.S. (2007), "Sensitivity Analysis for Instrumental Variables Regression With Overidentifying Restrictions," Journal of the American Statistical Association, 102, 1049-1058

van Kippersluis, H. and C.A. Rietveld (2018), "Beyond Plausibly Exogenous," Econometrics Journal, 21, 316-331
