Festivus Miracle
Guest post by Jonathan Roth
Many interesting questions in economics involve the causal effect of a treatment that was not randomly assigned. Luckily, empirical researchers often find creative ways to circumvent endogeneity issues.
One way to get creative is to use an instrumental variable (IV). Consider the following example: we want to know the causal effect of attending a private school (X) on test scores (Y). Define the potential outcomes of a student attending (X=1) and not attending (X=0) a private school as Y(1) and Y(0), respectively. The individual-level causal effect of attending versus not attending is the difference in potential outcomes, Y(1)-Y(0). With heterogeneity in the treatment effects, this difference can vary from student to student.
If we suspect that choice of school is endogenous, we can instrument for X using information on whether you were offered a voucher that reduces the cost of private school (Z). For Z to be a valid instrument, it must obey certain properties:
Relevance. Receiving a voucher is correlated with the decision to attend a private school.
Independence. The assignment of the vouchers is as-good-as-random (not related to potential outcomes).
Monotonicity. Everyone who goes to private school without a voucher would also go with a voucher.
Exclusion. Getting a voucher only affects test scores through the choice of whether to go to private school.
The first assumption is testable. The rest … Well, I often hear people say in seminars that “the IV assumptions are inherently untestable.”
But actually, that is not quite correct.
In the case of heterogeneous treatment effects, you can actually falsify the IV assumptions in certain cases!
To see this, we start by defining four mutually exclusive and exhaustive types of people:
Always-Takers. Individuals that go to private school whether they get the voucher or not.
Never-Takers. Individuals that do not go to private school whether they get the voucher or not.
Compliers. Individuals that go to private school if and only if they get the voucher.
Defiers. Individuals that go to private school if and only if they do not get the voucher.
Defiers are like children that do the exact opposite of what they are told. The monotonicity assumption assumes that defiers do not exist. Parents know otherwise.
Under the assumptions listed above, the IV estimator gives us the Local Average Treatment Effect (LATE), which is the average of Y(1)-Y(0) among the compliers in the population. In fact, if you can show this, you too can win the Sveriges Riksbank Prize in Economic Sciences in Memory of Alfred Nobel. Well, perhaps that ship has sailed.
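A small simulation can make the LATE claim concrete. The sketch below is illustrative only: the type shares, the score distribution, and the effect sizes are made-up assumptions, not numbers from any study. It builds a population with no defiers and a coin-flip voucher, then compares the simple IV (Wald) ratio with the average effect among compliers.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical type shares; there are no defiers, so monotonicity holds by construction.
types = rng.choice(["always", "never", "complier"], size=n, p=[0.2, 0.3, 0.5])

# Voucher assigned by coin flip, so independence holds by construction.
z = rng.integers(0, 2, size=n)

# Always-takers attend regardless; compliers attend only with a voucher.
x = np.where(types == "always", 1, np.where(types == "complier", z, 0))

# Potential outcomes with heterogeneous effects (made-up numbers).
y0 = rng.normal(70, 10, size=n)
effect = np.where(types == "complier", 5.0, np.where(types == "always", 10.0, 0.0))
y1 = y0 + effect

# The observed score depends on the voucher only through attendance (exclusion).
y = np.where(x == 1, y1, y0)

# Wald estimator: effect of Z on Y divided by effect of Z on X.
reduced_form = y[z == 1].mean() - y[z == 0].mean()
first_stage = x[z == 1].mean() - x[z == 0].mean()
wald = reduced_form / first_stage

# Average effect among compliers, known here only because we simulated the types.
true_late = effect[types == "complier"].mean()

print(f"Wald estimate: {wald:.2f}, complier average effect: {true_late:.2f}")
```

In this setup the Wald ratio should land close to the complier effect rather than the always-takers' larger effect, up to sampling noise.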
Now, for the truly amazing part. We can test these assumptions even with only a single Z, that is, in an exactly identified model. To understand this, consider the following question in the running example:
“What fraction of people who don’t get a voucher go to private school and score 95 on the test?”
Well, since monotonicity rules out defiers, the only people who go to private school without a voucher are always-takers. So, the answer is the fraction of the population who are always-takers and have Y(1), the potential outcome at private school, equal to 95. Formally, this is represented by Pr(Y = 95, X = 1 | Z = 0).
Now consider the same question, except among the people who get the voucher. Well, the people who go to private school when they get the voucher are always-takers or compliers. So, the answer is the fraction of the population that is either a complier or an always-taker and has Y(1) equal to 95. Formally, this is represented by Pr(Y = 95, X = 1 | Z = 1).
But this implies that there must be at least as many people who go to private school and score 95 in the voucher group as in the non-voucher group. In other words,
Pr(Y = 95, X = 1 | Z = 1) ≥ Pr(Y = 95, X = 1 | Z = 0)
This is something that we can test! All the quantities in the above weak inequality – Y, X, and Z – are observed in the data. If there are more people scoring 95 at private school among the non-voucher group, then at least one of our key assumptions must be wrong!
We can thus falsify the IV assumptions if this inequality, or the analogous one for any other value of Y, fails to hold!
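For a generic score y, the argument can be written out as follows. The symbol T for a student's type (AT for always-takers, NT for never-takers, C for compliers) is notation introduced here for illustration. The last line is the mirror-image inequality for students who do not attend private school, which follows from the same reasoning applied to never-takers and compliers.

```latex
\begin{align*}
\Pr(Y = y, X = 1 \mid Z = 0) &= \Pr(Y(1) = y,\ T = AT), \\
\Pr(Y = y, X = 1 \mid Z = 1) &= \Pr(Y(1) = y,\ T = AT) + \Pr(Y(1) = y,\ T = C), \\
\Pr(Y = y, X = 1 \mid Z = 1) - \Pr(Y = y, X = 1 \mid Z = 0) &= \Pr(Y(1) = y,\ T = C) \ \ge\ 0, \\
\Pr(Y = y, X = 0 \mid Z = 0) - \Pr(Y = y, X = 0 \mid Z = 1) &= \Pr(Y(0) = y,\ T = C) \ \ge\ 0.
\end{align*}
```

The first two lines use independence and exclusion (so that conditioning on Z does not change the joint distribution of potential outcomes and types) and monotonicity (so that no defiers appear in either line).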
This is the idea of the tests proposed in Kitagawa (2015), Huber and Mellace (2015), and Mourifié and Wan (2017). Mourifié and Wan even have instructions for easily implementing the test in Stata!
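To get a feel for what such a check looks at, here is a rough sample analogue in Python. It is not the test statistic from any of the papers above, and it ignores sampling error entirely; the function name `inequality_gaps`, the bin edges, and the simulated data are all assumptions made for illustration. It bins the scores and compares the joint frequencies across voucher groups, including the mirror-image comparison for students who do not attend private school.

```python
import numpy as np

def inequality_gaps(y, x, z, bins):
    """Per-bin sample gaps that should be >= 0 in the population under the IV assumptions:
    Pr(Y in bin, X=1 | Z=1) - Pr(Y in bin, X=1 | Z=0) and
    Pr(Y in bin, X=0 | Z=0) - Pr(Y in bin, X=0 | Z=1)."""
    y, x, z = map(np.asarray, (y, x, z))
    idx = np.digitize(y, bins)   # bin index in 0..len(bins)
    n_bins = len(bins) + 1

    def joint(x_val, z_val):
        in_arm = z == z_val
        counts = np.bincount(idx[in_arm & (x == x_val)], minlength=n_bins)
        return counts / in_arm.sum()

    gap_private = joint(1, 1) - joint(1, 0)
    gap_public = joint(0, 0) - joint(0, 1)
    return gap_private, gap_public

# Simulated data (made-up numbers): many always-takers, few compliers, no defiers.
rng = np.random.default_rng(1)
n = 200_000
types = rng.choice(["always", "never", "complier"], size=n, p=[0.5, 0.3, 0.2])
z = rng.integers(0, 2, size=n)
x = np.where(types == "always", 1, np.where(types == "complier", z, 0))
y0 = rng.normal(70, 10, size=n)

y_valid = y0 + 5 * x            # voucher matters only through attendance
y_invalid = y_valid + 20 * z    # voucher also raises scores directly (exclusion fails)

bins = np.arange(40, 121, 10)
min_valid = min(g.min() for g in inequality_gaps(y_valid, x, z, bins))
min_invalid = min(g.min() for g in inequality_gaps(y_invalid, x, z, bins))

print(f"smallest gap, valid design:   {min_valid:.3f}")
print(f"smallest gap, invalid design: {min_invalid:.3f}")
```

In the valid design the smallest gap should be zero up to sampling noise, while the direct voucher effect in the second design produces clearly negative gaps. A real application would need the formal inference procedures from the cited papers to decide whether a negative gap is statistically meaningful.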
Before we wrap up, a big caveat is in order: if the test does not reject, this does not mean that the IV assumptions hold! This is for two reasons. First, the inequalities described above can be satisfied even if the IV assumptions are violated. In other words, the IV assumptions are falsifiable but not verifiable. And second, even if the inequalities described above are violated in the population, we may not have enough precision to statistically reject the null! Similar caveats apply to pre-trends testing for difference-in-differences.
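The first reason can be seen in a tiny population-level example. The numbers below are hypothetical and chosen for illustration: the outcome is binary (pass or fail), and the voucher directly raises never-takers' pass rate, which violates exclusion. Even so, every testable inequality holds, so no amount of data would lead to a rejection.

```python
# Hypothetical population shares by type (no defiers).
share = {"always": 0.2, "never": 0.2, "complier": 0.6}

# Pass probabilities by type and school; made-up numbers.
p_pass_private = {"always": 0.8, "complier": 0.7}
p_pass_public = {"never": 0.5, "complier": 0.5}

# Exclusion violation: the voucher itself raises never-takers' pass rate.
p_pass_never_with_voucher = 0.7

def pr(y, p_pass):
    """Probability of outcome y (1 = pass, 0 = fail) given a pass probability."""
    return p_pass if y == 1 else 1 - p_pass

gaps = {}
for y in (0, 1):
    # X = 1: always-takers when Z = 0; always-takers and compliers when Z = 1.
    x1_z0 = share["always"] * pr(y, p_pass_private["always"])
    x1_z1 = x1_z0 + share["complier"] * pr(y, p_pass_private["complier"])
    # X = 0: never-takers and compliers when Z = 0; never-takers when Z = 1.
    x0_z0 = (share["never"] * pr(y, p_pass_public["never"])
             + share["complier"] * pr(y, p_pass_public["complier"]))
    x0_z1 = share["never"] * pr(y, p_pass_never_with_voucher)
    gaps[("X=1", y)] = x1_z1 - x1_z0
    gaps[("X=0", y)] = x0_z0 - x0_z1

# The IV (Wald) estimand versus the true complier effect in this population.
late = p_pass_private["complier"] - p_pass_public["complier"]
direct = p_pass_never_with_voucher - p_pass_public["never"]
wald = (share["complier"] * late + share["never"] * direct) / share["complier"]

print(gaps)
print(f"LATE: {late:.3f}, IV estimand: {wald:.3f}")
```

All four gaps are nonnegative here, yet the IV estimand overstates the complier effect because part of the voucher's effect on scores bypasses school choice.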
That said, these tests can nonetheless be useful for identifying cases where the IV assumptions are clearly violated! Or for sounding smart in seminars when someone says that IV assumptions aren’t testable…
Bio
Jonathan Roth is an assistant professor in the economics department at Brown University. His primary research interests are in econometrics, with a focus on causal inference. He has also worked on topics in labor economics, machine learning, and algorithmic fairness. Prior to joining Brown University, he was a senior researcher in the Office of the Chief Economist at Microsoft. He received his Ph.D. in economics in 2020 from Harvard University, where he was awarded the David A. Wells prize for best dissertation. He obtained a B.A. summa cum laude in mathematics and economics from the University of Pennsylvania.
References
Huber, M. and G. Mellace (2015), “Testing Instrument Validity for LATE Identification Based on Inequality Moment Constraints,” The Review of Economics and Statistics, 97(2), 398-411
Kitagawa, T. (2015), “A Test for Instrument Validity,” Econometrica, 83(5), 2043-2063
Mourifié, I. and Y. Wan (2017), “Testing Local Average Treatment Effect Assumptions,” The Review of Economics and Statistics, 99(2), 305-313
Roth, J. (2022), “Pre-test with Caution: Event-study Estimates After Testing for Parallel Trends,” AER: Insights, forthcoming