Dig a Little Deeper
After 2.5 years as department chair, I have found the mental bandwidth to return to my blog on econometric stuff for applied people. Since it is job market season, and many elevator pitches and job talks are about to be given, humor me for a short rant on a topic that has long irked me.
The so-called credibility revolution in economics refers to the focus of most empirical research (by academics at least) on credible identification of the causal effects of some policy or intervention or treatment. Whether this revolution has been mostly harmless or sharp is a matter of perspective, but it has been important.
This revolution has emphasized, among other things, that when selection into treatment is not random, researchers must take great care to understand the treatment assignment mechanism in order to apply an econometric solution that yields consistent estimates of one or more causal effect parameters under plausible assumptions.
Much of this work relies on the now well-known potential outcomes framework, where for a binary treatment, D, each agent has an outcome that would be realized under each possible treatment status. These are denoted as Y(1) and Y(0). The causal effect of D for this agent is Y(1)-Y(0).
This immediately leads to what Holland (1986) called the fundamental problem of causal inference. This refers to the fact that for each agent, only one of the two potential outcomes can be observed at a specific point in time, implying that the treatment effect for any agent can never be observed. As such, any researcher who wishes to say something about the causal effect of D must make assumptions, and some of these assumptions will always be untestable. Famously, this makes our job tougher than that of even physicists!
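To fix ideas, here is a minimal simulated sketch in Python (the numpy setup and all numbers are illustrative choices of mine, not anyone's actual data). The simulation gets to see both potential outcomes, which no real dataset ever does; the point is that individual treatment effects are unobservable, while random assignment still identifies the average effect:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# In the simulation each agent has BOTH potential outcomes -- something we
# never get to see in real data.
y0 = rng.normal(0.0, 1.0, n)
y1 = y0 + 2.0 + rng.normal(0.0, 1.0, n)   # individual effects vary around 2

# Random assignment: D is independent of (Y(0), Y(1)).
d = rng.integers(0, 2, n)

# The fundamental problem: only one potential outcome is observed per agent.
y_obs = np.where(d == 1, y1, y0)

# Individual effects y1 - y0 are never observable, but under random assignment
# a simple difference in means recovers the average treatment effect.
ate_hat = y_obs[d == 1].mean() - y_obs[d == 0].mean()
print(f"true ATE = {(y1 - y0).mean():.3f}, estimated ATE = {ate_hat:.3f}")
```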
As at least some identifying assumptions in any study purporting to estimate causal effects are untestable, researchers must become very good at the art of persuasion.
This brings me to the topic at hand, the thing that irks me. As I said above, the underlying source of our problems in credibly identifying causal effects is nonrandom treatment assignment. So, whatever econometric arrow is pulled from one's quiver is designed to overcome this nonrandom selection.
How, then, does one evaluate the plausibility of the (untestable) identifying assumption invoked by the econometric analysis? Well, it stands to reason that since the problem is nonrandom assignment to treatment, the credibility of the econometric solution must rest on knowledge of this nonrandomness.
Unfortunately, I think discussion of the determinants of treatment selection often receives short shrift from researchers. And, if the researcher and/or audience does not understand the treatment assignment mechanism, then how can one evaluate the solution designed to overcome nonrandom assignment? How can we judge whether the solution is credible if the source of the nonrandomness is not properly discussed?
What I frequently observe in practice is researchers merely stating that the "treatment is endogenous and here is what I do" ... without any explanation of the source of this endogeneity. But this is first-order information.
If the econometric solution is a technique that falls under the category of selection on observed variables, then the only way to convince the reader that all determinants of treatment assignment are observed is to first explain how treatment assignment is determined.
If the econometric solution is a technique that falls under the category of selection on unobserved variables, then the only way to convince the reader that a source of plausibly exogenous variation has been found is to explain what unobserved attributes affect both treatment assignment and potential outcomes.
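To make the last two paragraphs concrete, here is a minimal simulated sketch (Python/numpy again; all variable names and parameter values are illustrative assumptions of mine). Treatment depends on an observed x and an unobserved u. Adjusting for x alone does not recover the true effect; adjusting for u as well would, but u is, by construction, not in the researcher's data. Only knowledge of the assignment mechanism can tell us whether something like u is lurking:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

x = rng.normal(size=n)   # observed determinant of treatment
u = rng.normal(size=n)   # unobserved determinant of treatment AND outcomes
true_effect = 1.0

# Treatment assignment depends on both x and u.
d = (0.8 * x + 0.8 * u + rng.normal(size=n) > 0).astype(float)
y = true_effect * d + x + u + rng.normal(size=n)

def ols_coefs(y, X):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

ones = np.ones(n)
b_obs = ols_coefs(y, np.column_stack([ones, d, x]))     # controls: x only
b_all = ols_coefs(y, np.column_stack([ones, d, x, u]))  # controls: x and u

print(f"controlling for x only : {b_obs[1]:.3f}")  # biased away from 1.0
print(f"controlling for x and u: {b_all[1]:.3f}")  # ~1.0, but u is unobserved
```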
Consider instrumental variables. The two requirements for a valid instrument are (strong) correlation with treatment assignment (which is testable) and a lack of correlation between the instrument and the error term in the outcome equation (which is not testable ... even in overidentified models). One cannot evaluate the latter requirement without some understanding of the unobserved attributes that constitute the error term and are correlated with treatment assignment.
When using IV, focusing only on the testable assumption and not the untestable one will surely lead one astray. First, if one focuses only on instrument strength, then the strongest instrument for an endogenous covariate, D, is D itself (which is equivalent to using Ordinary Least Squares). It's perfectly correlated! Can't get any stronger than that! But clearly if D the covariate is endogenous, then D the instrument is not valid, as it will also be correlated with the error term. So, when a researcher proposes to use some other variable, Z, as an instrument, the only way to judge whether Z is uncorrelated with the error term is to think deeply about what is in the error term.
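To see this first point in a toy example before turning to the second, here is a minimal sketch (Python/numpy; the data-generating process is an illustrative assumption of mine, with a continuous treatment for simplicity). Using D as its own instrument reproduces the biased OLS slope despite the "perfect" first stage, while an instrument Z that is uncorrelated with the error term by construction recovers the true effect. In real data, of course, no output tells you whether Z is valid; that judgment requires thinking about what is in the error term:

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

u = rng.normal(size=n)               # error-term component driving selection
z = rng.normal(size=n)               # instrument, independent of u by construction
d = 0.5 * z + u + rng.normal(size=n) # endogenous treatment
y = 1.0 * d + u + rng.normal(size=n) # true causal effect = 1.0

def iv_slope(y, d, z):
    """Simple (just-identified) IV estimator: cov(z, y) / cov(z, d)."""
    return np.cov(z, y)[0, 1] / np.cov(z, d)[0, 1]

# Using D as its own instrument is exactly OLS: the "first stage" is perfect,
# but the estimate is biased because D is correlated with u.
print(f"D as its own instrument (OLS): {iv_slope(y, d, d):.3f}")  # ~1.44, not 1.0
print(f"Z as the instrument (IV)     : {iv_slope(y, d, z):.3f}")  # ~1.0
```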
Second, it is not enough to simply say that "Z is random" ... like everyone's favorite instrument, rainfall.
Yes, rainfall may be random, but that does not mean it is exogenous and, hence, a valid instrument. Since researchers use rainfall to instrument for literally everything, this implies that rainfall is correlated with literally everything ... including whatever unobserved attributes are in the error term and correlated with treatment assignment. The researcher must convince the audience otherwise, and this requires knowledge of the sources of the nonrandom treatment assignment.
Next, consider everyone's favorite econometric arrow, difference-in-differences. The parallel trends assumption is required for consistent estimation and is not testable. This assumption requires agents to select into treatment only on the basis of time-invariant unobserved attributes, not time-varying ones. And, as the wonderful Juan Moreno-Cruz reminded us via meme on Twitter, examining pre-trends is not a test of this assumption. So, to convince the audience of the identification strategy requires knowledge of the sources of the nonrandom treatment assignment.
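Here is a minimal simulated sketch of that last point (Python/numpy; the setup is an illustrative assumption of mine, not any particular application). Agents select into treatment partly on a time-varying shock that materializes only in the post-treatment period. The pre-trends look perfectly parallel, yet the difference-in-differences estimate is biased:

```python
import numpy as np

rng = np.random.default_rng(3)
n = 100_000
true_effect = 1.0

alpha = rng.normal(size=n)  # time-invariant unobservable: DiD handles this fine
shock = rng.normal(size=n)  # time-varying shock that hits only in the post period

# Agents select into treatment partly on the (anticipated) post-period shock.
d = (alpha + shock + rng.normal(size=n) > 0).astype(float)

# Outcomes in two pre periods (t = 0, 1) and one post period (t = 2);
# the shock and the treatment effect enter only at t = 2.
y = {t: alpha + 0.5 * t + rng.normal(size=n) for t in (0, 1)}
y[2] = alpha + 0.5 * 2 + shock + true_effect * d + rng.normal(size=n)

def gap(t):
    """Treated-minus-control mean outcome in period t."""
    return y[t][d == 1].mean() - y[t][d == 0].mean()

# Pre-trends look perfectly parallel ...
print(f"pre-trend change, t=0 to t=1: {gap(1) - gap(0):+.3f}")  # ~0
# ... yet DiD is biased: selection was on a time-varying unobservable.
print(f"DiD estimate,     t=1 to t=2: {gap(2) - gap(1):+.3f}")  # well above 1.0
```

Parallel pre-trends here are cold comfort: the violation lives entirely in the post period, exactly where it cannot be checked.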
In the end, the credibility revolution comes down to persuasion. I didn't say it, but that seems a bit like a
Thank you for indulging my little rant.
UPDATE (12.3.2022)
Just as a blind squirrel finds a nut every once in a while, every so often I stumble upon something that others smarter than I have also stumbled upon. David McKenzie has two great blog posts -- one on matching and one on difference-in-differences -- making the same plea to empirical researchers that I attempt to make here. Highly recommended reading: