The Validity of External Validity

Admittedly, I don't always English so good.

However, usage of the terms "internal validity" and "external validity" in the causal inference literature has baffled me a bit. So, I did some investigating.

We all know what these terms mean, so much so that they typically don't merit more than a passing definition.

But I think these terms do require a bit more thought in one respect.

All of us ensconced in the causal inference world presumably have heard, been taught, or even repeated the mantra: "Randomized Controlled Trials (RCTs) have greater internal validity, but observational studies have greater external validity."

Is this really true?

It depends on how we define internal and external validity.

To my knowledge (h/t Manski), the distinction between the two concepts dates back to Campbell & Stanley (1963, p. 5), who write:

"Fundamental to this listing is a distinction between internal validity and external validity. Internal validity is the basic minimum without which any experiment is uninterpretable: Did in fact the experimental treatments make a difference in this specific experimental instance? External validity asks the question of generalizability: To what populations, settings, treatment variables, and measurement variables can this effect be generalized? Both types of criteria are obviously important, even though they are frequently at odds in that features increasing one may jeopardize the other. While internal validity is the sine qua non, and while the question of external validity, like the question of inductive inference, is never completely answerable, the selection of designs strong in both types of validity is obviously our ideal." (emphasis in original)

Fine. This sounds consistent with what we think and say. Internal validity refers to the ability to say something about the causal effect of a treatment in the sample at hand. External validity refers to the ability to say something about the causal effect of a treatment in some other population.

Is this going somewhere? Perhaps.

RCTs - with perfect compliance and no missing data - make it easy to obtain unbiased estimates of the average causal effect of treatments in experimental samples. However, RCTs typically have difficulty generalizing outside the experimental sample for many reasons that I don't care to dive into here. Thus, internal validity is straightforward, but external validity is not.

Observational studies typically have difficulty obtaining unbiased (or consistent) (point) estimates of average causal effects of treatments. However, samples in observational studies are often representative of a wider population, making generalizing any findings a bit easier. Thus, internal validity is not straightforward, but external validity is.
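For readers who like to see numbers, here is a minimal simulation sketch of the two paragraphs above. Everything in it is a made-up assumption for illustration (the function name `simulate`, the effect modifier `x`, the confounder `u`, and every parameter value); it is not taken from any of the papers cited here. The RCT enrolls only units with `x < 0.5` but randomizes treatment; the observational study sees the whole population, but units with high `u` are more likely to take the treatment.

```python
import random
import statistics


def simulate(n=200_000, seed=0):
    """Toy comparison of an RCT on a selected sample with an
    observational study on the full population (all numbers hypothetical)."""
    rng = random.Random(seed)
    pop_effects, sample_effects = [], []
    rct = {True: [], False: []}   # observed outcomes by treatment arm
    obs = {True: [], False: []}

    for _ in range(n):
        x = rng.random()               # effect modifier, Uniform(0, 1)
        u = rng.gauss(0, 1)            # unobserved confounder
        tau = 1 + 2 * x                # unit-level treatment effect
        y0 = u + rng.gauss(0, 1)       # potential outcome without treatment
        y1 = y0 + tau                  # potential outcome with treatment
        pop_effects.append(tau)

        # Observational study: everyone is observed, but units with
        # high u are more likely to take the treatment (confounding).
        d_obs = (u + rng.gauss(0, 1)) > 0
        obs[d_obs].append(y1 if d_obs else y0)

        # RCT: only units with x < 0.5 enroll; treatment is a coin flip.
        if x < 0.5:
            sample_effects.append(tau)
            d_rct = rng.random() < 0.5
            rct[d_rct].append(y1 if d_rct else y0)

    return {
        "population_ate": statistics.fmean(pop_effects),
        "rct_sample_ate": statistics.fmean(sample_effects),
        "rct_estimate": statistics.fmean(rct[True]) - statistics.fmean(rct[False]),
        "obs_estimate": statistics.fmean(obs[True]) - statistics.fmean(obs[False]),
    }


if __name__ == "__main__":
    for name, value in simulate().items():
        print(f"{name}: {value:.2f}")
```

By construction, the population average effect is 2 and the average effect among RCT enrollees is 1.5. The RCT difference in means should sit close to 1.5: fine for the sample, off for the population. The naive observational difference in means is computed on a representative sample, but selection on `u` pushes it away from 2. Which of the two lands closer to the population effect depends entirely on the arbitrary parameter choices.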

This logic underpins the mantra above that observational studies enjoy greater external validity. It is encapsulated in statements like the following in Roe & Just (2009, p. 1267):

"Comparing ends of the methodology spectrum, there is a clear tradeoff between internal and external validity: laboratory experiments provide greater internal validity than field data, while field data provide greater ecological validity and we argue a lower burden for establishing external validity."

But is this correct? What about that Latin part in the quote above by Campbell & Stanley (1963), "sine qua non"?

According to the wise sages at Wikipedia, sine qua non is a Latin legal term for "[a condition] without which it could not be." As such, I read Campbell & Stanley as saying that you cannot have external validity without internal validity.


This follows from the inclusion of "validity" in the terms. If the results from a study - experimental or observational - do not yield an unbiased (or consistent) estimate of the causal effect(s) of a treatment, then it is not internally valid, and hence cannot be used to make externally valid statements. Extrapolating based on an internally invalid estimate goes beyond what Manski calls wishful extrapolation (e.g., Manski 2018). It is gibberish built upon nonsense.

If internal validity is, in fact, a sine qua non for external validity, then observational studies cannot enjoy greater external validity if they enjoy less internal validity. Observational studies, at best, enjoy greater external validity conditional on being internally valid.

Others have made this argument, but I have not seen it discussed much. For example, Carlson & Morrison (2009, p. 81) write:

"External validity is the ability to generalize study results to a more universal population. Inferences about cause–effect associations from a specific study are considered externally valid if they may be generalized from the unique and idiosyncratic settings, procedures and participants of the study, to other populations and conditions. External validity is the degree to which the conclusions in a study would hold for other persons in other places and at other times. As such, internal validity is a prerequisite for external validity." (emphasis added)

The counterpoint to this is that validity - internal or external - is not a binary event. We should not be so arrogant as to ever think we are absolutely certain of anything (of that, I am sure ... you see what I did there?). In this case, it could be that a result based on an observational study is less convincing, and therefore enjoys lower (but not zero) internal validity, but there is no issue of generalization since the sample represents the entire population of interest. Thus, in a probabilistic sense, it enjoys greater external validity than a corresponding RCT based on a non-random sample of the full population. However, such claims of greater external validity of an observational study result are clearly dependent on the extent of internal validity.
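One way to make the non-binary view concrete is a simple decomposition. The notation is my own shorthand rather than anything from the papers cited here: let \hat{\tau} be a study's estimate, \tau_S the average causal effect in the study sample, and \tau_P the average causal effect in the population we actually care about.

```latex
\[
\underbrace{\mathbb{E}[\hat{\tau}] - \tau_P}_{\text{error about the target population}}
= \underbrace{\left(\mathbb{E}[\hat{\tau}] - \tau_S\right)}_{\text{internal bias}}
+ \underbrace{\left(\tau_S - \tau_P\right)}_{\text{generalization gap}}
\]
```

An ideal RCT sets the first term to zero and says nothing about the second. An observational study on a representative sample sets the second term to zero and says nothing about the first. Neither alone guarantees that the left-hand side is small, which is why a claim of greater external validity cannot be judged separately from the internal bias.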

Is this a lot of something about nothing? You be the judge.

References

Campbell, D. and J. Stanley (1963), Experimental and Quasi-Experimental Designs for Research, Chicago: Rand McNally.

Carlson, M.D.A. and R.S. Morrison (2009), "Study Design, Precision, and Validity in Observational Studies," Journal of Palliative Medicine, 12, 77-82.

Manski, C.F. (2018), "Reasonable Patient Care under Uncertainty," Health Economics, 27, 1397-1421.

Roe, B.E. and D.R. Just (2009), "Internal and External Validity in Economics Research: Tradeoffs between Experiments, Field Experiments, Natural Experiments, and Field Data," American Journal of Agricultural Economics, 91, 1266-1271.

