Where's Waldo

My daughter started college this week, having moved into her dorm last week. Fingers crossed, knock on wood, throw some salt, avoid black cats, don't walk under a ladder ... everything's been great so far and she is having an awesome time. She will definitely be the smartest one in our family in six months. In addition to her brain becoming larger, she has already met a ton of cool new people, including a guy on her dorm floor named Waldo.

While my daughter now knows where Waldo is, it reminded me of a discussion on #EconTwitter from this past summer initiated by Arpit Gupta. Arpit posed a question not about "Where's Waldo?" but rather about "Where's the Data?" Specifically, he asked about the way that many (most?) applied researchers seem to gloss over issues of missing data when conducting empirical analyses, either ignoring the issue completely or, at best, relying on ad hoc and unjustified work-arounds.

With missing data, values for variables being used in the analysis…


The quote by statistician George Box 
"All models are wrong but some are useful"  
is extremely well known. It served as motivation for a prior post here. It serves as motivation for this post as well, which brings us to another famous quote, this time by a non-statistician.
In addition to being a great athlete, playing the greatest of games, Yogi was a smart fella. I was reminded of the quote by Box while teaching this week. In particular, I was discussing two examples of quasi-maximum likelihood estimation (QMLE). In papers, we sometimes see authors refer to the estimation technique being used as MLE, while other times we see authors use the term QMLE. As applied researchers, we may not have learned the difference (or we have forgotten!) and thus we gloss over this, continuing on with our lives, our feelings of imposter syndrome growing steadily worse. Well, it turns out that, as applied researchers, we ought to pay significant attention to that "Q." It turns out to…

It's Latin

Listening to my Spotify playlist this week, several songs by the Gipsy Kings have come up. Clearly my musical preferences have not evolved over time. One peppy song is Bamboleo. Part of the chorus is

Bamboleo, bambolea
Porque mi vida, yo la prefiero vivir asi

In English, this translates to (according to Google anyway)

Bamboleo, bambolea
Because in my life, I prefer to live it that way

You can almost hear the music!
Well, there is something else to the song besides the uplifting beat; there are the lyrics. As the song says, individuals typically live life the way they prefer, not the way dictated from on high. Heeding this message is quite useful when it comes to ... the analysis of program or policy effects. 

What does this have to do with the empirical analysis of programs or policies? I'm glad you asked. Well, it brings to the fore the important point that just because things are dictated in a certain manner does not preclude a different reality from emerging. Agents, after all, may…

Eye on the Prize

So much of this week was ugly and infuriating. 

The suppression of voting rights is real and just, literally, out there in the open. The pandemic has no end in sight and will likely get worse before it gets better. If it gets better. 
But, you know what else is ugly and infuriating? Empirical researchers using the wrong tool for the job. One particular example came up in a small discussion on #EconTwitter this week with my part-time antagonist, part-time protagonist. 

In this example, let's say we all agree that the true data-generating process (DGP) is
Y = a + bX + e.
How should one estimate the model?

The answer is: ¯\_(ツ)_/¯. 
We need more information. But, what information? Probably the first thing that went through your mind is that you need to know whether Cov(X,e) is zero or non-zero. If you are old school, you might also wonder about the properties of e itself. These factors might influence your choice among Ordinary Least Squares (OLS), Generalized Least Squares (GLS), and Instr…
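
To make the point about Cov(X,e) concrete, here is a minimal simulation sketch in Python. Everything specific in it is an illustrative assumption rather than something from the discussion: the parameter values (a = 1, b = 2), the sample size, the way X is made to depend on e, and the helper names `simulate` and `ols_slope`. It draws data from Y = a + bX + e twice, once with Cov(X,e) = 0 and once with Cov(X,e) = 0.5, and runs OLS on each sample.

```python
import numpy as np


def ols_slope(x, y):
    """OLS slope from regressing y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef[1]


def simulate(rho, n=100_000, a=1.0, b=2.0, seed=0):
    """Draw (x, y) from Y = a + bX + e, where Cov(X, e) = rho (assumed setup)."""
    rng = np.random.default_rng(seed)
    e = rng.normal(size=n)   # structural error, variance 1
    v = rng.normal(size=n)   # independent variation in X
    x = rho * e + v          # Cov(x, e) = rho * Var(e) = rho
    y = a + b * x + e
    return x, y


# Case 1: Cov(X, e) = 0, so OLS is consistent for b = 2.
b_exog = ols_slope(*simulate(rho=0.0))

# Case 2: Cov(X, e) = 0.5, so OLS converges to
# b + Cov(X, e) / Var(X) = 2 + 0.5 / 1.25 = 2.4 rather than 2.
b_endog = ols_slope(*simulate(rho=0.5))

print(f"OLS slope with Cov(X,e) = 0:   {b_exog:.3f}")
print(f"OLS slope with Cov(X,e) = 0.5: {b_endog:.3f}")
```

The functional form is identical in both cases; only the relationship between X and e differs, and that alone decides whether OLS recovers b.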

Don't Taunt Me

In what I guess could only have been done to taunt me after my previous post on not completely dismissing an estimator, this week's Twitter decided to offer an (admittedly humorous) sucker punch at another estimator: propensity score matching (PSM).

But, I was thankful for the dig at PSM. As a new department chair, I am a bit worried about having the time and mental energy to keep up my blogging. At the start of the pandemic, I kind of sort of promised to blog every week to help keep the spirits up. Little did I know, five months later, there would still be no end in sight. 

Turns out, the dig at PSM was just the kick in the pants I needed to motivate a new post. 
PSM, and matching in general, have drawn the ire of many applied researchers for several years now. As with Instrumental Variables (IV) in my prior post, I am not entirely sure why. And, there is probably a whole new cohort of applied researchers who have no idea why and are too nervous to ask.

I think there are a few possible r…