Chicken or Egg? Part II

 I am entering a new era on this blog. I am now recycling post titles. Back in 2020, I wrote a post called Chicken or Egg? 


The subject of that post was the importance of specifying a model before specifying the estimator. The subject of this post is different, but no less important. It is a mistake I have seen countless Ph.D. students make, especially when entering the research arena for the first time.


Conversations with said students usually go something like this:

        Student: I think I have an idea for a paper.

        Elder Researcher: Okay.

        Student: But, I'm not sure it will work.

My response to this is always the same.

        Elder Researcher: What do you mean by "work"?

And the student's answer is always the same.

        Student: I'm not sure I will find something statistically significant.

And ... gotcha.


At this point, you might think this post is about the value of null effects. To some extent that's true. But that is really secondary to the point I want to make.


Ok, ok. The point I want to make is that research must start with the right question, not with the right answer. Often students -- and sometimes seasoned researchers -- evaluate a prospective project by trying to forecast the answer, implicitly assuming that the project will be a failure unless the result turns out statistically significant.

Thinking this way is not only bad science; it also leads to bad projects. The problem is that by starting with the answer, one is led to projects that have "obvious" answers. In this case, a project can only lead to one of two outcomes: either the result is statistically significant (and in the direction expected), implying that the project is "boring", or you find a statistically (and economically) insignificant effect where one was not expected and no one believes the result. This is what we like to call a lose-lose proposition.

What, then, is the solution? It's simple (in theory). Put the chicken before the egg (or is it the egg before the chicken?) and start with a research question, not a research answer. If the research question is interesting, then the answer will be interesting no matter what it is (conditional on convincing people that you have obtained a "correct" or "reasonably correct" answer). Obvious answers only belong to uninteresting research questions. If the question is interesting, and to be interesting the answer must be unknown to some extent, any answer is interesting.

There are (at least) two caveats to this. First, it is possible to arrive at an uninteresting answer even to an interesting question. However, it is not (directly) because one finds a null result, but rather because the data are uninformative about the question. In this case, one ends up with an imprecise answer that cannot rule out any number of conclusions. This is not a problem with the question, though. This is a problem with the data and/or estimation method.  
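To make the distinction between a null result and an uninformative one concrete, here is a minimal sketch. Everything in it is my own illustration, not a standard procedure: the names `Estimate` and `classify`, the labels, and the `meaningful` threshold (the smallest effect size one would care about) are hypothetical, and the interval uses a simple normal approximation.

```python
from dataclasses import dataclass
from typing import Tuple

Z_95 = 1.96  # normal critical value for a two-sided 95% interval


@dataclass
class Estimate:
    point: float  # estimated effect
    se: float     # standard error of the estimate

    def ci(self) -> Tuple[float, float]:
        """Approximate 95% confidence interval (normal approximation)."""
        return (self.point - Z_95 * self.se, self.point + Z_95 * self.se)


def classify(est: Estimate, meaningful: float) -> str:
    """Label an estimate relative to `meaningful`, a hypothetical threshold
    for the smallest effect (in absolute value) one would care about."""
    lo, hi = est.ci()
    if lo > 0 or hi < 0:
        # The interval excludes zero.
        return "significant"
    if -meaningful < lo and hi < meaningful:
        # The interval includes zero and rules out every meaningful effect.
        return "precise null"
    # The interval includes zero but cannot rule out meaningful effects.
    return "uninformative"


if __name__ == "__main__":
    for est in (Estimate(0.08, 0.02), Estimate(0.01, 0.02), Estimate(0.01, 0.20)):
        print(est, est.ci(), classify(est, meaningful=0.10))
```

The second and third estimates have the same point estimate and are both statistically insignificant. Only the third is a problem: its interval is so wide that it is consistent with no effect, a large positive effect, and a large negative effect all at once.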


Second, sometimes a research question has an obvious general answer -- such as that the marginal effect or average treatment effect is positive -- but not an obvious magnitude. In this situation, trying to uncover the magnitude of the effect is interesting. However, this isn't really a caveat since this type of research starts, presumably, with an interesting question: "What is the magnitude of ...?"


This characterizes the literatures on the returns to education and discrimination. To the extent that people can agree on anything these days, I think the (vast) majority believe that there is some causal effect of education on wages and there is some discrimination in the labor market. Questions of magnitude have led to a quip I heard as a youngster in this profession: "You're not a labor economist if you haven't written a paper on discrimination." It's not that discrimination is so important that everyone should work on it (although it is), but rather it seems like every labor economist has written a paper on the subject. And, in the returns to education literature, how many papers are out there debating whether the return is 6.7% versus 7.3%? 


This tangentially relates to a conversation in the Bad Place a few weeks ago by Blanchard and Rodrik. They were pumping up macro research relative to micro research because the former asks "big/important/interesting" questions, while the latter asks "comparatively uninteresting" questions. I don't know enough about macro to know whether it asks interesting and important questions, but I do agree that a lot of micro does not. As alluded to in the conversation, the focus on uninteresting questions is, at least in part, because solid identification is required in micro by the editorial process. And this contributes to putting the answer before the question.

Peter Kennedy has made this point in his textbook as well as in other places. He cites Kimball (1957), who conceptualized a Type III error that researchers can make: producing the right answer to the wrong question. This is related to the point made by Tukey (1962):


"Far better an approximate answer to the right question, which is often vague, than an exact answer to the wrong question, which can always be made precise."


Bottom line, don't overcomplicate research by worrying about the answer. Ask an interesting question, (potentially) produce interesting research. Ask an uninteresting question ... well, you are done before you even get started.


