how social scientists think: correlation is not causation
Whenever I teach students about the difference between causation and correlation, I try to have them do something ridiculous. I might have one student repeatedly flip the light switch while another jumps up and down and a third sings "I'm a Little Teapot." Then I ask: what caused the light to go on and off?
Students generally roll their eyes as they answer, "flipping the switch," but the point is clear: just because events happen at the same time doesn't mean that one causes the other. They learn one of the key principles of all the sciences: correlation does not imply causation.
We social scientists spend our time trying to figure out whether phenomena are causally related - that is, whether one event, occurrence, or circumstance causes another. We want to explain how and why specific events are related to one another, in hopes of being able to explain more generally why similar events are related.
Problem is, it's a lot harder to determine causality in the real world than it is with a ridiculous example in the classroom. It's especially difficult when an event has multiple causes, as is the case with violence in the eastern Congo. We can never be 100% sure we have correctly identified the cause of an event, but we can reach a reasonable degree of certainty, and we have developed a number of methods to (hopefully) avoid confusing correlation with causation.
We do this through a couple of mechanisms. One is to isolate variables. "Variables" are just another way of talking about causes (which we call "independent variables") and effects (which we call "dependent variables"). Of course, most human behaviors and situations involve far more than just two variables, so we try to control for the effects of the other variables. This is pretty easy using statistical analysis: a social scientist using that method will use math (and, these days, sophisticated software) to hold the other variables constant so that she can look only at the one she thinks matters. She can then run statistical tests to determine whether she can establish, with a reasonable degree of certainty, that the cause she has identified is indeed producing the observed effect.
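For readers who like to see the mechanics, here is a minimal sketch of what "controlling for" other variables looks like in a regression, written in Python. Everything in it is invented for illustration - the variable names, the simulated numbers, and the supposed relationships are not my data or anyone's findings - but it shows how a naive two-variable comparison can mistake correlation for causation, and how controlling for a confounder corrects that.

```python
# Minimal sketch of "controlling for" other variables with a regression.
# All variable names and numbers are purely illustrative, not real data.
import numpy as np
import pandas as pd
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 500

# Simulated data: poverty influences both ethnic_diversity and violence,
# so a naive comparison of violence and ethnic_diversity alone picks up
# a spurious association.
poverty = rng.normal(size=n)
ethnic_diversity = 0.7 * poverty + rng.normal(size=n)
road_access = rng.normal(size=n)
violence = 1.5 * poverty - 0.5 * road_access + rng.normal(size=n)  # no true diversity effect

df = pd.DataFrame({
    "violence": violence,
    "ethnic_diversity": ethnic_diversity,
    "poverty": poverty,
    "road_access": road_access,
})

# Naive model: ethnic_diversity looks important because it proxies for poverty.
naive = smf.ols("violence ~ ethnic_diversity", data=df).fit()

# Controlled model: holding poverty and road_access constant, the
# ethnic_diversity coefficient shrinks toward zero.
controlled = smf.ols("violence ~ ethnic_diversity + poverty + road_access", data=df).fit()

print(naive.params["ethnic_diversity"], controlled.params["ethnic_diversity"])
```

In this toy setup the naive coefficient comes out well above zero while the controlled one is close to zero, which is exactly the kind of difference the "controlling for other variables" step is designed to reveal.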
With qualitative methods, it's a lot harder to establish causality, because causal inference requires figuring out what the effect would have looked like had an event or circumstance not happened. We call this idea of what could have been the "counterfactual." But real life doesn't usually allow us to observe counterfactuals (although the world of randomized controlled trials is now opening up all kinds of possibilities in this regard). We can, however, look for real-life approximations of counterfactuals, or places in which natural controls are in place. For example, I have an observed effect in my research that suggests ethnicity may be an important causal variable, but I'm not certain enough about that to publish it yet. However, I have some new data from a town that is ethnically homogeneous. It's my hope that the data from this town will function as a kind of natural control, which will help me figure this out with a higher degree of certainty.
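To make the counterfactual idea concrete, here is a toy simulation in Python. It has nothing to do with my actual research; the "treatment," the numbers, and the effect size are all made up. It just shows why random assignment - the logic behind a randomized controlled trial - lets us estimate a causal effect even though we can never observe both outcomes for the same unit.

```python
# Toy illustration of the counterfactual idea using simulated potential outcomes.
# Everything here is invented for illustration; no real data or findings.
import numpy as np

rng = np.random.default_rng(1)
n = 10_000

# Each unit has two potential outcomes: what happens with the treatment (y1)
# and what would have happened without it (y0). In real life we only see one.
y0 = rng.normal(loc=10.0, scale=2.0, size=n)
y1 = y0 + 3.0  # the true causal effect is +3

# Random assignment makes the treated and untreated groups comparable,
# so the difference in observed means recovers the average effect we
# could never measure directly for any single unit.
treated = rng.random(n) < 0.5
observed = np.where(treated, y1, y0)

estimate = observed[treated].mean() - observed[~treated].mean()
print(f"true effect: 3.0, estimated effect: {estimate:.2f}")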
The distinction between causation and correlation - and the obsession with making sure the two are not confused - is what sets quality research apart from shoddy or sloppy research. It's incredibly frustrating to read a hastily assembled advocacy report or journalist's account that assumes correlation means causation despite the lack of evidence for such a claim. I understand why it happens; advocates and journalists have to work quickly, and if they talk to people who don't understand the difference, how would they know otherwise? But it's still maddening to see these errors made, especially when they lead to bad policy decisions.
Advocates, what do you think? Do most researchers in your field do a good job of distinguishing between correlation and causation? How could we better work together to make sure that the causes we're identifying are actually the causes of various events?