A recent fracas over a psychology experiment helps illustrate how social science is broken -- and how to fix it.
In 2008, Science, one of the top scientific journals, published a paper by a group of psychologists that claimed to find biological differences between liberals and conservatives. According to the paper, conservatives tended to react more to “sudden noises” and “threatening visual images.” This result implies that political liberalism and conservatism spring from deep, indelible sources rather than from reactions to the issues of the day -- in other words, that polarization will never end, and that the populace will always be divided into two camps, separated by a gulf of biology.
On its face, that claim should sound a bit fishy. The issues dividing liberals and conservatives change -- a century ago, religious fundamentalists argued for more inflation and easy money (as a way of relieving farmers’ debts). Today, they are in a conservative coalition that generally favors hard money. Ideologies and coalitions also differ greatly from country to country -- in most nations, for example, social conservatives tend to favor big government. Yet because research seemed to say that political differences are biological, news outlets started to accept the idea as fact.
Fast forward a decade, though, and the claim is unraveling. In a working paper published this month, another team of psychologists attempted to repeat the experiment and conducted several similar ones. They failed to find any evidence linking physical-threat perception with political ideology. But when they tried to publish their paper, Science desk-rejected it -- that is, the editors refused to even send the paper out for peer review, claiming that the replication study simply wasn’t noteworthy enough to be published in a top journal. Meanwhile, another team of researchers also recently tried to replicate the original study, and failed. So even though the evidence for a biological basis of liberalism and conservatism now appears to have crumbled, it’s unclear whether this fact will make it into the public conversation.
This episode -- which is part of a larger replication crisis plaguing psychology, biology, economics and other fields -- demonstrates a fundamental problem in the way modern science is done. Traditionally, scientific breakthroughs are imagined as pioneering experiments that conclusively discover important scientific truths. Young students learn a litany of such breakthroughs -- the oil drop experiment that determined the charge of the electron, the gold foil experiment that discovered that atoms have nuclei, the Michelson-Morley experiment that found that light always travels at a constant speed, and others. Classical statistics is also optimized to deal with this sort of experiment -- the most commonly used statistical tests are designed to deal with a single test of a single hypothesis.
But modern science has moved away from this model. Although early experimenters in physics and chemistry generally had to build one apparatus to test each hypothesis, modern researchers gather reams of data and run a large number of statistical tests on it. That increases the chance that the researchers will find spurious correlations, especially if they choose which tests to perform based on the results of previous tests. This problem is especially severe for fields like economics and biostatistics that rely on observational data not produced in a lab, since running test after test can be accomplished with the press of a button. But with the cost of running experiments having fallen since the days of Michelson and Morley, fields like biology and psychology now face a similar danger.
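The multiple-testing danger is easy to see in a toy simulation (a sketch for illustration, not drawn from any of the studies discussed here). The code below repeatedly draws two groups from the exact same distribution -- so any "effect" is pure noise -- and counts how often an ordinary t-test nevertheless flags a "significant" difference:

```python
import random

random.seed(0)

def null_experiment(n=30):
    """Two groups drawn from the SAME distribution -- any difference is noise."""
    a = [random.gauss(0, 1) for _ in range(n)]
    b = [random.gauss(0, 1) for _ in range(n)]
    mean_a, mean_b = sum(a) / n, sum(b) / n
    var_a = sum((x - mean_a) ** 2 for x in a) / (n - 1)
    var_b = sum((x - mean_b) ** 2 for x in b) / (n - 1)
    # Welch-style t statistic; |t| > 2 roughly corresponds to p < 0.05 here
    t = (mean_a - mean_b) / ((var_a / n + var_b / n) ** 0.5)
    return abs(t) > 2.0

tests = 100
false_positives = sum(null_experiment() for _ in range(tests))
print(f"{false_positives} of {tests} null tests looked 'significant'")
```

With a 5 percent significance threshold, roughly five of every hundred tests of a true null hypothesis will come up "significant" by chance alone -- so a researcher who runs enough tests is all but guaranteed to find something publishable.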
In other words, the mass production of testable hypotheses in modern science makes it very likely that large numbers of false results will gain wide acceptance in the media. In response, some researchers have suggested more stringent thresholds for statistical significance, while others have called for traditional measures of significance to be discarded entirely.
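One standard version of the "more stringent threshold" idea is the Bonferroni correction, which divides the significance level by the number of tests performed, so that the chance of even one false positive across the whole batch stays near the original level. A minimal sketch (the p-values below are invented for illustration):

```python
def bonferroni_threshold(alpha=0.05, num_tests=1):
    """Per-test cutoff keeping the family-wise false-positive rate near alpha."""
    return alpha / num_tests

# Hypothetical p-values from four tests run on the same data set
p_values = [0.0003, 0.004, 0.03, 0.2]
cutoff = bonferroni_threshold(num_tests=len(p_values))  # 0.05 / 4 = 0.0125
significant = [p for p in p_values if p < cutoff]
print(significant)  # -> [0.0003, 0.004]
```

Note that the two weaker results, which would have cleared an uncorrected 0.05 bar, no longer count as significant once the correction accounts for the number of tests run.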
But neither of these changes would address the fundamental problem, which is that scientific journals are too focused on novel ideas. In the case described above, a paper finding a link between biology and politics was considered worthy of consideration by a top journal; a paper casting doubt on the link was not. In another recent case, some scientists reported that a genetic variant called 5-HTTLPR influenced depression, but even as failed replication efforts piled up, other scientists were getting published and winning acclaim for finding correlations between that variant and all sorts of other social outcomes. The novel findings got more attention than the skeptical follow-ups.
For science to reform itself, this needs to change. Efforts to replicate old research need to be given just as much priority, attention and journal publication as claims of original findings. The old model of science -- a single research team discovering a new phenomenon with a single brilliant experiment -- needs to give way to the idea of science as a vast collective effort, with researchers checking and double-checking each other’s results and methods. And the media, for its part, needs to restrain its impulse to broadcast the latest hot result, and wait for a preponderance of evidence to pile up.
If changes like these are not made, scientists will find themselves facing a mounting crisis of credibility and respectability, as finding after finding turns out to have been a mirage. That in turn will make the public less responsive to science when it really matters, such as with climate change. For science to retain its air of professionalism and authority, there must be more emphasis on replication.