P values, the 'gold standard' of statistical validity, are not as reliable as many scientists assume.
For a brief moment in 2010, Matt Motyl was on the brink of scientific glory: he had discovered that extremists quite literally see the world in black and white.
The results were "plain as day", recalls Motyl, a psychology PhD student at the University of Virginia in Charlottesville. Data from a study of nearly 2,000 people seemed to show that political moderates saw shades of grey more accurately than did either left-wing or right-wing extremists. "The hypothesis was sexy," he says, "and the data provided clear support." The P value, a common index for the strength of evidence, was 0.01, usually interpreted as 'very significant'. Publication in a high-impact journal seemed within Motyl's grasp.
But then reality intervened. Sensitive to controversies over reproducibility, Motyl and his adviser, Brian Nosek, decided to replicate the study. With extra data, the P value came out as 0.59, not even close to the conventional level of significance, 0.05. The effect had disappeared, and with it, Motyl's dreams of youthful fame [1].
It turned out that the problem was not in the data or in Motyl's analyses. It lay in the surprisingly slippery nature of the P value, which is neither as reliable nor as objective as most scientists assume. "P values are not doing their job, because they can't," says Stephen Ziliak, an economist at Roosevelt University in Chicago, Illinois, and a frequent critic of the way statistics are used.
For many scientists, this is especially worrying in light of the reproducibility concerns. In 2005, epidemiologist John Ioannidis of Stanford University in California suggested that most published findings are false [2]; since then, a string of high-profile replication problems has forced scientists to rethink how they evaluate results.
At the same time, statisticians are looking for better ways of thinking about data, to help scientists to avoid missing important information or acting on false alarms. "Change your statistical philosophy and all of a sudden different things become important," says Steven Goodman, a physician and statistician at Stanford. "Then 'laws' handed down from God are no longer handed down from God. They're...