p-Values – think again

The American Statistical Association (ASA) has released a strong and clear statement on the proper use and interpretation of the p-value. 

This is a timely and important announcement. I regularly read and review scientific research articles that lean heavily on the p-value as evidence for the authors’ hypotheses, on the logic that ‘this must be right because p<0.05…’

“The p-value was never intended to be a substitute for scientific reasoning,” said Ron Wasserstein, the ASA’s executive director. “Well-reasoned statistical arguments contain much more than the value of a single number and whether that number exceeds an arbitrary threshold.”

Yet that is exactly how it is being used, for sure.

“Over time it appears the p-value has become a gatekeeper for whether work is publishable, at least in some fields,” said Jessica Utts, ASA president. “This apparent editorial bias leads to the ‘file-drawer effect,’ in which research with statistically significant outcomes are much more likely to get published, while other work that might well be just as important scientifically is never seen in print. It also leads to practices called by such names as ‘p-hacking’ and ‘data dredging’ that emphasize the search for small p-values over other statistical and scientific reasoning.”

Absolutely. This is the problem we now face. If we want to clarify the role of the p-value in our research, we need to educate researchers in the art of scientific reasoning and inference using quantitative methods. As things stand, submitting a manuscript that doesn’t make a big deal of the p-value in support of its major claims is a big gamble, and why would we take that? We tick all the boxes to please the reviewers, right? We’re academics after all! This is why the ASA statement is so important. It’s something that can be cited to justify limited use of the p-value in an article, and it also works as a rebuttal reference in peer review: a polite reminder that “hey, there are other ways to make your claims stronger, and p-values ain’t the best way.”
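To make the ‘p-hacking’ and ‘data dredging’ point concrete, here is a minimal simulation sketch. It is an illustration, not something taken from the ASA statement. It assumes a hypothetical study that compares two groups on 20 unrelated outcomes when no real difference exists, and it uses a simple z-test with known unit variance. The function names and the numbers (20 outcomes, 30 subjects per group, 1000 simulated studies) are all assumptions chosen for the example.

```python
import random
from statistics import NormalDist, mean


def z_test_p_value(a, b):
    """Two-sided p-value for a difference in means, assuming unit variance is known."""
    se = (1 / len(a) + 1 / len(b)) ** 0.5
    z = (mean(a) - mean(b)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def smallest_p_of_k_outcomes(rng, k=20, n=30):
    """Test k unrelated outcomes where NO real effect exists; keep the smallest p."""
    p_values = []
    for _ in range(k):
        # Both groups are drawn from the same distribution, so the null is true.
        a = [rng.gauss(0, 1) for _ in range(n)]
        b = [rng.gauss(0, 1) for _ in range(n)]
        p_values.append(z_test_p_value(a, b))
    return min(p_values)


def share_of_studies_with_a_hit(studies=1000, k=20, n=30, alpha=0.05, seed=1):
    """Fraction of simulated studies that can report at least one p < alpha."""
    rng = random.Random(seed)
    hits = sum(smallest_p_of_k_outcomes(rng, k, n) < alpha for _ in range(studies))
    return hits / studies


if __name__ == "__main__":
    print(share_of_studies_with_a_hit())
```

With independent tests, the chance that at least one of 20 comes in under 0.05 is 1 − 0.95^20, or about 0.64, so the simulated share should land close to that. In other words, a researcher who searches 20 outcomes and reports only the ‘significant’ one will find something to report most of the time, even when nothing is there.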

The statement’s six principles, many of which address misconceptions and misuse of the p-value, are the following:

  1. P-values can indicate how incompatible the data are with a specified statistical model.
  2. P-values do not measure the probability that the studied hypothesis is true, or the probability that the data were produced by random chance alone. (This misreading turns up a lot in data science papers!)
  3. Scientific conclusions and business or policy decisions should not be based only on whether a p-value passes a specific threshold.
  4. Proper inference requires full reporting and transparency.
  5. A p-value, or statistical significance, does not measure the size of an effect or the importance of a result.
  6. By itself, a p-value does not provide a good measure of evidence regarding a model or hypothesis (yet it is regularly used to support exactly such claims).
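Principle 5 is easy to see with a little arithmetic. The sketch below is an illustration under stated assumptions: a two-group comparison, an observed difference `d` measured in standard-deviation units, equal group sizes, and a z-test with known unit variance. The function name `p_value_for_difference` and the example numbers are hypothetical.

```python
from statistics import NormalDist


def p_value_for_difference(d, n_per_group):
    """Two-sided z-test p-value for an observed mean difference d (in SD units)."""
    se = (2 / n_per_group) ** 0.5  # standard error of the difference in means
    z = d / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


if __name__ == "__main__":
    # A trivially small difference measured on a very large sample.
    print(p_value_for_difference(d=0.02, n_per_group=100_000))
    # A sizeable difference measured on a very small sample.
    print(p_value_for_difference(d=0.5, n_per_group=10))
```

Under these assumptions the tiny difference (0.02 standard deviations) comes out far below 0.05, while the difference 25 times larger comes out well above it. The p-value is tracking sample size as much as effect size, which is why it cannot stand in for either the size or the importance of a result.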

It is further suggested that researchers should “emphasize estimation over testing such as confidence, credibility, or prediction intervals; Bayesian methods; alternative measures of evidence such as likelihood ratios or Bayes factors; and other approaches such as decision-theoretic modeling and false discovery rates.”
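As a small sketch of what ‘estimation over testing’ can look like in practice, the snippet below reports a difference in means together with a confidence interval rather than a lone p-value. It is an illustration only: the function name `mean_difference_ci` and the two data lists are made up, and the interval uses a normal approximation (a t-based interval would be more appropriate for samples this small).

```python
from statistics import NormalDist, mean, variance


def mean_difference_ci(a, b, level=0.95):
    """Difference in means with a normal-approximation confidence interval."""
    diff = mean(a) - mean(b)
    # Standard error from the two sample variances (unequal variances allowed).
    se = (variance(a) / len(a) + variance(b) / len(b)) ** 0.5
    z = NormalDist().inv_cdf(0.5 + level / 2)
    return diff, diff - z * se, diff + z * se


if __name__ == "__main__":
    # Hypothetical measurements, invented for this example.
    treated = [5.1, 4.9, 5.6, 5.8, 5.2, 5.5, 4.7, 5.9]
    control = [4.8, 5.0, 4.6, 5.3, 4.9, 5.1, 4.5, 5.2]
    diff, low, high = mean_difference_ci(treated, control)
    print(f"difference = {diff:.3f}, 95% CI = ({low:.3f}, {high:.3f})")
```

The interval tells the reader both how large the effect might plausibly be and how uncertain the estimate is, two things a bare ‘p<0.05’ hides.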

The ASA statement signs off with the following remark, and let’s hope this reaches the masses…

“What we hope will follow is a broad discussion across the scientific community that leads to a more nuanced approach to interpreting, communicating, and using the results of statistical methods in research.”

This is not a new problem, or a new debate. But the ASA saying it out loud will hopefully make people listen up!