What Caught My Eye: Breaking the Seal on P Values
Jay Mehta, MD, MS
The Children’s Hospital of Philadelphia
When I was in medical school, there was a saying: “P = M.D.” In this case, “P” referred to a passing grade and this was usually said late at night while studying for a major test. The point was that all we needed to do was pass the test and we’d be ok in our quest to graduate from medical school. A passing grade was sufficient. There was no need to aim higher than that.
In biomedical research, the lowercase “p” increasingly seems to represent a similarly low bar for achievement. More and more, it seems “p<0.05 = publication = fact.” Last year, a group of researchers published an article in JAMA examining trends in p value reporting over the past 25 years. Through an impressive feat of text mining covering more than 5 million abstracts, they found that from 1990 to 2014 the reporting of p values in abstracts more than doubled, that the vast majority of articles reporting p values had at least one “statistically significant” result, and that, over time, the best (most significant) p value got lower. Very few papers reported effect sizes or confidence intervals.
The authors suggest that this trend toward lower p values could reflect the current “publish-or-perish” environment or the growing tendency to test many hypotheses at once. Perhaps, the authors say, the “p<0.05” threshold has lost its ability to distinguish false from true hypotheses, and a more stringent threshold is needed.
This has ignited an interesting debate among researchers about whether the p value threshold should be lowered. A group of 72 authors from fields as varied as political science and biomedical research has proposed, in a soon-to-be-published Nature Human Behaviour paper, to “change the default P-value threshold for statistical significance for claims of new discoveries from 0.05 to 0.005.”
One issue prompting this proposal is the increasing recognition that many significant biomedical findings aren’t reproducible. A 2015 Science paper that tried to replicate 100 published psychology experiments found that the majority of replications produced weaker evidence than the original studies. John Ioannidis, an author on both the JAMA and Nature Human Behaviour articles, published a seminal 2005 article, “Why Most Published Research Findings Are False.”
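To make that concern concrete, here is a back-of-the-envelope calculation. The numbers below are illustrative assumptions of mine, not figures from any of the cited papers, but they show the style of argument: if only a fraction of the hypotheses we test are actually true, a surprising share of “p<0.05” findings will be false positives, and a 0.005 threshold shrinks that share considerably.

```python
# Illustrative sketch with assumed numbers (not from the cited papers):
# what fraction of "significant" results are expected to be false positives?
def false_discovery_rate(alpha, power, prior_true):
    """Expected share of significant findings that are false positives."""
    true_positives = prior_true * power          # true hypotheses we detect
    false_positives = (1 - prior_true) * alpha   # null hypotheses that slip through
    return false_positives / (true_positives + false_positives)

# Assume 1 in 11 tested hypotheses is true (prior odds of 1:10) and 80% power.
fdr_05 = false_discovery_rate(alpha=0.05, power=0.8, prior_true=1 / 11)
fdr_005 = false_discovery_rate(alpha=0.005, power=0.8, prior_true=1 / 11)
print(f"alpha = 0.05:  ~{fdr_05:.0%} of significant findings are false")
print(f"alpha = 0.005: ~{fdr_005:.0%} of significant findings are false")
```

Under these assumed odds, about 38% of p<0.05 “discoveries” are false positives, versus about 6% at p<0.005. The real fractions depend entirely on the prior odds and power, which is part of why the debate is contentious.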
While I have no interest in jumping into the debate over p values, I wholeheartedly welcome it. I have very little formal training in biostatistics (except a stats class during fellowship taught by a guy who usually looked like he had just rolled out of bed and would seem surprised when he walked into the classroom and we were all there), but it’s very clear to me that many papers in reputable rheumatology journals report “statistically significant” results that are not actually meaningfully different and certainly not clinically significant.
Ultimately, the issue isn’t just that p values are widely misused, but that we often neglect to look beyond the published data and consider how the studies were designed and whether alternative conclusions could be drawn from the data. We should keep a healthy skepticism about everything we read, no matter how prestigious the journal. In the same way we teach our trainees to quickly recognize when a joint doesn’t feel right, they should come out of training able to quickly recognize when the data seem inflamed. The flip side is that in pediatric rheumatology, our studies often have small sample sizes. As a result, we may see an effect size that is large but not significant at the p<0.05 level. We need to be careful not to automatically dismiss those data.
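That last point, that significance and effect size answer different questions, can be seen in a quick simulation. This is a sketch with made-up numbers, not data from any study cited here: a large trial can render a small effect “significant,” while a small trial can miss a large one.

```python
# A sketch using simulated, made-up data: a stdlib-only permutation test
# contrasting a big trial with a small effect against a small trial with
# a big effect. All sample sizes and effect sizes are assumptions.
import random

rng = random.Random(42)

def mean(xs):
    return sum(xs) / len(xs)

def permutation_p(a, b, n_perm=1000):
    """Two-sided permutation p value for a difference in group means."""
    observed = abs(mean(a) - mean(b))
    pooled = list(a) + list(b)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:len(a)]) - mean(pooled[len(a):]))
        if diff >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)  # avoid reporting exactly p = 0

# Big trial, small effect (0.25 SD): very likely "significant," yet the
# difference may be clinically trivial.
big_a = [rng.gauss(0.0, 1.0) for _ in range(1000)]
big_b = [rng.gauss(0.25, 1.0) for _ in range(1000)]

# Small trial, large effect (1 SD): a real, big difference that can
# easily miss p < 0.05.
small_a = [rng.gauss(0.0, 1.0) for _ in range(8)]
small_b = [rng.gauss(1.0, 1.0) for _ in range(8)]

p_big = permutation_p(big_a, big_b)
p_small = permutation_p(small_a, small_b)
print(f"n = 1000/group, 0.25 SD effect: p = {p_big:.3f}")
print(f"n = 8/group,    1 SD effect:    p = {p_small:.3f}")
```

The point isn’t that the small study shows nothing; a confidence interval around that 1 SD effect would be wide but centered on something large, which is exactly the information a bare p value throws away.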
By the way, there’s a reason that every journal article I cited has John Ioannidis as an author. The guy is brilliant, and you would be well served by reading some of his papers on meta-research. If you don’t want to do that, here’s a great Atlantic article about him.
BE A GUEST CONTRIBUTOR!
If you would like to contribute an article (related to pediatric rheumatology research) to the What Caught our Eye column, we would love to hear from you! Please email [email protected] with details.