Why most published results are false

John Ioannidis published a very interesting paper in PLoS Biology in 2005 entitled "Why most published research findings are false." In it he argued that most affirmative results in biology papers that rest on a statistical significance test (e.g. a p-value less than 0.05) are probably wrong. His argument was couched in traditional statistics language, but it is really a Bayesian argument. The paper is a wake-up call that we may need to look more closely at how we use statistics and even how we do research.

The question he asked was: given some hypothesis, what is the probability that the hypothesis is true given that an experiment confirms the result (up to some level of statistical significance)? Let P(T|Y) be the probability that the hypothesis is true given a "yes" answer from an experiment, P(T) be the prior probability that the hypothesis is true, P(Y|T) be the probability of getting a yes answer if the hypothesis is true, and P(Y) be the probability of getting a yes answer under all conditions. Then by Bayes' theorem, P(T|Y) = P(Y|T)P(T)/P(Y), where P(Y) = P(Y|T)P(T) + P(Y|F)P(F) and P(F) = 1 - P(T).

We can then compute P(T|Y) if we have P(T), P(Y|T), and P(Y|F). P(T) is the prior probability, based on everything you know or don't know. Ioannidis writes it as P(T) = R/(1+R), where R is the odds of the hypothesis being true versus it being false. In these terms, the likelihood P(Y|T) is called the power of the experiment or study and is usually written as 1-\beta, where \beta is the false negative probability, or Type II error rate. P(Y|F) is the false positive probability, or Type I error rate, denoted by \alpha. Putting this all together gives

P(T|Y) = (1-\beta)R/((1-\beta)R + \alpha).

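As a quick numerical check, here is a short Python sketch of the formula above (the function name and example parameter values are my own, not from the paper):

```python
def prob_true_given_yes(R, alpha, beta):
    """Posterior probability that a hypothesis is true given a positive result.

    R     -- prior odds that the hypothesis is true, P(T)/P(F)
    alpha -- false positive rate (Type I error)
    beta  -- false negative rate (Type II error); power is 1 - beta
    """
    return (1 - beta) * R / ((1 - beta) * R + alpha)

# A long-shot hypothesis (prior odds 1 to 10) tested at alpha = 0.05
# with power 0.8: a "significant" result is right only ~62% of the time.
print(prob_true_given_yes(R=0.1, alpha=0.05, beta=0.2))  # ≈ 0.615
```

Even with conventional significance and decent power, a positive result for an unlikely hypothesis is far from a sure thing.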
Often it is more convenient to consider the odds of being true versus being false: P(T|Y)/P(F|Y) = (1-\beta)R/\alpha. So the odds of a hypothesis being true given a "statistically significant" result exceed one only if (1-\beta)R > \alpha; increasing power and lowering the false positive rate are always good things. But the interesting thing to me (which is obvious in retrospect) is that even if you have perfect power, you can still be more likely wrong than right if your false positive rate is higher than your prior odds of being correct. This is made even worse if there are biases, and Ioannidis plugs in typical parameter values to argue that most published papers must be false.
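To see this numerically, here is a small illustrative sketch (the function name and numbers are mine) showing that even perfect power cannot rescue a sufficiently unlikely hypothesis:

```python
def posterior_odds(R, alpha, beta):
    # Odds of the hypothesis being true given a positive result:
    # P(T|Y)/P(F|Y) = (1 - beta) * R / alpha
    return (1 - beta) * R / alpha

# Perfect power (beta = 0), but the prior odds R = 0.01 are below
# alpha = 0.05, so a positive result still leaves odds well under 1.
print(posterior_odds(R=0.01, alpha=0.05, beta=0.0))  # ≈ 0.2
```

With odds of 0.2, a hypothesis that passes the significance test is still five times more likely to be false than true.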

What was most illuminating to me is that having many independent labs working on the same topic actually makes a positive result less likely to be correct. The reason is that if n labs work independently, then P(Y|T) = 1-\beta^n (i.e., the false negative rate goes down), but also P(Y|F) = 1-(1-\alpha)^n (i.e., the false positive rate goes up). If many labs work on the same thing and don't cooperate, then the probability of getting a yes result goes up, since the probability that everyone gets a negative result goes down. (This is the same problem you have if you do an experiment and don't control for the number of effective hypotheses tested; for example, see here.) Hence the odds in the multiple-labs case are (1-\beta^n)R/(1-(1-\alpha)^n), which goes to R as n goes to infinity. Thus, an infinite number of labs working on the same problem does not improve on the prior odds. So the next time you get rejected by a high impact journal because your work is not of sufficient interest, you can take consolation in the fact that your probability of being wrong just decreased.
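The multiple-labs effect can be checked the same way; this sketch (with illustrative parameter values of my choosing) shows the posterior odds decaying from (1-\beta)R/\alpha toward the prior odds R as the number of independent labs grows:

```python
def posterior_odds_n_labs(R, alpha, beta, n):
    # With n independent labs, "yes" means at least one positive result:
    # P(Y|T) = 1 - beta**n and P(Y|F) = 1 - (1 - alpha)**n
    return (1 - beta**n) * R / (1 - (1 - alpha)**n)

# With R = 0.5, alpha = 0.05, beta = 0.2, the odds start at
# (1 - beta) * R / alpha = 8 for one lab and decay toward R = 0.5.
for n in (1, 2, 10, 100, 1000):
    print(n, posterior_odds_n_labs(0.5, 0.05, 0.2, n))
```

The decay is exactly the multiple-comparisons problem in disguise: each additional uncoordinated lab is another draw from the false positive lottery.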
