I thank Goodman and Greenland for their interesting comments on my article. Our methods and results are practically identical. However, some of my arguments are misrepresented:
I did not claim that no study or combination of studies can ever provide convincing evidence. In the illustrative examples (Table 4), there is a wide credibility gradient (0.1% to 85%) for different research designs and settings.
I did not assume that all significant p-values are around 0.05. Tables 1–3 and the respective positive predictive value (PPV) equations can use any p-value (alpha). Nevertheless, the p = 0.05 threshold is unfortunately entrenched in many scientific fields. Almost half of the positive findings in recent observational studies have p-values of 0.01–0.05; most positive trials and meta-analyses also have modest p-values.
I provided equations for calculating the credibility of research findings with or without bias. Even without any bias, PPV probably remains below 0.50 for most non-randomized, non-large-scale circumstances. Large trials and meta-analyses represent a minority of the literature.
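As a minimal sketch of the no-bias case, the following Python snippet evaluates PPV for an arbitrary alpha. It assumes the no-bias formula from the original article, PPV = (1 - beta)R / (R - beta R + alpha), which this reply refers to but does not restate. The function name and the example values of R, alpha, and beta are illustrative assumptions, not figures taken from the tables.

```python
def ppv_no_bias(R, alpha=0.05, beta=0.2):
    """Post-study probability that a claimed finding is true, with no bias.

    R     : pre-study odds that the probed relationship is true
    alpha : type I error rate (the significance threshold in use)
    beta  : type II error rate (1 - power)

    Assumed formula (from the original article, not this reply):
        PPV = (1 - beta) * R / (R - beta * R + alpha)
    """
    return (1 - beta) * R / (R - beta * R + alpha)


if __name__ == "__main__":
    # Hypothetical settings: the same low-odds, low-power question
    # evaluated at several significance thresholds.
    for alpha in (0.05, 0.01, 0.001):
        print(alpha, round(ppv_no_bias(R=0.1, alpha=alpha, beta=0.8), 3))
```

Under these assumed inputs, tightening alpha raises PPV, while low pre-study odds combined with low power keep it below 0.50 at the conventional 0.05 threshold.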
Figure 1 shows that bias can indeed make a difference. The proposed modeling has an additional useful feature: As type I and II errors decrease, PPV(max) = 1 - u/(R + u), meaning that to allow a research finding to become more than 50% credible, we must first reduce bias at least below the pre-study odds of truth (u less than R). Numerous studies demonstrate the strong presence of bias across research designs: indicative reference lists appear in . We should understand bias and minimize it, not ignore it.
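The limiting behavior described above can be checked numerically. The sketch below assumes the with-bias PPV formula from the original article, PPV = ([1 - beta]R + u beta R) / (R + alpha - beta R + u - u alpha + u beta R), where u is the proportion of analyses that would not otherwise have been positive findings but are reported as such because of bias. The function names and input values are hypothetical.

```python
def ppv_with_bias(R, u, alpha=0.05, beta=0.2):
    """PPV in the presence of bias u (assumed formula from the original article).

    R     : pre-study odds of a true relationship
    u     : bias, as a proportion between 0 and 1
    alpha : type I error rate
    beta  : type II error rate
    """
    numerator = (1 - beta) * R + u * beta * R
    denominator = R + alpha - beta * R + u - u * alpha + u * beta * R
    return numerator / denominator


def ppv_max(R, u):
    """Limit of ppv_with_bias as alpha and beta go to zero: 1 - u / (R + u)."""
    return 1 - u / (R + u)


if __name__ == "__main__":
    R = 0.2  # hypothetical pre-study odds
    for u in (0.1, 0.2, 0.3):
        # With alpha = beta = 0 the full formula reduces to R / (R + u).
        print(u, round(ppv_with_bias(R, u, alpha=0.0, beta=0.0), 3),
              round(ppv_max(R, u), 3))
```

Setting alpha and beta to zero reduces the full expression to R / (R + u), which exceeds 0.50 only when u is smaller than R.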
Hot fields: Table 3 and Figure 2 present the probability that at least one study, among several done on the same question, claims a statistically significant research finding. They are not erroneous. Fields with many furtive competing teams may espouse significance-chasing behaviors, selectively highlighting positive results. Conversely, having many teams with transparent availability of all results and integration of data across teams leads to genuine progress. We need replication, not just discovery.
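The "at least one significant study" probability can be sketched as follows. The snippet assumes n independent studies of equal power on the same question and no bias, and uses the multiple-study PPV formula from the original article, PPV = R(1 - beta^n) / (R + 1 - [1 - alpha]^n - R beta^n). The function names and example inputs are illustrative assumptions.

```python
def prob_at_least_one_significant(n, alpha=0.05, beta=0.2, relationship_true=False):
    """Probability that at least one of n independent studies is significant.

    If the relationship is true, each study misses it with probability beta;
    if it is null, each study is falsely positive with probability alpha.
    """
    if relationship_true:
        return 1 - beta ** n
    return 1 - (1 - alpha) ** n


def ppv_multiple_studies(R, n, alpha=0.05, beta=0.2):
    """PPV when a finding is claimed as soon as any one of n studies is positive.

    Assumed formula (original article, equal power, independence, no bias):
        PPV = R(1 - beta^n) / (R + 1 - (1 - alpha)^n - R * beta^n)
    """
    return R * (1 - beta ** n) / (R + 1 - (1 - alpha) ** n - R * beta ** n)


if __name__ == "__main__":
    R = 0.1  # hypothetical pre-study odds
    for n in (1, 5, 10):
        print(n,
              round(prob_at_least_one_significant(n), 3),
              round(ppv_multiple_studies(R, n), 3))
```

With one study the expression reduces to the single-study PPV. As n grows, the chance of at least one false-positive claim rises, and PPV falls if only the isolated positive result is highlighted.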
The claim by two leading Bayesian methodologists that a Bayesian approach is somewhat circular and questionable contradicts Greenland's own writings: "One misconception (of many) about Bayesian analyses is that prior distributions introduce assumptions that are more questionable than assumptions made by frequentist methods."
Empirical data on the refutation rates for various research designs agree with the estimates obtained in the proposed modeling, not with estimates ignoring bias. Additional empirical research on these fronts would be very useful.
Scientific investigation is the noblest pursuit. I think we can improve the respect of the public for researchers by showing how difficult success is. Confidence in the research enterprise is probably undermined primarily when we claim that discoveries are more certain than they really are, and then the public, scientists, and patients suffer the painful refutations.