January 2, 2026

The Rumored Shift to a One-Trial Standard for FDA Substantial Evidence

In recent public discussion, FDA leaders have indicated a possible shift from the longstanding two-trial requirement for substantial evidence of drug efficacy to acceptance of a single, highly stringent trial. On the "In the Interim..." podcast, Dr. Scott Berry and Dr. Kert Viele analyze the statistical, regulatory, and scientific implications of such a change; this blog highlights the key points of that discussion.

Historical Context and Foundations of the Substantial Evidence Standard

The 1962 amendments to the Federal Food, Drug, and Cosmetic Act introduced a statutory requirement for "substantial evidence" of effectiveness for US drug approval. Prior to this, efficacy data were not required for a drug to enter the market. The term “substantial evidence” lacked a fixed definition in law, leading to evolving interpretation through regulatory guidance and professional consensus.

FDA’s approach, informed by 21 CFR 314.126 and decades of agency practice, ultimately solidified an expectation (rather than a strict regulatory mandate) that two independent, adequate, and well-controlled clinical investigations are necessary for substantial evidence. Over time, this has become the de facto standard. Most commonly, this meant two Phase III trials, each required to demonstrate efficacy at a conventional significance threshold (two-sided alpha of 0.05, or equivalently one-sided alpha of 0.025 for superiority testing).

This two-trial expectation has provided a strong bar, minimizing the risk of false efficacy claims and rarely resulting in the subsequent removal of approved drugs for lack of effectiveness. Withdrawals, when they occur, are almost always related to safety, not efficacy. The process has worked well from both a scientific and public health standpoint. The key concern for both regulators and statisticians has been to maintain this high evidentiary bar, as ineffective drugs on the market can harm patients, distort further research, and impede scientific progress.

Statistical Implications of Moving from Two Trials to One

The rumored FDA shift, discussed in recent months and highlighted by public statements from senior leadership, raises critical questions about scientific design and statistical integrity. The potential move to a single-trial model is not simply a matter of reducing effort; it is a fundamental alteration in how evidence is quantified and interpreted.

As explained by Dr. Scott Berry and Dr. Kert Viele, the requirement of two independent trials, each with a one-sided p-value below 0.025, corresponds to an overall type I error rate of 0.025 × 0.025 = 0.000625. A single trial can provide the same level of evidence only if it is held to that much stricter significance threshold, p < 0.000625. If this higher bar is used, the overall statistical protection against false positives is mathematically equivalent. Interestingly, this adjustment allows an approximate 20% reduction in total sample size for equivalent program power compared with two trials, if and only if the stricter p-value cutoff is used and both statistical and scientific standards are maintained.
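A minimal sketch of this arithmetic is shown below, using standard normal approximations. The per-trial power of 90% and the resulting 81% program power are illustrative assumptions, not figures from the podcast; the calculation simply shows why matching the two-trial error rate with a single trial at p < 0.000625 reduces the required total sample size by roughly 20%.

```python
# Sketch: total sample size for a two-trial program versus a single trial at the
# matched type I error rate (0.025^2 = 0.000625). Per-trial power of 90% is an
# illustrative assumption; sample size is expressed on a relative scale.
from scipy.stats import norm

alpha_per_trial = 0.025                  # one-sided threshold for each trial
power_per_trial = 0.90                   # assumed power of each trial
program_power = power_per_trial ** 2     # both trials must succeed: 0.81

def n_scale(alpha, power):
    # For a fixed effect size, required n is proportional to (z_{1-alpha} + z_{power})^2.
    return (norm.ppf(1 - alpha) + norm.ppf(power)) ** 2

two_trial_total = 2 * n_scale(alpha_per_trial, power_per_trial)

alpha_single = alpha_per_trial ** 2      # 0.000625, the matched threshold
single_trial_total = n_scale(alpha_single, program_power)

reduction = 1 - single_trial_total / two_trial_total
print(f"Matched single-trial alpha: {alpha_single}")
print(f"Total sample size reduction: {reduction:.1%}")   # roughly 20%
```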

Scott Berry emphasizes that arbitrarily dividing a development program into “red” and “blue” trials, identical in design, sites, and populations, is less scientifically informative than pooling all data and analyzing them as a totality. Combining the two trials can result in better statistical inferences and more efficient allocation of patient resources. If, however, the new one-trial requirement maintains the usual nominal 0.025 level, this lowering of the bar introduces real risks.
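The efficiency point can be seen with the same normal approximation: holding the total sample size fixed, a single pooled analysis at the matched threshold of 0.000625 has higher power than requiring two identically designed half-size trials to each reach p < 0.025. The pooled mean z-statistic of 4.1 below is an illustrative assumption, not a figure from the podcast.

```python
# Sketch: power of "both half-size trials significant at 0.025" versus a single
# pooled analysis of the same patients at the matched alpha of 0.000625.
# The assumed pooled mean z-statistic (4.1) is illustrative only.
from scipy.stats import norm

alpha_pair = 0.025                       # one-sided threshold for each half-trial
alpha_pooled = alpha_pair ** 2           # 0.000625, matched single-trial threshold

z_pooled_mean = 4.1                      # assumed mean z-statistic when all data are pooled
z_half_mean = z_pooled_mean / 2 ** 0.5   # each half-size trial sees a z smaller by sqrt(2)

power_two_trials = norm.cdf(z_half_mean - norm.ppf(1 - alpha_pair)) ** 2
power_pooled = norm.cdf(z_pooled_mean - norm.ppf(1 - alpha_pooled))

print(f"P(both half-trials reach p < 0.025):     {power_two_trials:.2f}")  # about 0.68
print(f"P(pooled analysis reaches p < 0.000625): {power_pooled:.2f}")      # about 0.81
```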

Breadth, Generalizability, and Practical Trade-Offs

A significant trade-off of the single-trial rule, as noted throughout the podcast episode, is the loss of clinical breadth. Two independent trials can provide evidence across multiple populations, varying concomitant medications, differing geographies, or subgroups with distinct potential responses. In diseases with heterogeneity or varied standards of care, this breadth is crucial for supporting wider clinical claims and ensuring a robust, generalizable label.

The single-trial model may force sponsors to design trials with broader inclusion criteria to recover some of this generalizability. However, actual success in capturing clinical diversity depends on intentional design from the outset. One narrow trial, no matter how well powered, cannot substitute for wide clinical applicability unless those elements are embedded in the protocol.

Berry and Viele note the role of regulatory labeling: drug approvals tied to narrowly defined populations will restrict indicated uses, potentially leaving clinicians to extrapolate to broader uses, or requiring sponsors to conduct “label expansion” trials in broader populations after initial approval. From a development strategy perspective, the change could accelerate time to market for some therapies but place more emphasis on post-approval research and adaptive, multi-cohort (basket) trial architectures.

Bayesian Approaches, Analytical Future, and Real-World Consequences

Beyond frequentist constructs, Berry and Viele discuss the relevance of Bayesian frameworks. Bayesian methods, by formally incorporating prior data and explicitly modeling the probability of benefit, offer a structured way to weigh the totality of evidence. However, the current discussion around FDA guidance remains anchored in frequentist error rates. Broad adoption of Bayesian analysis, while methodologically robust, is not anticipated in the immediate regulatory changes, but that does not stop Berry and Viele from imagining Bayesian thinking as a future evolution.
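As a rough illustration of what “explicitly modeling the probability of benefit” means, the sketch below performs a simple normal-normal conjugate update and reports the posterior probability that the treatment effect is positive. The prior, observed effect, and standard error are all hypothetical values chosen for illustration, not a method described on the podcast or endorsed by FDA.

```python
# Illustrative sketch only: a normal-normal conjugate Bayesian update giving the
# posterior probability that the treatment effect exceeds zero. All numbers are
# hypothetical and chosen purely for illustration.
from scipy.stats import norm

prior_mean, prior_sd = 0.0, 10.0      # weakly informative prior on the effect
obs_effect, obs_se = 2.0, 0.9         # observed effect estimate and its standard error

# Conjugate update: precisions add, posterior mean is a precision-weighted average.
post_precision = 1 / prior_sd**2 + 1 / obs_se**2
post_var = 1 / post_precision
post_mean = post_var * (prior_mean / prior_sd**2 + obs_effect / obs_se**2)

prob_benefit = 1 - norm.cdf(0.0, loc=post_mean, scale=post_var**0.5)
print(f"Posterior probability of benefit: {prob_benefit:.3f}")
```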

Additionally, Scott Berry and Kert Viele raise questions about varying standards by therapeutic context. The optimal p-value threshold, they argue, should consider risk/benefit based on factors like disease severity, unmet medical need, and downstream patient impact. For fatal, rare diseases, a looser evidence bar may sometimes be justified, while for high-prevalence, lower-risk conditions, standards should be stricter. While they believe that varying standards by context is good science, applying it at the regulatory level will be challenging.

Conclusion

The rumored move to a single-trial standard for demonstrating substantial evidence, if defined with an appropriately stringent p-value threshold, can be both efficient and scientifically rigorous. As Scott Berry and Kert Viele caution, any weakening of the statistical requirements would diminish the public health protections embedded in current practice. A one-trial rule that demands equivalently strong evidence is scientifically defensible and potentially more efficient, but it risks losing the breadth of exploration and clinical relevance that two trials can provide. It’s exciting to think that future scientific evolutions may incorporate Bayesian decision making, but for now, the expectation is that any new rules maintain today’s high evidentiary standards.
