October 3, 2025

Promising Zone Adaptive Designs in Phase III Trials

Promising zone adaptive sample size designs appear compelling in theory, but in every simulated and practical comparison we have run, group sequential trials outperform these methods in efficiency and power for confirmatory trials.

Adaptive Sample Size in Phase III Trial Decision-Making

Adaptive sample size strategies remain a persistent topic in Phase III, two-arm randomized trials. These pivotal settings force sponsors to balance the need for sufficient power against the risk of over-enrolling patients. The following scenario is typical (the sample sizes are illustrative): a fixed design of 300 subjects, randomized one-to-one, provides 80% power to detect a mean difference in outcome of 4 with a standard deviation of 12. If the true treatment effect is smaller, say a difference of 3, the sample size required for the same power grows to approximately 500. Sponsors find themselves gambling between the efficiency of a smaller trial and the insurance of a larger one.
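The arithmetic behind those numbers is the standard normal-approximation sample size formula for comparing two means. Here is a minimal sketch in Python (the function name and defaults are ours, not from any particular trial package):

```python
from scipy.stats import norm

def two_arm_total_n(delta, sd, alpha=0.05, power=0.80):
    """Total sample size (1:1 randomization) for a two-sample z-test on a
    difference in means, using the standard normal approximation."""
    z_alpha = norm.ppf(1 - alpha / 2)  # two-sided significance level
    z_beta = norm.ppf(power)
    n_per_arm = 2 * ((z_alpha + z_beta) * sd / delta) ** 2
    return 2 * n_per_arm

print(two_arm_total_n(delta=4, sd=12))  # ~283 total, planned as 300
print(two_arm_total_n(delta=3, sd=12))  # ~502 total, roughly the 500 above
```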

Regulatory agencies demand rigorous control of Type 1 error for Phase III confirmatory trials. Adaptive approaches, regardless of their mathematical novelty, are scrutinized for both their statistical integrity and resulting efficiency. The question is always empirical: does a given adaptation strategy reliably improve trial performance—reducing patient exposure, timeline, or cost—while delivering the robust power and error rates expected by stakeholders and regulators?

Promising Zone Design Mechanics and Practical Appeal

The promising zone design, first characterized by Mehta and Pocock (2010), uses a neat property of conditional power that allows the sample size to be extended with no adjustment to the alpha level. The design is straightforward in concept: after accruing data on an initial subsample (e.g., 200 of 300 planned subjects), the interim analysis classifies results into three regions (a minimal sketch of the decision rule follows the list):

  • High conditional power (e.g., above 0.9): The trial is likely to succeed and proceeds to the originally planned sample size (300).

  • Intermediate or “promising” conditional power (e.g., 0.4 to 0.9): Sample size may be increased up to a defined maximum (such as from 300 to 500); crucially, the original Type 1 error allocation is maintained without formal adjustment at the final analysis. The common language for this in the field is that no “alpha is spent” at the interim.

  • Low conditional power (e.g., below 0.4): Sample size cannot be increased, and the trial continues to its planned endpoint with the original enrollment target.
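Below is a minimal sketch of that rule in Python. The conditional power formula assumes the observed interim trend continues; the simple jump-to-maximum rule mirrors the description above, whereas production implementations typically solve for the smallest sample size restoring a target conditional power. All names and defaults here are illustrative:

```python
import numpy as np
from scipy.stats import norm

def conditional_power(z1, t, z_crit=1.96):
    """Probability of final success given interim z-statistic z1 at
    information fraction t, assuming the observed trend continues."""
    drift = z1 * np.sqrt((1 - t) / t)                     # expected incremental z
    needed = (z_crit - np.sqrt(t) * z1) / np.sqrt(1 - t)  # incremental z required to win
    return norm.cdf(drift - needed)

def promising_zone_decision(z1, t=200/300, n_planned=300, n_max=500,
                            cp_low=0.4, cp_high=0.9):
    """Classify the interim result and return the final sample size."""
    cp = conditional_power(z1, t)
    if cp >= cp_high:
        return n_planned, "favorable"    # likely to win as planned
    if cp >= cp_low:
        return n_max, "promising"        # extend, no alpha adjustment
    return n_planned, "unfavorable"      # no increase permitted
```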

This design’s main attraction is tangential to trial performance: the set thresholds and action rules produce a property where, for the specified bounds, increasing the sample size does not inflate Type 1 error. Many sponsors adopt this model out of caution, wary of traditional alpha allocation strategies and hoping to preserve as much Type 1 error as possible for the final test. The fundamental question remains whether such planning genuinely produces more efficient or successful trials.

Simulation Results and Analytical Insights

Repeated simulation and comparison at Berry Consultants show unclear advantages over fixed sample sizes, and consistently worse performance than standard group sequential designs. Consider a trial planning for 300 patients: under a promising zone rule, the average sample size increases to about 342 for only a 4 to 5% gain in power at the target effect (e.g., a true mean difference of 4). This power gain is almost identical to that of a fixed sample size of 342. When the true effect drops to 3, average sample size rises by more than 50 subjects, but power gains remain marginal. Notably, among scenarios entering the promising zone, more than half the trials would have been successful with the original 300 patients. Because conditional power must already be fairly high to trigger an increase, extensions land disproportionately on trials that would have won anyway, while some trials that extend are lost at the larger sample size despite results that would have succeeded at the original enrollment. The net is, at best, a marginal efficiency benefit over fixed sample size designs.
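For readers who want to reproduce this kind of comparison, here is a simplified Monte Carlo sketch under idealized assumptions (normal outcomes, known SD, a single interim, and the conventional unadjusted final test). It reuses the conditional_power helper from the sketch above; exact power and average-N values will differ somewhat from full trial simulations:

```python
import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(2025)

def run(delta, sd=12.0, n1=200, n_plan=300, n_max=500,
        cp_low=0.4, cp_high=0.9, sims=200_000, z_crit=1.96):
    se = lambda n: 2 * sd / np.sqrt(n)           # SE of mean difference, 1:1 allocation
    d1 = rng.normal(delta, se(n1), sims)         # interim estimates of the difference
    cp = conditional_power(d1 / se(n1), n1 / n_plan, z_crit)
    n_final = np.where((cp >= cp_low) & (cp < cp_high), n_max, n_plan)
    d_inc = rng.normal(delta, se(n_final - n1))  # estimates from the added patients
    d_all = (n1 * d1 + (n_final - n1) * d_inc) / n_final
    wins = d_all / se(n_final) > z_crit          # conventional unadjusted final test
    return wins.mean(), n_final.mean()

for d in (4.0, 3.0):
    power, avg_n = run(d)
    print(f"true effect {d}: power ~{power:.2f}, average N ~{avg_n:.0f}")
```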

In contrast, group sequential designs, executed using established spending functions such as Kim-DeMets, consistently deliver performance superior to the promising zone design. For the same baseline setting, power at the target effect increases from 82% in the fixed design to 95% with group sequential adaptation, with no material increase in average sample size. For scenarios with true effects near 3, group sequential rules approach the performance of a 500-patient fixed trial, yet with average sample sizes well below the maximum. These results hold under realistic constraints, such as incomplete follow-up at interim looks in trials with long endpoints, by employing Goldilocks group sequential approaches.
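To make the spending idea concrete, here is a small sketch of a Kim-DeMets power-family spending function, which spends cumulative alpha as α·t^ρ at information fraction t, for a one-sided design with a single interim plus a final look. The final boundary is solved numerically under the canonical joint normal model; the parameter choices (ρ = 2, one-sided α = 0.025, interim at 200 of 300) are illustrative, not a recommendation:

```python
import numpy as np
from scipy.stats import norm, multivariate_normal
from scipy.optimize import brentq

def kim_demets(t, alpha=0.025, rho=2.0):
    """Kim-DeMets power-family spending: cumulative alpha spent by fraction t."""
    return alpha * t ** rho

def two_look_bounds(t1, alpha=0.025, rho=2.0):
    """Efficacy boundaries (z scale) for one interim at information fraction t1
    plus the final analysis, under the canonical joint normal model."""
    a1 = kim_demets(t1, alpha, rho)
    c1 = norm.ppf(1 - a1)            # interim efficacy boundary
    corr = np.sqrt(t1)               # Corr(Z1, Z2) = sqrt(t1 / 1)
    mvn = multivariate_normal(mean=[0, 0], cov=[[1, corr], [corr, 1]])

    def residual(c2):
        # P(Z1 < c1, Z2 >= c2) under the null, minus the alpha left to spend
        return (norm.cdf(c1) - mvn.cdf([c1, c2])) - (alpha - a1)

    c2 = brentq(residual, 1.5, 4.0)  # final boundary spending the remainder
    return c1, c2

c1, c2 = two_look_bounds(t1=200/300)
print(f"interim boundary z = {c1:.3f}, final boundary z = {c2:.3f}")
```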

Group sequential rules allocate Type 1 error openly across interim and final analyses. Despite common sponsor apprehension about “spending” alpha, Berry Consultants’ simulations and operational experience have never identified a setting where promising zone methods outperform optimized group sequential trials, whether in efficiency, power, or resource allocation. The thresholds (conditional power 0.4 and 0.9) used in promising zone designs are determined strictly by the mathematics of error allocation, not by drug development goals. The deeper issue is that the mathematical trick behind conditional power forces the sample size decision to occur before the planned sample size is even reached. The analogy of the tail wagging the dog is apt here.

Simulate, Compare, Iterate for Adaptive Trials

When entertaining adaptive sample sizes in a confirmatory trial, don’t be swayed by the siren song of avoiding alpha allocation. Simulate the designs, compare strategies, and evaluate the performance of different approaches to optimize the trial design. Promising zone constructs offer mathematical cleverness but have not delivered efficiency or improved trial designs. Planning for adaptive trials should be grounded in robust simulation, scenario testing, and design comparison, not in the preservation of Type 1 error for its own sake or in clever but arbitrary interim rules. This clarity leads to adaptive designs that are measurably better for sponsors, regulators, and patients alike.
