February 27, 2026

Technical Realities of Ordinal Endpoint Analysis in Clinical Trials

A rigorous review of ordinal endpoint analyses, showing every approach—utility weighting, proportional odds, dichotomization, or non-parametric—inevitably assigns relative weights to outcome states. Berry Consultants’ mathematical demonstration reveals how proportional odds analysis embeds prevalence-based weights, underscoring the need for transparency and clinical input in trial design.

Ordinal Endpoints: Structure, Rationale, and Analytical Demands

Ordinal endpoints capture the layered reality of clinical outcomes, recognizing that not all patient states fit binary categories. The Modified Rankin Scale (mRS) score illustrates this: in stroke trials, mRS 0 means no neurological symptoms; mRS 6 means death; intermediate values mark increments in disability and loss of independence. Clinicians and patients value these distinctions. Trials for COVID-19 and other intensive-care unit syndromes now use similarly granular endpoints—for example, organ support-free days—tracking nuanced transitions between states like mortality, ventilation, discharge, and recovery. Ordinal endpoints are ubiquitous.

This structure offers more information but brings statistical complexity. Analysts must determine how to meaningfully compare scores across the spectrum. In practice, their choice of method—utility weighting, proportional odds modeling, dichotomization, or broader rank-based approaches—fundamentally dictates trial conclusions.

Assigning Weights: Utility Weighting, Proportional Odds, and Hidden Model Choices

Utility weighting addresses the problem directly. Each state is assigned a quantitative value reflecting clinical or patient preferences, frequently normalized between 0 and 1. In the DAWN trial, Berry Consultants relied on independent patient-reported outcomes and economic studies to derive weights for each mRS state. The shape of the utility curve revealed the greatest decrements occur with transitions to more severe disability, with nearly identical utility assignments in both patient and cost studies. The statistical analysis then compares group mean utility using simple t-tests, leveraging the central limit theorem rather than distributional or proportionality assumptions. The only true assumption is the appropriateness of the assigned utilities.
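The utility-weighted comparison described above can be sketched in a few lines. This is a minimal illustration, not the DAWN analysis: the utilities in `UTILITY` and the patient counts are made up, and the function names are hypothetical. Each patient's mRS score is mapped to a utility, the arm means are compared with a Welch-type statistic, and the p-value uses a normal approximation in the spirit of the central limit theorem argument.

```python
from math import sqrt
from statistics import NormalDist, mean, variance

# Hypothetical utilities for mRS 0..6 (illustrative only, not the DAWN weights).
UTILITY = [1.0, 0.9, 0.75, 0.65, 0.35, 0.0, 0.0]

def expand(counts):
    """Turn counts per mRS state (index 0..6) into one utility per patient."""
    return [UTILITY[state] for state, n in enumerate(counts) for _ in range(n)]

def compare_mean_utility(treated_counts, control_counts):
    """Compare mean utility between arms; p-value from a normal approximation."""
    t, c = expand(treated_counts), expand(control_counts)
    diff = mean(t) - mean(c)
    # Welch-type standard error: no equal-variance assumption.
    se = sqrt(variance(t) / len(t) + variance(c) / len(c))
    z = diff / se
    p_two_sided = 2 * (1 - NormalDist().cdf(abs(z)))
    return diff, z, p_two_sided

# Made-up numbers of patients in mRS 0..6 for each arm.
treated = [12, 18, 20, 15, 15, 8, 12]
control = [5, 10, 14, 16, 22, 13, 20]

diff, z, p = compare_mean_utility(treated, control)
print(f"mean utility difference = {diff:.3f}, z = {z:.2f}, p = {p:.4f}")
```

Changing the entries of `UTILITY` changes the estimated treatment effect, which is the point: the weights are the one assumption, and here they are visible and open to challenge.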

Critics of utility weighting argue that not all patients or clinicians would select the same values for these states. Berry Consultants recognize this variation but emphasize the benefit of explicit documentation and clinical justification over hidden model-driven decisions.

Proportional odds models are frequently presented as relying only on the ordering of the scale, that is, on the assumption of ordinality. However, as Lindsay Berry demonstrated algebraically, this is not the case. By re-expressing the proportional odds score statistic, Berry Consultants’ statisticians showed that the model mathematically assigns prevalence-based weights to each ordinal state in the analysis. Equal prevalence across categories leads to equal weights, while uneven distributions focus the model’s inference on shifts in more common categories. Thus, the proportional odds approach participates in the same valuation process as utility weighting, only it does so implicitly and without clinical rationale or transparency.
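One standard way to see the prevalence dependence, which may differ in detail from the re-expression Berry Consultants presented, is through the score test of the proportional odds model. For a two-arm comparison under the null, that test compares arms on a score for each category that is, up to sign and scale, the centered mid-rank F_{k-1} + F_k - 1, where F_k is the pooled cumulative proportion through category k. The gap between adjacent categories k and k+1 is then p_k + p_{k+1}, the sum of their pooled prevalences. The sketch below computes these implied scores; the function name and the counts are hypothetical.

```python
def implied_po_scores(pooled_counts):
    """Score-test scores implied by pooled category prevalences.

    For category k with pooled cumulative proportion F_k, the score is
    F_{k-1} + F_k - 1 (a centered mid-rank). Returns (scores, gaps),
    where gaps[k] = scores[k+1] - scores[k] = p_k + p_{k+1}.
    """
    total = sum(pooled_counts)
    props = [n / total for n in pooled_counts]
    scores, cum_before = [], 0.0
    for p in props:
        scores.append(2 * cum_before + p - 1)  # F_{k-1} + F_k - 1
        cum_before += p
    gaps = [b - a for a, b in zip(scores, scores[1:])]
    return scores, gaps

# Made-up pooled counts for a four-category ordinal outcome.
equal = [25, 25, 25, 25]
uneven = [10, 60, 20, 10]

print("equal prevalence gaps: ", implied_po_scores(equal)[1])
print("uneven prevalence gaps:", implied_po_scores(uneven)[1])
```

With equal prevalence every step on the scale is worth the same. With uneven prevalence, steps into or out of the common category carry the largest gaps, regardless of how clinically important those steps are.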

This means that inferences driven by proportional odds models are determined by the prevalence of each outcome, not by an explicit judgment of weight. In the REMAP-CAP platform trial for COVID-19, for example, Berry Consultants used proportional odds for organ support-free days. They later calculated the weights implicitly applied by the model and found that the resulting valuations roughly matched clinical logic for that scenario, which is why clinicians were comfortable with the results from the proportional odds model. That agreement, however, is neither guaranteed nor a feature designed by investigators.

Dichotomization is analytically straightforward but limiting. Collapsing all “good” or all “bad” states into binary groups not only strips the outcome’s granularity; it also imposes identical weights on disparate clinical outcomes within each group. For instance, dichotomizing mRS states 0, 1, and 2 as a success and 3, 4, 5, and 6 as a failure gives state 3 the same weight as state 6 (death), a weighting that likely no patient believes. Scott Berry noted that such dichotomous assignments bear no relationship to population or individual preferences.
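The dichotomized analysis can be written as a utility analysis with a 0/1 weight vector, which makes the implied valuation explicit. The sketch below uses made-up counts and hypothetical function names. It shows that the proportion of "good" outcomes equals the mean of those binary weights, and that moving every death (mRS 6) to moderate disability (mRS 3) leaves the result unchanged.

```python
# Dichotomization at mRS 0-2 versus 3-6, written as a weight per state 0..6.
DICHOTOMY_WEIGHTS = [1, 1, 1, 0, 0, 0, 0]

def mean_weight(counts, weights):
    """Mean weight over patients, given counts per mRS state."""
    return sum(n * w for n, w in zip(counts, weights)) / sum(counts)

def proportion_good(counts, cutoff=2):
    """Proportion of patients with mRS <= cutoff."""
    return sum(counts[: cutoff + 1]) / sum(counts)

# Made-up counts of patients in mRS 0..6.
observed = [5, 10, 14, 16, 22, 13, 20]
# Same patients, but all 20 deaths moved to mRS 3.
no_deaths = [5, 10, 14, 36, 22, 13, 0]

print(mean_weight(observed, DICHOTOMY_WEIGHTS), proportion_good(observed))
print(mean_weight(no_deaths, DICHOTOMY_WEIGHTS), proportion_good(no_deaths))
```

Both scenarios produce the same summary, so a treatment that prevented every death but left survivors at mRS 3 would show no benefit under this endpoint.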

Practical Consequences: Model Choices, Transparency, and Clinical Integrity

The main conclusion of these analytical perspectives is clear: weighting of ordinal endpoints is not optional; it is inherent in all statistical and methodological choices. Any claim to “assumption-free” analysis is technically unsupportable. Prevalence-based weights are imposed by proportional odds models, while dichotomization and rank-based models each enforce their own relative weights, value systems that are often arbitrary or undisclosed.

This reality puts a premium on transparency. Since every method imposes weights, shouldn’t their development be rigorous and transparent? Where explicit weighting is not used, the mathematical implications of the selected model should be isolated and presented for clinical review and regulatory scrutiny. In DAWN, for example, utilities were taken from patient and economic studies; in REMAP-CAP, the team shared the model-implied weighting structure with clinicians, facilitating informed acceptance (or critique) of the analytic choices.

When trialists ignore these issues, they allow default mathematical conventions to dictate the trial’s real-world meaning, at the risk of disconnecting results from patient and clinician values.

Conclusion

All analyses for ordinal clinical trial endpoints embed relative weights, whether or not these are identified in advance. Every model makes a judgment—by design or by default—on the relative value of each outcome. The empirically grounded, clinically relevant strategy is to confront this reality directly: assign, document, and disclose the weights applied, and make them subject to debate and clinical relevance rather than mathematical expediency. Transparent methodology strengthens the bridge between analytic rigor and meaningful, patient-centered interpretation.
