Should I use a Bayesian trial?

This week, we (Melanie Quintana, Roger Lewis, and I) have an article in JAMA entitled “Bayesian Analysis: Use of Prior Information in Clinical Trials”: https://jamanetwork.com/journals/jama/article-abstract/2658299

This article is part of the JAMA Guide to Statistics and Methods, a series of short articles with tutorials on several topics. The article provides a brief overview of Bayesian analysis.

We are often asked a pretty basic question: “When should I use a Bayesian trial?” While “Bayesian” formally means using Bayes’ theorem as a means for inference, there are many flavors.

One flavor incorporates non-informative priors and often approximates a frequentist analysis. These are generally not controversial (or at least no more controversial than a standard frequentist analysis, with all the issues it may have). The second flavor, with informative priors, can create large-scale controversies over the interpretation of trial results. Two people with different informative priors can draw very different conclusions on the basis of the same data. Of course, this also happens frequently with frequentist analyses, when different consumers reach different conclusions and incorporate the new information differently into their clinical practices.

There is a hybrid version as well. We often design trials where the adaptive features, such as an interim sample size choice, may be driven by Bayesian analysis with informative priors, be it a prior on the treatment effect or, more likely, a prior on how predictive interim data is of complete data. After the trial is complete, the final analysis uses either a non-informative prior or a frequentist analysis. We use Bayesian methods in this context because they are nicely suited to predicting forward; see, for example, a Goldilocks trial: https://www.ncbi.nlm.nih.gov/pubmed/24697532
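To make “predicting forward” concrete, here is a minimal sketch of a predictive probability calculation at an interim look. It is not the design from the Goldilocks paper: it assumes a hypothetical single-arm trial with a binary endpoint, a Beta prior (prior_a, prior_b) used only for the interim prediction, and a final analysis that uses a flat Beta(1, 1) prior and declares success when the posterior probability that the rate exceeds p0 is at least a threshold. All function names, parameters, and numbers are illustrative assumptions.

```python
from math import comb, exp, lgamma


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integer a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return lgamma(a) + lgamma(b) - lgamma(a + b)


def beta_binomial_pmf(y, m, a, b):
    """P(y successes among m future subjects) when the rate is Beta(a, b)."""
    return comb(m, y) * exp(log_beta(a + y, b + m - y) - log_beta(a, b))


def predictive_probability(x, n, n_max, p0, prior_a=1, prior_b=1, threshold=0.975):
    """Probability, given x successes in n interim subjects, that the trial
    succeeds at n_max subjects (hypothetical success rule, see lead-in)."""
    m = n_max - n                             # subjects still to be enrolled
    a, b = prior_a + x, prior_b + n - x       # interim posterior (may be informative)
    total = 0.0
    for y in range(m + 1):                    # every possible future outcome
        s = x + y                             # total successes at the final analysis
        # Final analysis assumes a flat Beta(1, 1) prior.
        prob_above_p0 = 1.0 - beta_cdf_int(p0, 1 + s, 1 + n_max - s)
        if prob_above_p0 >= threshold:
            total += beta_binomial_pmf(y, m, a, b)
    return total


strong_interim = predictive_probability(x=10, n=10, n_max=20, p0=0.5)
hopeless_interim = predictive_probability(x=0, n=10, n_max=20, p0=0.5)
```

An adaptive rule might stop enrollment when this number is very high or very low; the cutoffs for that decision are a design choice and are not specified here.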

We design a large number of Bayesian trials each year, and the large majority are either non-informative (most) or hybrid (the next largest set). Where we use informative priors in the final analysis, these are typically motivated by previous clinical data and mutually agreed upon prospectively with federal regulators (e.g. FDA).

The largest controversy lies where an investigator supplies sufficient information through the prior distribution that would lead to a conclusion that differs from that under a non-informative prior or a frequentist approach.  In other words, two investigators with different informative priors can reach different conclusions on the basis of the same data.

Clinical trials are about dispelling uncertainty. They are usually also about allocating a limited resource (patients, time, money). If the purpose of a clinical trial is to resolve a debate in the scientific community, presumably there are scientists on both sides of that controversy. Suppose in a comparative effectiveness trial that half of the scientific community prefers drug A in some setting and half prefers B. A small trial with informative priors does little to resolve that. People with opinion A before the trial and people with opinion B before the trial will be closer after the trial, but still far apart. In fact, if the two camps are sufficiently entrenched, we might need an excessive amount of information to move someone’s opinion. Someone very confident in advance that drug B is better is going to consider a trial in favor of drug A a fluke, even with p<0.025.
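A toy calculation shows how little a small trial moves two entrenched camps. The sketch below assumes a normal prior on the treatment difference (A minus B, in response rate) and a normal likelihood, so the posterior mean is a precision-weighted average. The prior means, standard deviations, and trial result are made-up numbers chosen only for illustration.

```python
def normal_posterior(prior_mean, prior_sd, estimate, std_error):
    """Conjugate normal update: returns (posterior mean, posterior sd)."""
    prior_precision = 1.0 / prior_sd**2
    data_precision = 1.0 / std_error**2
    post_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean + data_precision * estimate) / post_precision
    return post_mean, post_precision**-0.5


# Hypothetical camps: each is confident (sd 0.05) in a 0.20 advantage for its drug.
# Hypothetical small trial: observed difference of 0.10 favoring A, standard error 0.10.
camp_a_mean, camp_a_sd = normal_posterior(0.20, 0.05, estimate=0.10, std_error=0.10)
camp_b_mean, camp_b_sd = normal_posterior(-0.20, 0.05, estimate=0.10, std_error=0.10)

gap_before = 0.20 - (-0.20)
gap_after = camp_a_mean - camp_b_mean
print(f"Camp A posterior mean: {camp_a_mean:.2f}")
print(f"Camp B posterior mean: {camp_b_mean:.2f}")
print(f"Gap before: {gap_before:.2f}, gap after: {gap_after:.2f}")
```

With these numbers the camps move from 0.40 apart to 0.32 apart: closer, but camp B still believes drug B is better.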

Several of the replies to our article have involved this potential issue (usually referring to it as a “bias”). To the extent this concern refers to an author choosing a prior sufficiently at odds with others in the field that the results won’t be believed by the community, we agree. When choosing a prior, you should consider the prior beliefs of the people you need to convince, not your own. Only then will practice be changed by the results of the trial. If you are going to use an informative prior, you must be transparent about why that prior is appropriate. That transparency goes beyond simply stating “this is my prior”; it requires a discussion of where the prior came from, what studies inform it, what the relative merits of those studies are, etc. Care must be taken not to cherry-pick the information used to create a prior, for example by focusing only on a subgroup in a prior study and not the entirety of the data.

If a sufficient number of camps exist in the scientific community, all with differing views, the trial must have sufficient data to bring those camps together. Bayesian analysis presents no shortcuts in this situation.  I’ll leave unanswered the issue of people sufficiently irrational that no amount of data can convince them.

However, in many situations we have general agreement among the scientific community rooted in solid evidence from numerous RCTs, and the purpose of a new trial is to extend knowledge in a particular direction. In the example referenced in Melanie’s paper, that information involves use of hypothermia in infants 0-6 hours after birth. We know a lot more about Bayesian analysis than about hypothermia in infants, but our understanding is that in this situation there is not a deep divide in the scientific community. The purpose of the current article was to address a different question of what happens after 6 hours.

Being able to quantify this agreement (even to quantify it within bounds) and synthesize it with data from a new RCT can decrease the size of that trial. Would it be valuable to reconfirm the 0-6 hour result in another trial? Sure, but every subject allocated is a choice about where to place resources. Is it best to allocate infants to reconfirming the 0-6 hour result, to extending that result past 6 hours, or perhaps to another question entirely? Running a smaller, focused trial allows the subjects saved to enroll in another trial with a different therapy.

As noted in the article, even seemingly vague prior information is worth a fairly large number of subjects. Suppose you have an antibiotic and you are placing a prior distribution on the cure rate in a control arm, an arm that has been used in many clinical trials and in practice. It’s not unreasonable to be virtually sure the cure rate is above 50% for sensitive pathogens. How many subjects is that worth? If you think the real rate is near 75% and you have, say, 99% prior probability the rate is above 50%, that is worth about 20 subjects of information. If one were given a few clinical trials and practical data that resulted in 95% prior probability the rate was between 70% and 90%, that is worth about 60 subjects. With multiple clinical trials, a proper analysis can be worth a few hundred patients’ worth of information. (For the technically minded, I’m just looking at the effective sample size of Beta distributions that might obtain those characteristics for the earlier examples, and a more formal borrowing of control information for the latter example.)
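The effective sample size numbers above can be checked with a short sketch. It assumes the prior is a Beta(a, b) distribution with integer parameters, treats a + b as the effective sample size, and pins the prior mean at exactly 75% for the first example and 80% for the second; those specific means are my assumptions, since the text only says “near 75%” and “between 70% and 90%”.

```python
from math import comb


def beta_cdf_int(x, a, b):
    """CDF of Beta(a, b) at x for positive integer a and b.

    Uses the identity I_x(a, b) = P(Binomial(a + b - 1, x) >= a).
    """
    n = a + b - 1
    return sum(comb(n, j) * x**j * (1 - x) ** (n - j) for j in range(a, n + 1))


def smallest_ess_mean_75(lower=0.5, target=0.99, max_n=200):
    """Smallest a + b (in steps of 4, so that the mean is exactly 0.75)
    such that the Beta prior puts at least `target` probability above `lower`."""
    for n in range(4, max_n + 1, 4):
        a, b = 3 * n // 4, n // 4             # Beta(a, b) with mean a / (a + b) = 0.75
        if 1.0 - beta_cdf_int(lower, a, b) >= target:
            return n
    return None


# First example: mean 75%, 99% prior probability that the rate is above 50%.
ess_first = smallest_ess_mean_75()

# Second example: Beta(48, 12) has mean 80% and effective sample size 60.
mass_70_to_90 = beta_cdf_int(0.9, 48, 12) - beta_cdf_int(0.7, 48, 12)

print(f"Effective sample size, first example: {ess_first}")
print(f"Beta(48, 12) mass between 70% and 90%: {mass_70_to_90:.3f}")
```

Under these assumptions the search lands on Beta(15, 5), i.e. 20 subjects, and Beta(48, 12) puts close to 95% of its mass between 70% and 90%.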

What we definitely would like to see society avoid is excessive confirmation of the known when that is an implicit choice not to investigate the unknown, under the veil of a fear of “bias”. We’ve had a sequence of failed Alzheimer’s trials. Suppose there have been 20 clinical trials, each with 3000 patients (1500 per arm). That’s 60,000 patients, with 30,000 of them allocated to the control arm. That doesn’t make sense. Instead of allocating 30,000 patients to keep reconfirming how the control arm behaves, why not allocate 15,000, use informative priors for the control arms given the amount of current data available, and be able to investigate 10 more possible therapies? The question isn’t one of what is ideal, but rather one of understanding the opportunity cost of using these patients to reconfirm the well understood.
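The arithmetic behind that “10 more therapies” figure is simple enough to write down. The sketch assumes each additional therapy needs one 1,500-patient treatment arm and relies on the informative control prior instead of a fresh full-size control arm; that reading is my assumption about how the freed patients would be used.

```python
trials = 20
patients_per_trial = 3000
patients_per_arm = patients_per_trial // 2           # 1,500 per arm

total_patients = trials * patients_per_trial         # all patients enrolled
control_patients = trials * patients_per_arm         # patients on control arms

reduced_control_patients = 15_000                    # proposed control allocation
freed_patients = control_patients - reduced_control_patients

# Assumption: one 1,500-patient treatment arm per additional therapy.
extra_therapies = freed_patients // patients_per_arm

print(total_patients, control_patients, freed_patients, extra_therapies)
```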

We fully understand the objection that entry criteria, etc., can differ between trials and that you can get differences, but modern methods can incorporate that (if the trials are sufficiently different, the models recognize that and create less informative priors). We are even more concerned about priors based on cherry-picked subsets of prior trials. As with any other aspect of a study, you need to be transparent about where the prior came from.

An alternative is to use platform trials, which bring more structure and randomization to the process; see:

Woodcock and LaVange:  http://www.nejm.org/doi/full/10.1056/NEJMra1510062#t=article

Berry, Connor and Lewis:  https://jamanetwork.com/journals/jama/article-abstract/2210902

Saville and Berry:  https://www.ncbi.nlm.nih.gov/pubmed/26908536

These all contain a structure tailor-made for developing prior information and incorporating it for use with future subjects.

Back to the original question: “When should you run a Bayesian trial?” The answer is certainly not “always” or “never”. In terms of informative priors, the answer is still mixed. Using an informative prior will only provide efficiency (smaller trials, etc.) when the prior is generally accepted by the scientific community, which requires complete transparency.

As a final note, we always recommend quantifying the pros and cons of any method, including an informative prior. For any situation you can quantify the benefits of using an informative prior, typically a decrease in variance, against the potential risks, such as bias. Each situation is different: in some cases the benefits may outweigh the risks, and in others the risks may be greater. In a rare disease with limited data available, the variance reduction may often greatly outweigh any potential bias. In other situations, the risk of bias may be larger.
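One simple way to put numbers on that tradeoff is to compare mean squared error with and without the prior. The sketch below assumes a binary endpoint, a Beta(a, b) prior, and the posterior mean as the estimate, compared against the plain sample proportion. The trial size, the Beta(15, 5) prior, and the two “true” rates are hypothetical values chosen only to show both sides of the tradeoff.

```python
def mse_posterior_mean(n, p_true, a, b):
    """MSE of the posterior mean (a + x) / (a + b + n), x ~ Binomial(n, p_true)."""
    bias = (a - (a + b) * p_true) / (a + b + n)
    variance = n * p_true * (1 - p_true) / (a + b + n) ** 2
    return bias**2 + variance


def mse_sample_proportion(n, p_true):
    """MSE of the unbiased estimate x / n (all variance, no bias)."""
    return p_true * (1 - p_true) / n


n = 40               # hypothetical trial size
a, b = 15, 5         # hypothetical informative prior with mean 0.75

# Prior centered on the truth: variance drops and there is no bias.
prior_right = (mse_posterior_mean(n, 0.75, a, b), mse_sample_proportion(n, 0.75))

# Prior badly off (truth is 0.50): the bias outweighs the variance reduction.
prior_wrong = (mse_posterior_mean(n, 0.50, a, b), mse_sample_proportion(n, 0.50))

print("truth 0.75 (with prior, without):", prior_right)
print("truth 0.50 (with prior, without):", prior_wrong)
```

Sweeping p_true over a plausible range shows where the prior stops helping, which is the kind of quantification worth doing before committing to a design.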