The Role of Biomarkers in Clinical Trials – and What We Learned from Baseball

For over a century, teams selected baseball players based on subjective evaluations of their build, looks, on-field performance, and reputation.

Then, mathematicians got involved. Specifically, statisticians used advanced statistical analyses to identify undervalued players who had great potential. This geeky approach to baseball is known as Moneyball and was popularized by the Oakland A’s, a small-market team with a limited budget that used it with spectacular success. The name comes from the 2003 book Moneyball: The Art of Winning an Unfair Game by Michael Lewis.

Given the success of this approach, we asked ourselves: “Can a similar approach improve decision-making in drug development?”

We explored this question in our Moneyball in Drug Development webinar, sponsored by Intelligencia AI. We focused on identifying opportunities created by trade-offs and leveraging data insights to help drug developers make better decisions in clinical trial design. You can tune in to the full webinar here. In this blog post, we summarize the core highlights, the ones that would make it onto the drug development version of SportsCenter.

The Question: When Should Biomarker Data Be Used as a Criterion in Clinical Trials? 

The study we present tries to answer an important question: when should a biomarker be used to select the patients who are most likely to respond to a drug, a process called enriching the clinical trial? The challenge is to find the best course of action given the trade-offs that enrichment requires. Imagine planning a clinical trial and deciding between:

  1. Enriching a trial, which means increasing its probability of technical and regulatory success (PTRS) but potentially limiting the population of patients for which the drug will be approved, or
  2. Not enriching a trial, which means addressing a larger patient population but accepting a lower PTRS, and with it a greater chance that the trial fails, leaving a potentially effective treatment unapproved.
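
A back-of-the-envelope comparison makes the trade-off concrete. The sketch below is illustrative only: the PTRS values, the population sizes, and the `expected_patients_reached` helper are hypothetical and are not taken from the analysis.

```python
def expected_patients_reached(ptrs: float, eligible_patients: int) -> float:
    """Expected number of patients covered by an approval: PTRS x eligible population."""
    if not 0.0 <= ptrs <= 1.0:
        raise ValueError("ptrs must be a probability between 0 and 1")
    return ptrs * eligible_patients


# Hypothetical inputs, chosen only to illustrate the trade-off.
enriched_population = 30_000   # biomarker-positive patients only
full_population = 60_000       # all patients with the indication

enriched = expected_patients_reached(0.40, enriched_population)
unenriched = expected_patients_reached(0.25, full_population)

# PTRS the enriched trial would need to match the unenriched option.
break_even_ptrs = unenriched / enriched_population

print(f"enriched: {enriched:.0f}, unenriched: {unenriched:.0f}")
print(f"break-even PTRS for the enriched trial: {break_even_ptrs:.2f}")
```

With these made-up numbers, the larger population outweighs the lower PTRS, and enrichment pays off only if it lifts PTRS above the break-even value. A real decision would weigh many more factors than this single expected value.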

The analysis set out to answer two specific questions: how the PTRS changes by indication when a trial is enriched with biomarkers, and how prior validation of the target (a drug against this target has already been approved) affects the PTRS in a biomarker-enriched trial.

Statistical Concepts in Brief

Cutting-edge statistical models and concepts underlie this analysis. The two most important approaches are (1) causal inference and (2) hierarchical Bayesian modeling.

What is Causal Inference? 

Causal inference is a set of methods for estimating true cause-and-effect relationships between variables, that is, whether a change in one variable directly causes a change in another, while removing the statistical bias that can obscure them. Traditional statistical inference identifies correlations: it can indicate that two variables are related, but it can’t confirm that one causes the other. Causal inference uses specialized methodologies to distinguish true causal relationships from mere correlations and so produce unbiased estimates.
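
To illustrate the difference between a correlation and a causal estimate, here is a minimal sketch of one simple causal-inference technique, stratification on a confounder. The counts are hypothetical, the stratum labels stand in for any variable that influences both the decision to enrich and the trial outcome, and the adjustment is valid only if that variable is the sole confounder. The source does not say which methods the analysis itself used.

```python
# Hypothetical (trials, successes) counts by confounder stratum and by
# whether the trial was biomarker-enriched. Not data from the analysis.
counts = {
    "stratum_a": {"enriched": (80, 40), "not_enriched": (20, 9)},
    "stratum_b": {"enriched": (20, 4), "not_enriched": (80, 12)},
}


def success_rate(trials: int, successes: int) -> float:
    return successes / trials


def naive_difference(data) -> float:
    """Pool all strata and compare raw success rates (a correlation)."""
    totals = {"enriched": [0, 0], "not_enriched": [0, 0]}
    for arms in data.values():
        for arm, (trials, successes) in arms.items():
            totals[arm][0] += trials
            totals[arm][1] += successes
    return success_rate(*totals["enriched"]) - success_rate(*totals["not_enriched"])


def adjusted_difference(data) -> float:
    """Compare within each stratum, then average weighted by stratum size."""
    total_trials = sum(t for arms in data.values() for t, _ in arms.values())
    effect = 0.0
    for arms in data.values():
        stratum_trials = sum(t for t, _ in arms.values())
        diff = success_rate(*arms["enriched"]) - success_rate(*arms["not_enriched"])
        effect += diff * stratum_trials / total_trials
    return effect


naive = naive_difference(counts)
adjusted = adjusted_difference(counts)
print(f"naive difference: {naive:.2f}, adjusted difference: {adjusted:.2f}")
```

In this made-up data set, enriched trials are concentrated in the stratum that succeeds more often anyway, so the pooled comparison suggests a much larger biomarker effect than the within-stratum comparison does.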

What is Hierarchical Bayesian Modeling?

Bayesian modeling updates the probability of a hypothesis as new evidence becomes available, and hierarchical Bayesian modeling applies this idea to data organized in groups. In general, these models are particularly useful for dealing with uncertainty and combining diverse sources of information. They are frequently used to inform decisions made under uncertainty, such as in risk assessment, and for inference, i.e., understanding relationships in data.

The specific model employed in our analysis is called a hierarchical Bayesian logistic regression model; it combines dataset-wide patterns with subgroup-specific variations. This model is ideal for analyzing grouped or nested data, such as clinical trial data grouped by indication (for example, cancer type), and excels at estimating the differences between groups while sharing information across groups. This model allowed us to accurately estimate how biomarkers affect trial outcomes while accounting for global and group-level properties.
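
For readers who want to see the structure, a generic hierarchical Bayesian logistic regression can be written as follows. This is a textbook form under assumed notation, not the exact specification used in the analysis: $y_i$ is 1 if trial $i$ succeeds, $b_i$ is 1 if the trial is biomarker-enriched, $j[i]$ is the indication of trial $i$, and the prior scales $s$ and $\tau$ are placeholders.

```latex
\begin{aligned}
y_i &\sim \mathrm{Bernoulli}(p_i), \\
\operatorname{logit}(p_i) &= \alpha_{j[i]} + \beta_{j[i]}\, b_i, \\
\alpha_j &\sim \mathcal{N}(\mu_\alpha, \sigma_\alpha^2), \qquad
\beta_j \sim \mathcal{N}(\mu_\beta, \sigma_\beta^2), \\
\mu_\alpha, \mu_\beta &\sim \mathcal{N}(0, s^2), \qquad
\sigma_\alpha, \sigma_\beta \sim \mathrm{HalfNormal}(\tau).
\end{aligned}
```

The shared hyperparameters $\mu_\beta$ and $\sigma_\beta$ are what let indications with few trials borrow information from the rest of the data set, while each $\beta_j$ still captures an indication-specific biomarker effect.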

Modeling alone cannot yield valuable, novel results without robust, good-quality data. For this analysis, we used the Intelligencia AI database, which contains every piece of publicly available data in oncology published since 2000, including data on drugs and clinical trials from sources like clinicaltrials.gov, scientific papers, conference presentations, etc.

What’s the Impact of Biomarker Selection on POS?

The first question we tried to answer was what role the indication plays in the probability of success (POS) in biomarker-enriched trials. Figure 1 shows the results: the vertical axis lists the indications in the data set, and the horizontal axis shows the number of percentage points by which the POS of a phase 2 trial increased on average when the trial was enriched with a biomarker. The big winners are prostate cancer and leukemia, which see increases of nine and eleven percentage points of POS, respectively. Phase 2 has an average success rate of about 28%; with biomarker enrichment providing an average boost of 2-3 percentage points, the average POS for phase 2 trials with biomarkers comes to approximately 30%.

Figure 1. Impact of biomarkers on phase 2 clinical trials for different indications in oncology
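
The uplifts in this section are expressed in percentage points, which are absolute changes in POS, not relative changes. A short sketch shows why the distinction matters, using the approximate 28% phase 2 baseline quoted above; the helper function names are ours and purely illustrative.

```python
def add_percentage_points(baseline: float, uplift_points: float) -> float:
    """Absolute change: add percentage points to a probability."""
    return baseline + uplift_points / 100


def apply_relative_percent(baseline: float, uplift_percent: float) -> float:
    """Relative change: scale a probability by a percentage of itself."""
    return baseline * (1 + uplift_percent / 100)


baseline_pos = 0.28  # approximate average phase 2 POS quoted in the text

for uplift in (2, 3):
    absolute = add_percentage_points(baseline_pos, uplift)
    relative = apply_relative_percent(baseline_pos, uplift)
    print(f"uplift {uplift}: absolute {absolute:.1%}, relative {relative:.1%}")
```

Read as percentage points, a boost of 2 to 3 takes the baseline to roughly 30 to 31%; read as a relative change, the same numbers would barely move it.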

The boost from biomarkers in phase 3 clinical trials is also significant, with prostate, ovarian and pancreatic cancer experiencing the biggest POS gains.

Based on this data, we developed the Opportunity Matrix below, which plots the percentage of trials that use biomarkers (propensity) on the horizontal axis against the percentage points by which biomarkers increase phase 2 POS on the vertical axis. The upper left quadrant, called New Opportunities, shows indications that could benefit from biomarkers but where biomarkers are rarely used.

Opportunity matrix plotting biomarker propensity against phase 2 POS increase in biomarker-enriched clinical trials.
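
The quadrant logic behind such a matrix can be sketched in a few lines. Everything specific here is an assumption for illustration: the indications, their values, and the cut-offs are hypothetical, and only the New Opportunities name comes from the text; the other three labels are placeholders.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Indication:
    name: str
    propensity: float  # share of trials using biomarkers, 0 to 1
    pos_uplift: float  # phase 2 POS increase, in percentage points


def quadrant(ind: Indication, propensity_cut: float, uplift_cut: float) -> str:
    """Place an indication in one of the four quadrants of the matrix."""
    high_uplift = ind.pos_uplift >= uplift_cut
    high_propensity = ind.propensity >= propensity_cut
    if high_uplift and not high_propensity:
        return "New Opportunities"        # name used in the text
    if high_uplift:
        return "high uplift, high usage"  # placeholder label
    if high_propensity:
        return "low uplift, high usage"   # placeholder label
    return "low uplift, low usage"        # placeholder label


# Hypothetical indications and cut-offs, for illustration only.
examples = [
    Indication("indication_a", propensity=0.10, pos_uplift=9.0),
    Indication("indication_b", propensity=0.60, pos_uplift=8.0),
    Indication("indication_c", propensity=0.55, pos_uplift=1.0),
    Indication("indication_d", propensity=0.05, pos_uplift=0.5),
]

labels = {
    ind.name: quadrant(ind, propensity_cut=0.30, uplift_cut=5.0)
    for ind in examples
}
for name, label in labels.items():
    print(f"{name}: {label}")
```

Where to draw the cut-offs is a judgment call; medians across indications are one plausible choice, but the text does not state how the quadrants were defined.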

The second hypothesis we explored assessed how well biomarkers work at boosting phase 2 POS if the target is validated. We defined a validated target as one against which a drug has already been approved.

Interestingly, we observed a stark bifurcation: non-validated targets see most of the improvement from biomarker enrichment, while validated targets see hardly any improvement at all. This is because validated targets often succeed without needing enrichment from biomarkers, whereas non-validated targets may rely more on biomarkers to demonstrate effectiveness.

More novel insights might result from continued in-depth analysis of the data. We plan on exploring the impact of trial size and basket/umbrella trials on the probability of success.

Conclusion

Our analysis aims to help drug developers make more informed decisions when designing clinical trials by weighing trade-offs and leveraging relationships discovered in the data.

The causal inference approach paired with hierarchical Bayesian logistic regression models can lead to meaningful improvements in drug development, similar to how statistical analysis transformed baseball.

Our work also highlights the importance of collaboration between large pharmaceutical companies and novel data and AI solution partners. Intelligencia AI’s data and our combined expertise at GSK allow us to address challenges in drug development that the pharmaceutical industry has struggled with for decades.

If this summary piqued your interest, please watch the webinar and contact us with further questions.

About the Author

Gary Summers, Director, Oncology Portfolio Decision Sciences, GSK


Gary Summers is a Director of Oncology Portfolio Decision Sciences at GSK, where he applies decision analysis to drug development. Complementing this work, Dr. Summers studies how uncertainty impacts decision-making. His paper, “Friction and Decision Rules in Portfolio Decision Analysis,” won the 2021 Clemen-Kleinmuntz Best Paper Award. His computer game design that analyzes decision-making received US patents 7,349,838, 6,408,263 and 6,236,955. Dr. Summers has an MS and PhD in Management Science from Northwestern University. Connect with Gary on LinkedIn.