The Center for Data Innovation spoke with Iya Khalil, co-founder of GNS Healthcare, a health data analytics company based in Cambridge, Massachusetts. Khalil discussed the future of AI in healthcare and the value of causal machine learning.
This interview has been lightly edited.
Joshua New: You founded GNS Healthcare in 2000, well before the healthcare sector seemingly became so focused on data-driven healthcare and precision medicine in the last decade or so. How has your work, and the field more broadly, changed since then?
Iya Khalil: When Colin Hill and I founded GNS Healthcare in 2000, there was no one talking about artificial intelligence or how it might be leveraged to drive precision medicine. Electronic health records were not widely used and supercomputers were slow and clunky. But it was also a time when the human genome was being mapped, which really sparked the idea for GNS. How could we leverage such granular genetic data to start to unravel disease and human biology?
There have been a lot of changes in the industry since we first began. The increasing volume and variety of available data is unprecedented. The world is creating two and a half million terabytes of data every day, and nearly 30 percent of that is being generated by the healthcare industry thanks to the explosion of EHRs, digital imaging, natural language processing, genetic data, and connected medical devices. We are on the cusp of making precision medicine a reality.
The power and potential of AI have come a long way, and it is being recognized as a technology that will have real-world impact. The U.S. Food and Drug Administration (FDA) offered its vote of confidence by encouraging the use of AI and other digital tools in medicine and drug development in the 21st Century Cures Act signed into law in 2016. It was designed in part to accelerate drug development and includes the expansion of drug labels through the use of analytics and AI to generate real-world evidence from observational data, without a new clinical trial.
GNS uses a unique and incredibly powerful type of AI called causal machine learning. It models complex human disease in computers, and then runs simulations to identify which interventions cause which outcomes. And perhaps most importantly, it answers the underlying questions of why. We work with biopharma companies, health plans and foundations to drive precision medicine by matching the right treatment to the right patient at the right time.
New: The utility of AI in healthcare has increased dramatically in recent years. How do you see this trend progressing in the future? Is there an upper limit on the kinds of work you think AI can do in this space?
Khalil: It’s important to understand that AI covers a spectrum of technology, from the very simple, like statistical predictive analytics and image recognition, to the more complex, like deep learning and causal machine learning.
Causal machine learning is an extremely powerful type of AI. It doesn’t just find patterns in data, which is what a lot of traditional methods like deep learning do, but it actually uses the data as fuel to reconstruct the underlying mechanisms of the system that created the data in the first place. Once these mechanisms are identified, we can ask “what if?” questions, like what if one drug was used versus another or which is the best intervention for this patient.
This type of technology is crucial to matching the right treatment to the right patient at the right time versus treating patients as if they were some hypothetical “average patient.” None of us are average. The goal is to cure disease, slow progression and save billions of dollars in unnecessary healthcare spending.
Let me share a couple of recent results from our causal machine learning platform. We recently published the results of a joint effort with the Multiple Myeloma Research Foundation where we discovered a biomarker that identifies which multiple myeloma patients are likely to benefit from stem cell transplantation. Stem cell transplants are painful and expensive, so understanding who will respond is game-changing.
We have also been working with the Alliance for Clinical Trials in Oncology, a clinical trials network sponsored by the National Cancer Institute, and were able to discover the role that tumor location plays as a driver of overall survival in patients with metastatic colorectal cancer. Clinicians can use this insight to choose the right treatment earlier in the disease.
I think the potential of AI is limitless. Human disease and biology are complex, but causal machine learning is allowing us to understand them in ways we haven’t before. We now have the ability to discover more insights about disease, unravel human biology, and make a huge impact on the health of patients, and we are only getting started.
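The distinction Khalil draws between finding patterns and asking "what if?" questions can be illustrated with a toy simulation. The Python sketch below is not GNS Healthcare's REFS platform or its method; every name and probability in it is a hypothetical chosen for illustration. It hard-codes a small cause-and-effect mechanism in which sicker patients are treated more often, then compares the pattern seen in observational data with the result of simulating the intervention directly. A causal machine learning system would have to learn such a mechanism from data rather than being handed it.

```python
import random


def simulate(n, seed, force_treatment=None):
    """Draw n patients from a toy, hand-written causal mechanism (assumed).

    severity  -> treatment : sicker patients are treated more often
    severity  -> recovery  : sicker patients recover less often
    treatment -> recovery  : the drug truly adds 0.20 to recovery probability
    """
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        severe = rng.random() < 0.5
        if force_treatment is None:
            # Observational world: doctors choose based on severity.
            treated = rng.random() < (0.8 if severe else 0.2)
        else:
            # "What if?" world: the treatment is set by intervention.
            treated = force_treatment
        p_recover = (0.3 if severe else 0.7) + (0.2 if treated else 0.0)
        recovered = rng.random() < p_recover
        rows.append((severe, treated, recovered))
    return rows


def recovery_rate(rows, treated):
    """Fraction of patients who recovered within one treatment group."""
    group = [r for (_, t, r) in rows if t == treated]
    return sum(group) / len(group)


N = 100_000

# Pattern-finding view: compare treated and untreated patients as observed.
observed = simulate(N, seed=0)
naive_effect = recovery_rate(observed, True) - recovery_rate(observed, False)

# Causal view: simulate giving everyone the drug versus giving it to no one.
treat_all = simulate(N, seed=1, force_treatment=True)
treat_none = simulate(N, seed=2, force_treatment=False)
causal_effect = recovery_rate(treat_all, True) - recovery_rate(treat_none, False)

print(f"observed difference: {naive_effect:+.3f}")
print(f"what-if difference:  {causal_effect:+.3f}")
```

In this toy setup the observed difference comes out slightly negative, because treated patients are mostly the sicker ones, while the simulated intervention recovers a benefit close to the 0.20 that was built into the mechanism.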
New: Can you explain what GNS’ REFS platform is? How does this differ from decision support systems for doctors such as IBM’s Watson?
Khalil: The true differentiation of the REFS causal machine learning platform is its ability to take massive, diverse data sets and turn them into interactive models that explain the cause and effect relationships between data without bias. Based on the award-winning mathematics of Judea Pearl, REFS creates transparent machine learning models that explain the “why” behind an outcome, and let users ask the models questions about the effects of future actions. It is data agnostic, meaning that it can use just about any type of data to create models, including genetic, genomic, proteomic, electronic health record, claims, consumer, lab, prescription, mobile health, sociodemographic, and more.
For instance, REFS could build a model that allows a biopharma company or a physician to understand what the difference in health outcomes would be for a patient if they were given a first line treatment, a second line treatment, or some sort of combination of treatments.
IBM Watson mines the available known knowledge (things that have already been discovered, studied, and published in literature or in other records) to create predictive recommendations. REFS actually begins to automate the scientific method by reverse engineering how the system, biological or otherwise, works and discovering novel insights or “unknown unknowns.” This is critical in healthcare because biology and disease are extremely complex, many questions are still unanswered, and new discoveries are being made daily. We need to continue to move forward in discovering novel insights to drive us closer to precision medicine.
New: GNS Healthcare offers solutions targeting the discovery of biomarkers and subpopulations that respond to specific drugs. What about your technology lends itself so well to these types of insights?
Khalil: More than eight out of ten drug clinical trials end in failure. Failure in the clinical trial process can be the result of simply not having the right drug to mitigate a disease. However, unsuccessful trial results more often stem from failing to properly identify the patients who will benefit from the drug. Traditionally, most clinical trials are designed to simply determine which patients respond and which do not respond to a certain drug. But trials need to go beyond simply testing for response. They need to be designed in a way that enables researchers to learn who is likely to respond and, more importantly, why. In other words, to understand the biological and physiological mechanisms driving response.
If we can set up phase 1 and 2 studies as learning trials by collecting sufficient genomic, proteomic, and other molecular data, along with granular surrogate digital health readouts, then we can apply causal machine learning to identify the mechanisms driving response. Understanding the mechanisms means that researchers can select patients accordingly and design more successful phase 3 trials.
It takes a combination of collecting the right data and building cause and effect models that go beyond predicting likely response to treatment. Researchers need not only to identify response rates but also to understand them, which means shifting from a prediction model to a cause and effect model that determines why a specific population group is responding. Unfortunately, with traditional statistical methods, identifying relevant biomarkers is likely to lead to many false positives, increasing the time and expense of discovering the subpopulation that is likely to respond.
Our platform eliminates the time-consuming current methods of biomarker selection, allowing scientists to explore all potential biomarkers and select the most relevant causal biomarkers for a trial. Users can run a complete trial dataset and quickly identify predictive and causal biomarkers to select patients with a high probability of treatment benefit in a follow-up trial. It also prioritizes prognostic biomarkers that could identify disease drivers that, when perturbed, show a change in the outcome of interest.
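The false-positive problem Khalil attributes to traditional statistical screening can be seen in a small simulation. The Python sketch below is an illustration under stated assumptions, not a description of how REFS selects biomarkers: it invents 1,000 hypothetical candidate biomarkers that have no relationship to response at all, tests each one with a simple two-sample z-test, and counts how many pass the usual 0.05 threshold by chance. The Bonferroni correction shown alongside is a standard textbook remedy included only for contrast.

```python
import random
from statistics import NormalDist, mean


def null_p_value(rng, n_per_group=50):
    """Two-sided z-test p-value for one biomarker with NO true effect.

    Responders and non-responders are drawn from the same N(0, 1)
    distribution, so any "significant" difference is a false positive.
    """
    responders = [rng.gauss(0, 1) for _ in range(n_per_group)]
    non_responders = [rng.gauss(0, 1) for _ in range(n_per_group)]
    # Standard error of a difference in means when the variance is known to be 1.
    standard_error = (2 / n_per_group) ** 0.5
    z = (mean(responders) - mean(non_responders)) / standard_error
    return 2 * (1 - NormalDist().cdf(abs(z)))


rng = random.Random(0)
N_BIOMARKERS = 1000  # hypothetical number of candidates screened
ALPHA = 0.05

p_values = [null_p_value(rng) for _ in range(N_BIOMARKERS)]

# Testing each biomarker on its own at 0.05 lets chance findings through.
naive_hits = sum(p < ALPHA for p in p_values)

# Bonferroni divides the threshold by the number of tests performed.
bonferroni_hits = sum(p < ALPHA / N_BIOMARKERS for p in p_values)

print(f"flagged at p < {ALPHA}: {naive_hits} of {N_BIOMARKERS}")
print(f"flagged after Bonferroni correction: {bonferroni_hits}")
```

With no real signal in the data, roughly 5 percent of the candidates are still expected to be flagged by the uncorrected test, and each of those would cost time and money to follow up in a real trial.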
New: GNS Healthcare claims to be able to run in silico clinical trials that show how drugs work in the real world. What kind of data goes into this prediction?
Khalil: Biopharma is looking to make clinical trials much more flexible and adaptive. This is in part due to a few changes made at the federal level that allow biopharma companies to incorporate real-world data in ways that were previously impossible. AI and machine learning can add real value when bringing a drug to market by conducting in silico trials, in other words, trials that are conducted completely within the confines of a computer.
But in order to do this effectively and in a way that stands up to the scientific rigor that is expected of traditional clinical trials, biopharma needs an AI platform that helps explain the cause and effect relationships between the data and provides transparency into what is happening and why.
Our platform, REFS, can leverage a near-infinite amount of clinical trial data and real-world data to create in silico trials that enable researchers to explore disease and drug mechanisms, identify subpopulations of patients through biomarker discovery, and evaluate treatment effects and responses. The way the causal models are built allows for a greater number and wider variety of experiments to be conducted, including head-to-head comparison studies, tests of how a drug will perform in an out-of-sample patient cohort, and exploration of possible new drug indications. By adopting this approach, biopharma can incorporate new patient data and learn continuously to adjust the trial, improving the probability of success while reducing the total cost of the drug’s development.