The Center for Data Innovation spoke with Nigam Shah, associate professor of medicine (biomedical informatics) and assistant director of the Center for Biomedical Informatics Research at Stanford University. Shah discussed when interpretability is and is not important for algorithms and how data can advance the care of patients.
This interview has been edited.
Michael McLaughlin: Many people talk about how opening up the “black box” of some algorithms is particularly important in medicine. But you have also posed the question “Is interpretability overrated?” Do you agree with people who say that opening up the black box is important in medicine?
Nigam Shah: I’d say the debates tend to be about the “why” and “how” of opening black-box algorithms. In my view, the question we should be asking is about when. For example, when is a black-box model okay to use, and when is an interpretable model required? The general principle I use is that if the underlying anatomy of the model is going to inform what actions I take, then the model has to be interpretable. If the action I will take is de-coupled from the guts of the model, then a black-box model is okay as long as it can be proven that its predictions are correct. Weather prediction offers a great analogy. Do you care how the prediction of rain is made as long as it is correct enough to base the decision of whether to carry an umbrella or not? The engineer building the model certainly needs to know how the model made the prediction it did, but as a user, I only care if it is correct; and if it is, I can use it to guide a decision to carry an umbrella.
So yes, there are definitely situations where opening up a black-box model is important; but at the same time, there are many situations where a black-box model is useful if we keep a human in the loop to take actions.
McLaughlin: You’ve done research showing that social media can provide early clues that patients are having adverse reactions to drugs. Can you explain this research? What are the implications for how we evaluate drug safety?
Shah: This work, done in partnership with the team at Inspire, a social network that connects caregivers and patients, and colleagues in our dermatology department, is about listening to the patient’s voice. I would not view this as a drug safety effort. The core idea in the work is to analyze anonymized patient-generated content and look for drug and symptom mentions that co-occur more often than we would expect given the rate at which just the drug and just the symptom get talked about. Taking such a patient-view allows us to bridge medical disciplines. For example, in the work you mention, the side effect we picked up was for a cancer drug. If the side effect is skin-related, the information goes to the dermatologist; the oncologist never hears about the issue even though it is the same patient being cared for by two specialists.
So I view the value of analyzing patient-generated content as a means to get a view into the lived experience of the patient, which is not drug-centric, or disease-centric. It is inherently personal and human-centric.
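The co-occurrence idea Shah describes can be sketched with a simple lift calculation: a (drug, symptom) pair is interesting when it appears together in posts more often than the individual mention rates would predict under independence. This is an illustrative sketch, not the study’s actual method, and all counts below are hypothetical.

```python
def cooccurrence_lift(n_posts, n_drug, n_symptom, n_both):
    """Ratio of observed co-mentions to the count expected if the
    drug and symptom were mentioned independently."""
    # Expected co-mentions under independence:
    # P(drug) * P(symptom) * total posts
    expected = (n_drug / n_posts) * (n_symptom / n_posts) * n_posts
    return n_both / expected

# Hypothetical counts: 100,000 posts; the drug mentioned in 2,000,
# the symptom in 5,000, and both together in 400.
lift = cooccurrence_lift(100_000, 2_000, 5_000, 400)
print(round(lift, 1))  # 400 observed vs. 100 expected -> lift of 4.0
```

A lift well above 1 flags a pair worth a closer look; in practice, disproportionality methods also account for small counts and multiple comparisons.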
McLaughlin: Can you explain your idea for a “green button” that helps clinicians find patients with similar cases by searching large datasets?
Shah: The idea is to use all the information at our disposal to make a better care decision. When faced with an uncertain course of action, clinicians typically tap their colleagues seeking their opinion. Why not seek an opinion from the collective experience of all their colleagues? That is the core idea. We want to learn from every past patient to make the care of the next person better.
McLaughlin: What kinds of obstacles make the implementation of this difficult?
Shah: There are lots of challenges. You might be surprised that the idea of learning from a library of similar patients dates back to the 1970s. As the decades went by, the lack of data in electronic form, the lack of suitable computing power, and the need for advances in statistical learning were all resolved in a nice convergence. Currently, I’d say the biggest obstacle is making such learning a priority by having the right incentives, so that learning from past patients is not a ‘nice to have’ but a ‘must have.’
As a patient, would you not want your care to be informed by the collective experience of past patients? Would you not want your de-identified data to be used to benefit the care of others? Imagine if just like we donate blood and donate organs, we started donating data. I realize that given the current social milieu, we might not want the tech companies to further milk our data, but this is where universities can play a pivotal role.
McLaughlin: You developed a tool that can predict the likelihood that a patient will die within 12 months. What kinds of factors does this consider? How is this useful in a healthcare context?
Shah: A tool is a means to an end, and the end goal is to improve the quality of care. Experts in palliative care can help a patient more the sooner they see the patient. However, the median time to referral to their specialty remains agonizingly close to the end. We wait too long before seeking their help. So the fact that we predict mortality is a side point. We needed a good surrogate outcome that can stand in for the question “Will this patient benefit from advance care planning?” We don’t have such training data, so we use death as a surrogate, with the belief that whoever is at risk of dying is likely to benefit from advance care planning and palliative care.
When viewed with that goal in mind, the factors the model considers are a bit irrelevant. What we care about is getting the prediction right; because if we do, then we can help someone before it is too late.