The Center for Data Innovation spoke to Matteo Carli, chief technology officer and founder of xbird, a Berlin-based startup that uses sensor data to help doctors and patients manage conditions like diabetes. Carli discussed the importance of contextual information to physicians treating diabetic patients, and the surprising links between activity data and a variety of medical problems.
Nick Wallace: How did you come to found xbird? What led you to the field of medical AI?
Matteo Carli: That’s a long story. We have three founders. One colleague of mine worked together with me in a different company—xbird is the second company we founded together. The first company was about mobile advertising. But it was mobile advertising built towards data science, machine learning, and artificial intelligence—it was a very data-driven business. It was quite successful and we got acquired, so it was a “successful exit.”
We managed to create very precise profiles of people by just looking at the way they were using their mobile phones. What we were doing, in a nutshell, was targeted advertising. We were able to create profiles of people, and create different “buckets” to put those people in, and our clients were able to target specific “buckets.” Just to give you a concrete example, let’s say a car maker wanted to target a businessman of 45-55 years old, with a high income, and so on. We were able to do that by looking at what kind of behavior you have with your phone, what kind of apps you have installed, whether you show up in airports on weekdays—then you’d be flagged as a business traveller, and more likely to be in a certain bracket in terms of income, a certain bracket in terms of age, and so on.
It was cool, it was an interesting use of technology, and as I said, we were quite successful, but we found ourselves in the position of saying, “this is a bit shallow.” You get all this awesome technology to increase the return on investment of an app by 1.2 percent, which in some ways is awesome—you get €1,000,000 from Coca-Cola, or a big brand like that—but you don’t have any really meaningful impact on society. So when we exited that business, we took a few months of vacation to regroup, and then I sat down with my colleague and my friend, and we talked about what we could use our technology for in order to have a bigger impact and to do something more meaningful.
One of the first things we thought of was healthcare. We saw this big amount of data that you can collect from a smartphone, and then we started looking into wearables, and it was even more data, and even more medically relevant data. There were a lot of studies that clearly documented that behavior and environment can directly influence your health. There are entire groups of conditions that are 100 percent related to your environment and your behavior. And that discussion went along with my family history: both my parents died of preventable diseases. So I was saying, “OK, is there any way to recognize and prevent diseases by looking at how your behavior changes through time, or how your environment influences you through time?” That was the starting point.
Wallace: What kinds of data does xbird use, and how does this help doctors and patients manage health conditions?
Carli: Right now we are only using Apple devices: iPhone and Apple Watch, and we extract millions of data points from their sensors—the sensors being the GPS, the accelerometer, the gyroscope, and the heart monitor in the Apple Watch. We combine that data to create what we call the continuous activity monitor, which translates this data into activities and health events. This means we can see if you are walking or tango dancing or ice skating, or if you’ve had food or not, how long you slept, how much time you spent at home, how much time you spent commuting—and how you commuted, whether it was by car or by bike or by train. This all happens without any input from the patient. You don’t have to log any information in a diary, everything happens automatically. We collect the data from the phone and run it through our machine learning algorithms, which automatically create a list of events. Then we relate that information to specific health conditions.
We started with diabetes because it’s very much a “data-driven” disease. If you have diabetes, you need to monitor your blood glucose. You do that either by pricking your finger and testing the blood, or you can continuously measure your blood glucose using a Continuous Glucose Monitor (CGM), which is a patch that feeds information to a device, which could even be your smartphone. CGMs have existed for about ten years, and they provide a lot more information than traditional finger-pricking—because while you might do that three to five times a day, the CGM records information every five minutes.
Right now, a diabetic patient will go to a doctor every three to six months, with a 50-page report with blood glucose measured every five minutes. But the doctor is not a statistician: there’s an interpretation problem, and there’s a problem in the fact the doctor has about ten minutes to see the patient. So the amount of data has increased a lot, but the time in which to process it hasn’t. And although that information is extremely relevant, there isn’t any context to it. So if you have a certain level of blood glucose at any given time, it could mean one thing or another depending on what you were doing at that time.
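To put rough numbers on that data volume, here is a minimal arithmetic sketch. It uses only the figures quoted above (one CGM reading every five minutes, three to five finger pricks a day) and assumes 90 and 180 days as stand-ins for "three to six months"; the function names are illustrative, not part of any xbird software.

```python
# Rough data-volume arithmetic based on the figures quoted in the interview.
# Assumption: 90 and 180 days stand in for "three to six months".
MINUTES_PER_DAY = 24 * 60
CGM_INTERVAL_MINUTES = 5


def cgm_readings(days: int) -> int:
    """Readings produced by a CGM sampling once every five minutes."""
    return days * MINUTES_PER_DAY // CGM_INTERVAL_MINUTES


def finger_prick_readings(days: int, per_day: int) -> int:
    """Readings produced by manual finger-prick tests."""
    return days * per_day


for days in (90, 180):
    print(
        f"{days} days: {cgm_readings(days)} CGM readings vs "
        f"{finger_prick_readings(days, 3)}-{finger_prick_readings(days, 5)} finger pricks"
    )
```

At that sampling rate a CGM yields 288 readings a day, so a single visit can cover tens of thousands of values where finger-pricking would give a few hundred.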
We did a lot of interviews with patients and doctors in the field, and you get situations where the doctor sees that something is wrong—for instance, you had a hypoglycemic event, where your blood glucose went below a certain threshold—and the doctor would ask, “What were you doing at 3 PM on January 17?” I don’t know about you, but I can’t remember what I was doing yesterday at 3 PM. Theoretically, a diabetic patient should keep a specific diary of their life and have a regular routine, but in reality that never happens. If you have type 1 diabetes, you have it from birth for your whole life, so the chances that you’re keeping an up-to-date diary for all that time are extremely low.
But with what we provide, the doctor can see that you had a problem, and we’re able to say, “you were having a cup of coffee at home, sitting on your couch, and that has an impact on your blood glucose.” Then you can start looking at your insulin intake, maybe that’s the problem. Or, to give a simple example, maybe you were running a marathon—then the hypoglycemic event makes a lot of sense, you were burning a lot of energy, and maybe you didn’t eat enough carbohydrates to equilibrate the amount of energy that you were burning.
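The idea of attaching context to a low reading can be sketched in a few lines. This is a hypothetical illustration, not xbird's implementation: the data classes, the function names, the 70 mg/dL cutoff (the interview only says "a certain threshold"), the year, and all sample values are assumptions made up for the example.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional, Tuple

# Assumed cutoff for illustration; the interview does not name a value.
HYPO_THRESHOLD_MG_DL = 70.0


@dataclass
class GlucoseReading:
    time: datetime
    mg_dl: float


@dataclass
class ActivityEvent:
    start: datetime
    end: datetime
    label: str


def activity_at(time: datetime, events: List[ActivityEvent]) -> Optional[str]:
    """Return the label of the event covering `time`, or None if nothing matches."""
    for event in events:
        if event.start <= time < event.end:
            return event.label
    return None


def annotate_lows(
    readings: List[GlucoseReading],
    events: List[ActivityEvent],
    threshold: float = HYPO_THRESHOLD_MG_DL,
) -> List[Tuple[datetime, float, Optional[str]]]:
    """Pair every below-threshold reading with the activity detected at that time."""
    return [
        (r.time, r.mg_dl, activity_at(r.time, events))
        for r in readings
        if r.mg_dl < threshold
    ]


# Made-up sample data (arbitrary year).
events = [
    ActivityEvent(datetime(2018, 1, 17, 13, 0), datetime(2018, 1, 17, 16, 0), "running"),
    ActivityEvent(datetime(2018, 1, 17, 16, 0), datetime(2018, 1, 17, 18, 0), "at home"),
]
readings = [
    GlucoseReading(datetime(2018, 1, 17, 14, 55), 82.0),
    GlucoseReading(datetime(2018, 1, 17, 15, 0), 64.0),
    GlucoseReading(datetime(2018, 1, 17, 16, 30), 110.0),
]

lows = annotate_lows(readings, events)
for time, value, label in lows:
    print(f"{time:%Y-%m-%d %H:%M}  {value:.0f} mg/dL  during: {label or 'unknown'}")
```

With this sample data the only flagged reading is the 3 PM one, and it comes back labeled "running", which is the kind of answer a patient is unlikely to recall months later.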
So just by providing the behavioral and contextual information on top of this simple blood glucose data, you can provide incredible value to doctors and patients. On the doctor’s side, they have a better diagnostic tool, and on the patient’s side, they can start realizing what the activities and events are that impact their health. Because those harmful effects aren’t all that clear to patients, even when the studies are out there, they don’t always make the immediate connection of, “when you do this, this happens, every time.” Even when something bad happens to you, you don’t necessarily connect it to the cause. It’s surprising to see how many people don’t understand the link between sleep and diabetes: how much you slept, how regular your sleep is, when you go to sleep, when you wake up—the impact of all that on blood glucose is something that even type 1 diabetics with 30 or 40 years of experience often don’t know.
Wallace: How do you approach the problem of separating causality from mere correlation? For example, we know the divorce rate in Maine strongly correlates to per-capita consumption of margarine, but it seems unlikely that either one causes the other—so how do you identify links between hypoglycemic attacks and patterns in the data you collect?
Carli: We distinguish it first with medical studies. We always refer every alert, every message, and every suggestion to previous studies, previous clinical trials, and research. That’s one thing.
The second thing is, parallel to that, even though we are not yet providing an additional layer of personalized expertise on top of that research, we are planning to go in that direction slowly with input from experienced doctors. That will provide our training set. We are already collecting information from the doctors to identify what has an impact on each specific case, instead of being like the example you gave with margarine.
Another thing we’re doing, rather than making direct claims of causality, is looking at high-risk locations. For example, we might identify your going into McDonald’s as risky behavior, and we alert you to that, so that you know when you’re entering that area you might have a risk, but it’s one you can judge for yourself. Maybe you’re just passing by, or you stop to buy something that’s not even food, but you might still look like you’re in a high-risk location.
That doesn’t mean that it is not useful information, because you’re not getting an imperative alert saying, “you need to take insulin,” or “you’re having a hypoglycemic attack.” You get an alert that says, “this is something we saw in your pattern, make of it what you will: your blood glucose tends to rise when you go here.” Whether that provides value for the patient depends on the case. It might be irrelevant in some instances, but it could also provide the patient with an insight into their own behavior.
Sometimes we go even more detailed, depending on the patterns that we see. We might say, “every Wednesday, this is a higher risk location for you, your blood glucose rises here.” And maybe then it becomes clear to you, “yes, I go here every Wednesday and have a sandwich, so maybe I should not do that.” It doesn’t show causality between location and effect, but it provides a lot of contextual information for the patient to think about their own behavior and how it influences their health. It’s sort of the first step in the direction of creating a clear pattern of causality, and it already provides value in terms of education.
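A pattern like "every Wednesday, your blood glucose rises here" can be approximated by grouping observed glucose changes by place and weekday. The sketch below is a hypothetical illustration under stated assumptions: the 30 mg/dL alert level, the three-visit minimum, the function name, and all sample visits are invented for the example, and the result describes an association only, not a cause.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean
from typing import Dict, Iterable, Tuple

WEEKDAYS = ["Monday", "Tuesday", "Wednesday", "Thursday", "Friday", "Saturday", "Sunday"]

# Both values are assumptions for illustration, not clinical guidance.
RISE_ALERT_MG_DL = 30.0
MIN_VISITS = 3

Visit = Tuple[datetime, str, float]  # (arrival time, place, glucose change in mg/dL)


def risky_patterns(
    visits: Iterable[Visit],
    rise_alert: float = RISE_ALERT_MG_DL,
    min_visits: int = MIN_VISITS,
) -> Dict[Tuple[str, str], float]:
    """Map (place, weekday) to the mean glucose rise when it recurs and is large."""
    rises = defaultdict(list)
    for arrival, place, rise in visits:
        rises[(place, WEEKDAYS[arrival.weekday()])].append(rise)
    return {
        key: mean(values)
        for key, values in rises.items()
        if len(values) >= min_visits and mean(values) >= rise_alert
    }


# Made-up sample data: three Wednesday lunches and three Friday gym sessions.
visits = [
    (datetime(2018, 1, 3, 12, 30), "sandwich shop", 42.0),
    (datetime(2018, 1, 10, 12, 35), "sandwich shop", 38.0),
    (datetime(2018, 1, 17, 12, 20), "sandwich shop", 46.0),
    (datetime(2018, 1, 5, 18, 0), "gym", -15.0),
    (datetime(2018, 1, 12, 18, 0), "gym", -10.0),
    (datetime(2018, 1, 19, 18, 0), "gym", -20.0),
]

for (place, weekday), avg_rise in risky_patterns(visits).items():
    print(f"{weekday}s at {place}: blood glucose tends to rise by about {avg_rise:.0f} mg/dL")
```

Requiring a minimum number of visits is one simple way to avoid alerting on a single coincidence; the patient still decides what the pattern means.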
Wallace: What can this kind of data analysis add to our understanding of diabetes in general? Have you uncovered anything surprising?
Carli: We already ran our first clinical study with our application on iPhones and Apple Watches, together with CGM devices. I can’t say too much about it because it’s not published yet. But what I can say is that we were able to increase the accuracy of blood glucose predictions—we were able to tell individual patients what their blood glucose would be in the future. Our method is better than just looking at blood glucose levels: adding mobile data increases the accuracy of predictions. That could potentially open up the door to being able to say, “we can tell you with sufficient accuracy what your blood glucose will be in an hour, or two hours.” With that, a lot of hyperglycemic and hypoglycemic events could be eradicated.
Wallace: Besides diabetes, what other conditions could this kind of data be used for in the future? What do you plan to work on next?
Carli: We looked into a lot of different diseases before starting with diabetes. As I mentioned before, we chose diabetes for its data-driven environment. We have a lot of very interesting data points with blood glucose, so it made our job easier. But we also looked into cardiovascular diseases, which are basically the biggest concern in healthcare before diabetes. We looked into Parkinson’s disease, because we could use sensors to identify the intensity of tremors.
We looked into mental health, where behavior, the way you move, the number of locations that you visit, the number of times that you pick up your phone, and the kind of places that you visit, have a strong connection. We ran a test where we were able to identify postnatal depression by only looking at Twitter profiles—we basically reproduced a study somebody else had done, because we were curious and we weren’t convinced it was that easy, but we confirmed their results. Postnatal depression is one of those diseases that is somewhat under the blanket: it seems rare, but it’s not. Something like 25 percent of mothers suffer postnatal depression.
In one respect, everyone knows that behavior and environment have an impact on your health. But it’s only recently that we can objectively measure your behavior or your environment with sensors. Until five years ago, all this stuff was measured via questionnaires, even in the best clinical studies. Quality and duration of your sleep would’ve been a questionnaire that you would fill in from one to five at the end of a three-week study. It’s measured, it’s validated, but it’s subjective—now we have the tools to improve on that, to make it less subjective. That’s the direction we want to go in, and where we see the entire healthcare industry going in the future.