The Center for Data Innovation spoke with Lars Maaløe, co-founder and chief technology officer of Corti, a Danish startup that has produced an AI assistant that analyzes the dialogue of emergency calls to predict whether a patient is suffering a cardiac arrest and guide the operator accordingly. Maaløe discussed the challenges faced by emergency services call handlers, and how an AI can help focus dispatchers’ attention on the clues that matter most.
Nick Wallace: What is Corti, and how does it work?
Lars Maaløe: What Corti does is listen in on a conversation between a dispatcher and a caller. So someone calling 911 or 112 or 000 will have a victim beside him or her, and the dispatcher needs to determine whether they need an ambulance or not. The dissemination of information between a caller and a dispatcher needs to be as clear as possible, and it’s this flow that we tap into.
We listen in to the semantics of the conversation, so that we can pick up cues that correlate with the historical calls our AI has trained on. When there is a clear-cut correlation with a certain bundle of calls that have specific diagnostics, our system can quickly figure that out.
In the call stream between the caller and dispatcher, our system will give feedback to the dispatcher if it sees a strong correlation. It could be that there is a cardiac arrest; it could also be keywords that are significant to the meaning of the conversation. The dispatcher is likely also in a highly stressful environment, and they need to take care of a multitude of situations at the same time: dispatching the ambulance, finding out whether there are practicalities they need to inform the ambulance driver of, whether they have a lot of stairs to climb, and so forth.
Clues in the conversation can quickly get dropped, and having a system that can remind the dispatcher of those clues in a seamless manner is vital. On top of that, having a system that is able to correlate those clues with diagnostics is also vital.
Wallace: Emergency call operators are trained to ask pertinent questions in order to get as much information as they can about the patient, including whether they’re having a cardiac arrest. So what can AI do here that humans cannot? What problem is Corti trying to solve?
Maaløe: There is a problem, since in these conversations humans make mistakes, and of course they do—we all do. These calls are not clear cut. Whether a person is breathing may seem clear cut if you don’t listen to the calls, but when you start listening to the calls, you realize it’s never a yes or no. Nothing is binary in these calls. Interpretations can be misleading, and if there is a misleading clue in the call then the dispatcher gets thrown off and follows the protocol in a different direction from what the call is about.
What Corti can do if this happens is guide the dispatcher back on track, to make sure that if something correlates with historical calls, maybe an answer that’s ambiguous in some manner but still correlates with similar ambiguity in a bundle of other cardiac arrest calls, the system can say “this is probably still a cardiac arrest call.”
So even if the caller says the victim is breathing, that may not be the whole truth. It’s about finding the granularities in the semantics of the language, and that can be difficult for someone in a stressful situation where they need to make a decision fast.
The caller might give a clue at the beginning of the conversation—they might say that the victim has blue lips. Then the conversation starts, the dispatcher starts asking questions about address and so forth, and tries to follow the protocol and asks them, “is the victim breathing?” Then the caller might answer, “I believe so.” Then it’s quite easy to drop the clue that the caller had said the victim’s lips were blue in the beginning. A human might say, “ok, it’s not a cardiac arrest because the victim is still breathing.” But that could easily be a symptom called agonal breathing. It may seem to a non-medical professional—including myself—that the victim is still breathing. But it’s basically cramps resulting from a lack of oxygen.
The system in that case can look through the whole span of the conversation, take the whole thing in, and then produce predictions every second, so the more information it gets throughout the conversation, the smarter it gets. Then it can correlate the fact of the caller saying the lips were blue, or other subtle clues that indicate the victim was suffering from agonal breathing, which may lead the caller to give an ambiguous answer when asked whether the victim is breathing.
That’s where Corti plays a rather large role. Another thing to take into account is that this system never gets fatigued. It keeps on predicting every second in all conversations.
The main point is that Corti doesn’t do anything that a human couldn’t do in a completely clear state of mind in an environment free of mistakes. But it’s possible to miss clues and cues when you’re trying to act fast. A condition like cardiac arrest only accounts for about one percent of calls, so it’s only one percent of cases where you need to decipher and account for all of these clues.
Wallace: One of the difficulties with an emergency call is that you have to rely on the caller to give you the right information, which might be a problem if they’re distraught or confused, especially if they’re the one with the emergency. Obviously, that’s why emergency call operators get special training—but what kind of challenges does that pose for an unconscious AI system?
Maaløe: What we knew from the beginning was that we needed to make this whole machine learning framework data-driven, with data from the real population, without too many engineering steps that would introduce too many assumptions. We can quickly make assumptions on the perfect data and the perfect line of questioning and so forth. But if we expected everything to be perfect, we probably wouldn’t add much to solving the problem.
When people are hard to understand, or ambiguous when they talk, that’s when it gets hard—but that’s also where it gets hard for human beings too. So we’ve trained these models to be very resilient to noisy environments. We’re probably not the best at performing automatic speech recognition for transcribing audiobooks, for instance. But we can be extremely efficient in an environment like this, with conversational data where sentences are almost never finished, and where there’s a lot of background noise. The algorithm is trained to get the right semantics out of the conversation and match it with the diagnostics.
Wallace: Could this technology be used for other conditions besides cardiac arrest? What made you choose that first over other conditions, like a stroke?
Maaløe: Let me take the second half first. Cardiac arrest is one of the most critical conditions that a dispatcher needs to deal with, and there is a stringent set of tasks that they need to follow. Time is of the essence, and accuracy of detection is of the essence. For every minute that the caller doesn’t start resuscitation, the victim’s chance of survival drops by seven to ten percent. On top of that, cardiac arrest, within the field of emergency services, is one of the most researched topics, so there is a multitude of benchmarks throughout medical departments in various countries. For us to do something like this, we need to benchmark quite heavily so that we can see that our systems perform as we believe they should. Therefore, cardiac arrest is a very good starting use case, because it’s a solid benchmark that we can use and compare to results published in scientific journals.
To answer the first part of your question, regarding how easy it is to apply our systems to other conditions: since we have built this whole framework as a data-driven framework, it’s quite easy for us, with the right data and the right focus, to add another pathology to our system. It’s also then more cumbersome to maintain and to make sure new models will be retrained with this new knowledge. That’s one of the reasons why we’ve built a back-end product for quality assurance and quality improvement, where the quality assurance manager can go in and see our predictions and ensure that our predictions were correct and that the dispatchers’ predictions were correct. But the whole framework is set up so that it’s simple to add new detections.
Wallace: Your professional and academic background is in computer science and machine learning. How did you end up working on medical matters, and how did Corti come into being?
Maaløe: Corti came into existence a little more than two years ago. This use case came from the problem that medical personnel don’t get that much help. Throughout the world, there will be more pressure on medical professionals because there will be more old people in the system, and more people overall. These medical professionals could use help from an assistant that can see the patterns that they store in their minds, because there are so many different pathologies. These people are extremely talented, but why shouldn’t they have an assistant with them?
Then comes the next problem of how can an assistant for a medical professional be defined? That’s an extremely difficult use case, and a lot of companies are still trying that out. It might be help with the journals and so on, but we see the way to a good assistant is for it to be part of the conversation that the medical professional has with the patient. If you are to utilize technology to help the doctor make good decisions, then you need to tap into that conversation.
Then we come to the next part, which is the technical feasibility of tapping into communication. We are still at the beginning of being really good at that. You can see some automatic speech recognition benchmarks from the big companies, but you can also still see that the voice assistant on your smartphone is not very efficient in helping you every day. It cannot predict what your next step is.
What we found is a place where we believe we can make a change, and that was medical emergency services, where there is a dispatcher who is in a very stressful situation. They need to make decisions very fast, and they need to be very concise. A lot of these conversations are stored on hardware, so we can train our algorithms on them. The conversations are somewhat structured, in that the dispatcher has a tendency to follow some sort of protocol, with some differentiation between different centers. It helps our models quite a lot that one side at least is quite structured and that the structure leads towards something. That makes it a very good scenario for somewhere we can, from our machine learning and AI field, make a difference in decision-making.
We are three co-founders. Andreas Cleve had a startup before, which got acquired very early on. He was eager to start something new, and he met Michael Reibel Boesen, our other co-founder, who has a background in software and hardware and also did a stint at NASA’s Jet Propulsion Laboratory. Then they reached out to me after getting my name from a machine learning community. My background is in computer science, with a lot of statistics and mathematics as an engineer, a PhD in machine learning, and then a stint at Apple. That’s the founding team of the company.