The Center for Data Innovation spoke to Adrian Alexa, co-founder and chief technology officer at Repositive, a UK-based start-up that provides an online portal for accessing human genome databases. Alexa discussed why it is important for researchers to be able to access data in one place, and the challenges of making more data available for medical research.
Nick Wallace: Repositive lets researchers query many different genome databases at once. Why did you and your co-founder decide to build such a portal? What problem did you want to solve?
Adrian Alexa: We care about making genomic data discoverable and accessible for researchers, both in academia and industry. This started a little over four-and-a-half years ago when my co-founder Fiona Nielsen and I were working at the biotech company Illumina. I moved to the UK about seven years ago, and started working at Illumina in February 2011. Fiona joined six months later. We were part of the computational biology group at Illumina, which was part of the R&D group there. We were trying to infer as much information as we could from sequencing genomic data.
The main drive for what became Repositive came from Fiona. She was part of the translational genomics part of the group, which means she was working with others on the team to help them understand the genomic data and going through the analysis with them. I was more at the back end, developing the methodologies and sterilizing the data. Fiona realized that repurposing the data wasn’t really the problem—Illumina was solving that already—and it wasn’t the analytics either. We had good quality data and good analytical models. The problem was getting more context—such as by combining data on different diseases in order to infer more information.
Repositive emerged out of our frustration: we knew that more and more genomic data was being produced, and that it could help us increase the accuracy of our findings, yet it was impossible to get access to this data. Sometimes it was stuck in silos, sometimes nobody knew where it was, or if it was in a biopharma lab they were keeping it locked up because they didn’t want to give any IP away. So Fiona quit her job at Illumina and started working on an initiative to promote the efficient and ethical sharing of human and genomic data, and incorporated a non-profit called DNA Digest. Her ideas started to grow on me, so about a year later I quit my job at Illumina and joined Fiona, and we spun out Repositive from DNA Digest and started developing the idea as a company.
Wallace: What does better access to genomic data mean for medicine and health services?
Alexa: It’s still a very new emerging field. The driver here is that until the beginning of this century (the first human genome was sequenced in 2003, and it cost about $3 billion) it was very hard to understand genetic diseases. Genomic data today helps us to get a snapshot of the status of a piece of DNA at a particular point in the life of an individual. This allows researchers to start understanding what drives a particular disease or condition. If you sequence a cancerous tumor, you can compare that with normal tissue, and start seeing the difference in the genome, because the genome should be similar across different cells in your body—but cancer cells are genetically mutated. That helps you isolate the genes that are causing the cancer.
Genomic data makes you more precise in how you do diagnostics, which is why many people today talk about personalized medicine and precision medicine. The idea is that instead of using a generic drug designed to treat a particular type of condition, a clinician will be able to build therapy around the individual.
Improving access to genomic data is about building a knowledge base. As we move forward, if somebody gets their genome sequenced, they can compare themselves to other people with similar characteristics—say a European female in her mid-40s—and if the knowledge base is rich enough, she will be able to compare herself to others like her with, say, a particular eye disease. We want to open up data silos so that the information will drive these kinds of advances.
Wallace: Repositive describes itself as the Airbnb for genomics. What makes you Airbnb, rather than Google search?
Alexa: As a young startup, when you try to define a new product or a new marketplace, it’s often difficult to describe what you do and how you do it, so you start looking for comparisons. There are parts of our product that could be described as the Google for genomics. There are parts better described as Airbnb. Sometimes we tell people we’re building the TripAdvisor for genomic data.
We like the comparison with the hospitality industry because of the difference between how people travel today compared to 20 or 30 years ago. If you go to Berlin for the first time in your life, 30 years ago you would have to look in a book and go to a travel agent, because otherwise you wouldn’t know which hotel to go to. But today you can go online and get community feedback about the hotels. You can make your decision in five minutes, or spend two or three days researching to try and optimize the hell out of it if you want. Either way, it’s easier than it was 30 years ago, because you have an online community that can give you the information you need. It’s transformed the way people travel.
The comparison with Airbnb is about the community. If you offer your flat to strangers, you do not know exactly who is going to turn up at your door. You do not even know for sure that the person who books it really exists. Airbnb tells you James is going to visit you, but until James is at your door and you see that person, you don’t know—you just have to trust the system. Somehow, Airbnb managed to build this trust that allows people to move around and stay at other people’s places. We take this for granted today, but if somebody had said to you ten years ago that you can have a trusted system where people will be able to do this, you would have been skeptical. You might have asked, “what if they trick you?” and “how can you police such a system?”—but Airbnb managed to do it.
So how does this apply to us? Our system is about getting access to restricted and private repositories. There are a lot of privacy issues around genomic data, and you have a lot of repositories that are responsible for the governance of the data. If you want to access data from those repositories—which are often government-funded—it can take two to six months. You need to show them that you’re a qualified researcher, and you’re not going to misuse the data, and they have an access committee that will look at your application. And the annoying thing is, you don’t have to do this just once in your lifetime. You do this every time you need to access data from a repository—even if you’ve been to that repository before. Getting access to the data that’s useful to you takes time, and a waiting period of two to six months can have a serious impact on your research.
So our comparison with Airbnb is about being able to build that trust mechanism, where we can vet the community we are building with Repositive and show to the repositories that the researchers are trustworthy in order to shorten the time it takes to access data. For example, that means making a researcher’s application to access data transferable across multiple repositories.
Wallace: Consumer genomics, where people have their own genomes sequenced to access precision medicine, is a growing field. But what about the social value of genome sequencing? If more people shared their genomic data anonymously, what impact would it have on medical science?
Alexa: This is a tricky question, because one can approach it from multiple angles. The tricky bit is when someone asks you, “Why should I share my data? It’s mine, it’s private, somebody might use it to take advantage of me, like an insurer.” But if a few years later that same person develops a genetic disease, they’ll probably say, “do everything you need to do to treat me, because I don’t want to die.” At that point, their perception changes. When you have a health problem, you want everything to be available—for you. So you get into this very strange problem: we don’t want to contribute, but we want to benefit from what others contribute.
What we believe at Repositive is that while we should not be careless—we need to be careful about how genomic data is used, and how to protect people’s privacy—we also need to show what the technology can do, and we need to push for this data to be accessible. We believe that being more open with data from the beginning will allow us to have a bigger impact. More data will be available, enabling more informed decisions and allowing us to act on those decisions faster than if we try to design a perfect solution from the beginning that accounts for all the negatives we can think of. I’m not saying we shouldn’t think about the negatives, I’m just saying that we need to be proactive.
That’s where the social value will come from. We need to create a knowledge base that will allow us to take informed decisions about how we interpret a particular state in our bodies.
Wallace: Despite all the progress science has made for us, I am confident that there are far more people who read horoscopes than know anything about their genes. Do you think advances in genomics might change this in the near future? Will ordinary people learn more about their DNA than their star signs?
Alexa: Hopefully. But I think as humans we still struggle to understand biology in general. I’m not a biologist, I’m a computer scientist, and I see biology as this super-complex system with billions or trillions of sub-systems and interactions. What we are doing right now is taking a very small peek at one millionth or billionth of this complex system, and we try to formalize that and draw a conclusion. Those conclusions are something like, “okay, I know there’s a whole big system out there, but if we act on this small bit, then hopefully something good will happen.” We’re still very, very early on in understanding how we function as an organism. We’re basically trying to formalize nature, but we are not at a point where we are able to do that.
These are very, very complex systems, so I don’t think it’s about understanding them—I don’t think an individual mind would be able to comprehend all that on its own. Instead, I envisage a very complex ecosystem of applications, tools, and platforms that will allow us to act at different levels.
To put it another way, these days people are more aware of their health—we’re using wearables, and things that help us stay fit. I think genomics will, in a similar way, guide our behavior based on whatever findings come from sequencing our genomes. But I would not call that understanding what’s actually happening, or getting to the point where there is sufficient curiosity to find out. We’ll just act on what we’re being told by the experts.
In my opinion, the system of biology and genomics is so complex that, rather than fully understanding it, we might come to see that we need to become more aware of it instead of looking at a horoscope. But there will always be that balance between the rational and the irrational.