The Center for Data Innovation spoke with Franck Carassus, co-founder of OpenDataSoft, an open data publishing firm headquartered in Paris. Carassus discussed the role of open data in the public and private sectors and what he sees as the “second wave” of open data.
This interview has been edited for clarity.
Nick Wallace: You use the OpenDataSoft platform both for open data and for internal data sharing. How can these two functions complement each other?
Franck Carassus: From the start we were absolutely convinced that open data was a big subject, but not only for government. The idea was also to be able to answer the data sharing and data publishing question for non-government customers. Our first customers were the City of Paris, Veolia (which is a large utility company in Europe), and a worldwide company for waste and water management.
We use open data as a way to illustrate the importance of cloud services. We founded the company five years ago, when cloud services were not used as much as they are today. Even today, in some European companies, or for specific use cases, cloud still isn’t seen as the obvious choice, because there is a huge culture of on-premises systems, although that’s changing. For us, open data is a kind of door-opener to show governments that using this cloud approach to break internal silos and “storify” data (that is, derive intelligible conclusions from it), and give citizens access through visualizations and APIs, is not a big deal; it’s easy to do. Once governments realized that in a few weeks they could use our technology to publish their data, with only a fairly small change management process, it was a good time for us to say “you were able to satisfy your citizens’ needs, so why not do the same for internal use, because you are also starving from a lack of internal data sharing.”
For us, open data is the first application where we can demonstrate easily and with zero risk what our platform is able to do. Many of our customers are using the platform for internal and external data sharing. In fact, it’s very complementary, because when you have a lot of datasets that you don’t want to publish for just anyone, you need to apply some security rules to stop that from happening. Using a single platform to control everything that you share and secure externally and internally makes this process a lot easier.
Another big topic in Europe is that some of our European customers don’t want to rely on a U.S. cloud provider for sensitive data. Our public cloud offerings use Amazon Web Services (AWS) and Microsoft Azure, but when European customers say “I want to use a cloud-based platform to share data internally, or with my partners and customers, but since the data is sensitive I don’t want to rely on a U.S. cloud,” we have an offer where we guarantee their data will be stored on what we call a “national cloud” (a cloud server in the client’s country). We have this multi-cloud approach because we come from Europe; it makes no sense to do that in the United States. If you’re a U.S. company from the start, you might not understand that some data cannot be stored on an American cloud server; but coming from Europe, we know that some customers want to be sure that their data is not stored outside of Europe, or even their own country.
Keeping the data within the EU is relatively easy, and you can also do that with U.S. cloud providers. We’re also seeing providers like AWS and Microsoft developing specific national cloud offerings based on their technology. But in France, we work with ten or twelve utility companies that need to share their data and are willing to pay a premium to store their data in France. I don’t think it’s really a technical issue; it’s more a communications issue. It isn’t a security concern—it just lets them say that they’ve done it.
That isn’t an issue for the United States, but for Canada it could be—because Canadian customers are also interested in relying on non-U.S. cloud providers. Most of the time, these concerns are associated with data that isn’t for publishing—we have customers who start with open data and don’t care where the data is stored, because by definition open data is something you want others to have. But when it comes to data sharing, which can be more sensitive as it involves dealing with the customer’s partners or a specific internal use case, customers say, “we want the data to be stored nationally.” The more sensitive the data or the use case, the more we are asked to store data in the customer’s country.
Wallace: There’s some anxiety in the European Union about Europe’s ability to compete in the data economy. As the founder of a French open data firm that has been able to expand into the United States, what advice do you have for policymakers who want to support Europe’s data economy?
Carassus: I don’t know about all the other European countries, but at least in France, we have very good tax incentives for keeping research and development (R&D) in France. All the startups I know keep their R&D in France, for a few reasons. We have very good engineers, and they are less expensive than in the United States. Also, if you keep your R&D in France, you get crédit d’impôt recherche (CIR, tax credits for research), and every French startup relies on that. Going to the United States is more about sales and marketing, because the United States market is more of a single market. We have customers in about thirteen countries, but each time we open up in a new European country we need to work on the language. For Germany we need a user interface in German, and the same goes for Spanish. Even if we pretend the EU is one single market, when we’re selling to governments, they want their data portal to be accessible in their own languages. That’s a huge amount of work.
If you want to be in the worldwide competition on software, you need to be in the United States. And if you do well in the United States, you’re more credible in Europe, even if you’re a European company. The U.S. market is much more educated about data because open data was a major focus of the Obama administration. So they got a bit of a head start on building the ecosystem before Europe. Big companies in our space, like Socrata and other vendors, have evangelized the market. We don’t need to do that in the United States because everyone at the state, local, and federal levels is already aware of the benefits of open data.
It’s more of a tools question: we need to sell our platform instead of Socrata’s. But in some countries in Europe, we need to explain that open data is a good thing and that we have a good platform for governments to leverage local partners and civic developers. It isn’t the same pitch in the United States, as U.S. companies know open data is something very valuable and that they can use data sharing tools to break internal silos and improve communication. So we want to be the best tool for them to do that. The software-as-a-service (SaaS) business model makes it easier for them to change tools every year, which means as a company you have the opportunity to challenge the status quo every year. That’s a very different market and that’s why the United States is very interesting to us.
Something else about expanding into the United States is the need to be recognized by U.S. analysts, like Gartner, Forrester, and IDC. Even if your biggest customer is a huge city like Paris, they don’t care. They prefer a small city in the United States. And lastly, having U.S. investors like Salesforce gives us credibility in the market with customers, and with future investors and VCs. And if we compare OpenDataSoft to our U.S. competition (Socrata and OpenGov), when we raised $5 million, those guys raised $55 million. With a stronger presence in the United States, we will be seen as a U.S. company, which is important to attract investors.
I think it’s very complex for Europe to have a single focal point for innovation around data and software, like the United States has with Silicon Valley. There are many European cities that want to take the lead in Europe, including Paris, London, and Berlin, and they’d like to be a kind of Silicon Valley for Europe. But every city is competing for it, and every city belongs to a country. Even if Europe is supposed to be speaking with one voice, the reality is that every country has its own agenda—you’ve just seen that with the UK and Brexit. I think it’s difficult for Europe to have Silicon Valley-like areas for data innovation that could be more efficient than the United States or any other big country. The market isn’t integrated enough—although the euro helps.
Wallace: Only a fairly small number of countries perform well on open data benchmarks, such as the Open Data Barometer. The Center noticed large disparities when it indexed the G8 countries in 2015. What do you think holds governments back from doing more to support open data?
Carassus: I don’t really know if it’s a lack of time or a lack of willingness. I think open data starts with transparency, and that depends on the agenda of a country. In May this year we will have a presidential election in France. Some of the candidates are pushing open data because they want more transparency. They use this digital tool to say “we want every citizen to have the possibility to challenge the ministries and the state with data.” But I think open data is still a way to say “we are transparent.” The policymakers may not yet be aware of the value of providing open data for reuse by private companies.
I think that’s a big difference between Europe and the United States. When you look at U.S. companies involved in real estate or the service industry, they are heavily reusing open data. When you use companies like Trulia or Zillow to buy or rent a house in the United States, besides the price, they give you all the open data information related to an address, such as crime reports and the quality of the schools.
Thanks to open data, companies that did not exist six or seven years ago are heavily reusing public data to create value and jobs, and they’re paying taxes. I think politicians in Europe don’t really see that. Or if they do, they think the first to take advantage of the data will be U.S. startups rather than European companies. For example, if you publish restaurant data, they may be afraid that a startup like Deliveroo or Grubhub will be the first to use the data to strengthen their worldwide presence. But more than that, they don’t understand the power of public data for creating value for startups.
Wallace: It’s easier than it used to be to make the case for why governments should open their data, but the value proposition isn’t always as obvious in the private sector. How do you convince a private company to open their data?
Carassus: The way we convince them is by showing it is a way to be more transparent with their customers. We have customers in the oil and gas industry, for example, and for these customers, we started with just internal projects, but now they’re interested in publishing data to be more transparent with customers, and thus more competitive than other companies.
I hope that in the coming months some new industries will have turned to an open data approach, such as pharmaceutical companies for clinical trials. I think that in the coming years, every major private company will have an open data approach for some of their data, to demonstrate they have nothing to hide, on some topics at least. Of course, they’ll select the topics that are most interesting to them, or that demonstrate the things they want their customers, partners, or competitors to see. Open data is a way for marketing departments to use data in their campaigns.
But we also see that some companies already understand the value of open data. Many of our private sector customers are heavily reusing open data. They take Organisation for Economic Cooperation and Development (OECD) or geographic information system (GIS) information, for example, and mix that with their own data to be able to better understand an issue or build new products. That’s why we think private companies are the future of open data, because they are the ones who reuse it, and are able to build offers on it.
I don’t think the open data movement in government is going to stop—we’ll see more portals open, but I don’t think we’ll see any close. The smartest companies reusing that data will be very well positioned in the market. And some companies are interested in exchanging data with cities because companies know they have information that will be useful to cities, but they also want real-time information that affects them. I think these use cases are coming.
Wallace: What changes do you expect to see in the way we handle open data over the next five years, and how do you expect that to affect the work you do?
Carassus: We’ll have more real-time, live streaming data from open data portals, particularly in transportation and utilities, thanks to smart meters. I think this could lead to very interesting reuses and the development of new applications. Right now, the value of open data isn’t always obvious. As more data comes from new dynamic sources, I think new use cases will appear. We see that as a kind of second wave of open data portals. It’s just begun, but I think it’s going to be very interesting in the coming years.
I see streaming data as the new data source for open data platforms. Big data is not a big deal for technology anymore. If you need to handle billions of pieces of information per hour, you can do that now with the cloud, and it’s cheap. Then there are the sensors: a few years ago, sensors were expensive, and there were no dedicated networks. Now you have the Internet-of-Things-focused network players like SigFox and the LoRa Alliance, and devices that are very cheap.
I think the other big topic will be monetizing data. At the moment, the return on investment for open data platforms is unknown. Before open data in France, we had public agencies that made a business out of selling data. When the French government pushed open data, they said “we want all of the data to be free to anyone, and if there are ways to monetize it, it will be for the market to do.”
But now in some countries, we have governments reselling data services, rather than data. For example, the French government used open data to build the company register, but uses a “freemium” model for access. The data and APIs are free, as is the right to reuse the data, but if you’re an insurance company, or a bank, and you’re heavily reusing the data, or you want some additional features like real-time data, you have to pay. We see more and more organizations thinking that way: data should be free for some use cases, but require additional permissions for some commercial uses.
That’s exactly the case with the transport authority in Paris: they’re using our platform to retrieve data in real time, but the usage is limited to about 20,000 API calls per day, per user. That’s fine if you’re a citizen or a small startup, but if you’re a Citymapper or a Google, and you want to use that for your services, you need to subscribe and talk to the transport authority. Maybe you’ll pay or maybe not, but at least you need to be identified as a heavy user. Open data is not only a subject for the public sector. Every commercial company will have some kind of data publishing strategy. Even if they don’t call it open data, it’ll be something like that.