Published on October 3rd, 2016 | by Alexander Kostura
5 Q’s for Nemo Semret, Chief Technology Officer at Gro Intelligence
The Center for Data Innovation spoke with Nemo Semret, chief technology officer at Gro Intelligence, an agriculture data firm based in New York City. Semret discussed the importance of precise data in the agriculture sector as well as how machine learning helps Gro Intelligence process large amounts of diverse data.
Alex Kostura: Why are data analytics particularly important in the agricultural investment space?
Nemo Semret: There’s so much to consider in agriculture. Even if you take a narrow slice of the food industry, say of corn production, there are hundreds of thousands of businesses involved in growing, processing, and transporting the crop. When people think of “agriculture,” they think mostly of farmers. But three out of four people in agriculture are not directly involved in farming.
Agricultural data is still mostly poorly covered. Sara Menker, the chief executive officer of Gro, used to trade commodities. She likes to say that she could have accounted for the flow of nearly every molecule of natural gas. But very little precise data is available for much of agriculture.
Right now, consequential decisions are being made without good enough intelligence. It’s not just in the space of investment; it’s also in producing, processing, and trading crops. Small farmers don’t know if they’re getting a good price for their harvest; big buyers aren’t sure how a local drought has affected production; and traders aren’t sure what the price of a commodity will be a year from now. The interesting thing is that a lot of things are connected: If you’re a European buyer of soybeans from Argentina, you should care about production in China. If you export palm oil from Indonesia, you should care about the weather in Nigeria. Right now a lot of people aren’t operating with good information.
Agriculture is still at the early stage of modern “big data.” We’re only just starting to take advantage of large-scale, low-cost parallel processing, distributed storage, and so on. Other sectors, like energy, banking, and transportation, are a bit ahead on that curve.
Kostura: Gro Intelligence created Clews, a web-based platform that pools and analyzes trillions of agricultural data points from a variety of sources such as government reports, satellite imagery, and weather forecasts. How does Clews make use of so much data from so many different sources?
Semret: We bring all these data together into a single platform and give them universal meaning—semantics—across different data sources to give users insight through search and discovery. In other words, it’s a classification system so that agricultural data can be structured and normalized. When Clews users want to understand the production of apples, they can focus on things that affect apples, and not things that are actually affecting oranges.
We also use machine learning to run computations so that we can predict the future based on the past. So we combine ecologically-driven processes with machine learning at large scale. We’ve applied that very successfully, for example, to U.S. corn yield forecasts. The key thing here is we’ve productized yield predictions. Compared to the U.S. Department of Agriculture’s traditional forecasts, for example, ours come earlier, are more frequently updated, and are thus more accurate at the time when you need them. There’s a lot of cool work to be done with machine learning in agriculture.
Kostura: Clews also includes some interactive visualization tools for users to explore the data. How important are clear visuals for clients who may or may not be familiar with data analytics?
Semret: It’s very important! Ultimately, Clews is about giving intelligence to users. So the products have to be compatible with the human mind, which is good at some things, and less so at others. For example, you can give a person a precise description of a known human face by giving exact measurements of bone structure, hair color, facial contours, and so on. You can specify this into a series of 100,000 numbers, but still nobody might recognize the person you’re describing. But if you transform that into a 100,000 pixel photograph of a face, even a baby would be able to recognize the person.
Good visualization is not just a matter of saving time or making data look pretty. It can be the difference between being able to understand something or not at all. A wheat flour miller who’s a Clews user should be able to recognize a change in wheat supply as quickly as a baby would recognize a photograph of its mother’s face. So we want to show our users the right data on the right chart or the right map.
Kostura: What do you think Gro Intelligence can accomplish at scale with better agricultural sector data for investors? What’s the long-term vision for the software you develop?
Semret: In agriculture, the interesting questions can be deceptively simple. For example: Who imports corn? How much wheat did Africa grow last year? Which countries have increased production of soybeans?
Even when there’s a simple answer, such as “here are the biggest exporters or importers,” the answer has many moving parts. Production can decrease because there was less rain, or a railway was under repair, or tariffs increased. The prices will change, which makes the quantity demanded change, but only to the extent that people are, say, unwilling to substitute corn for wheat, and only if other suppliers can’t step up production… and that’s just the start of analysis.
So Clews isn’t just a question-answering service. It’s not just a data aggregator. It’s more about providing knowledge and actual intelligence to those who need it. Clews is like a research assistant that’s specialized in agronomy, environmental science, remote sensing, finance, infrastructure, climate, and more. It’s not just letting you do things cheaper and more quickly. There’s a qualitative difference between being able to answer questions in five minutes and in five weeks, just as there’s a qualitative difference between sending a letter and being able to talk on the phone.
Kostura: Before joining Gro Intelligence, you were a software engineer at Google for over eight years. How do data science skills useful for a search engine apply to an agriculture-focused analytics company? What’s the connection?
Semret: Well, Google today is such a vast company that you can make connections to anything. But there are some apt analogies to Gro. Gro is helping organize the world’s agricultural information to be made accessible and useful to users. Google is more general, but Gro is a special case of structured data.
Google enabled millions of businesses to do things that only the biggest companies could do before—efficiently target advertising. Today, a small business has the same basic tools as a multinational company, and both are more efficient at reaching their potential customers than they would have been 15 years ago. With Clews, we’re giving hundreds of thousands of companies of all sizes a level of intelligence that previously only dedicated research teams could produce, and doing it continuously as a product rather than an occasional research paper or report.
Techniques and algorithms for doing large-scale computations are extremely relevant. Ranking, prediction, numerical optimization, and so on are just a few examples of computational problems that are similar between web search and ads on one hand, and agricultural data on the other.
Every startup that deals with large amounts of data has been influenced by Google’s way of doing things. As an ex-Googler it’s probably even more in my “engineering DNA.” We’re all thinking differently about infrastructure design, parallel processing, distributed storage, and commodity parts. What should be real-time, and what should be batch processed, just to take one example, are things where Google experience has a huge influence on decisions I would make at Gro.
There are also a lot of engineering processes and management practices: programming techniques; issuing reliable releases; focusing on the right things at a time; planning objectives and measuring results; taking good risks and not taking bad risks, meaning recognizing whether a technical bet is smart or foolish; and much more. One of the main advantages of Google is that you get to work with lots of people who are both very smart and have experienced success and failure. You don’t get just technical knowledge, but a deeper wisdom.