The Center for Data Innovation spoke with Marc Zionts, chief executive officer of Automated Insights, a natural language generation company based in North Carolina. Zionts discussed natural language generation, a technology that generates written narratives from data, and explained how it helps businesses save time and make better data-driven decisions.
This interview has been edited.
Eleni Manis: Automated Insights has developed Wordsmith, software that produces text narratives about data. How does Wordsmith work?
Marc Zionts: Wordsmith relies on a technology known as natural language generation, or NLG. In the simplest of terms, NLG is the process of turning structured data into a narrative. Wordsmith processes data and creates human-sounding narratives, from thousands of business analytics reports to millions of fantasy football match recaps. Think of the technology as a translation tool, unlocking stories hidden within numerical datasets. An end-to-end NLG solution like Wordsmith is made up of a few pieces: the data behind the narrative, the conditional logic, the software that makes sense of that data, and the resulting content that is generated. With Wordsmith, our customers can use their unique set of data, create rules for how to talk about that data, and then produce those narratives for publication.
Manis: Wordsmith claims to be able to improve data literacy. What is data literacy, and why is it important?
Zionts: Data literacy is the ability to understand and derive meaningful information from data. Organizations are collecting mountains of data, both about internal processes (sales pipelines, key performance indicators, operational metrics) and about their customers. A lot of the time, data experts at these companies are tasked not only with analyzing all of this information, but also with distilling it for others, whether in manually written reports, in meeting presentations, or by walking decision-makers through the analytics on a phone call. It’s not an efficient process because decision-makers are knowledgeable about their own area of the business, but are not always the most data literate. The key to getting ahead of the competition is extending the analysts’ data literacy across the entire organization, including to these decision-makers. The question is how to do that, and the answer is often through innovative solutions, like automated reporting with NLG. We’re seeing Wordsmith help analysts at major companies do higher-value work than constant manual report writing because they’re able to codify their expertise to automate this component. At the same time, it’s helping executives make informed, accurate decisions because they understand their data better in a written format that’s paired with charts and graphs, and they can act upon this information more efficiently.
Manis: NLG seems most useful when a user is generating many reports on the basis of one template: for example, the Associated Press uses Wordsmith to generate earnings reports when companies report earnings. Does NLG have benefits for a user who doesn’t need many structurally similar reports?
Zionts: To back up for a moment, a template is an entity that consists of data points, synonyms, and branches. Branches are Wordsmith’s proprietary way of determining the narrative output. Branches can be incredibly simple, e.g., “if ‘Sales Quota’ is 100% or greater, then the sales rep ‘hit’ quota,” or they can be incredibly complex formulaic expressions that are nested into each other. Synonyms are fairly self-explanatory, but in Wordsmith’s case, a synonym can mean a word or even the entirety of a template. The main function of synonyms is to add variability and keep content fresh. For example, Wordsmith can produce over 100,000 variations of a three-sentence template. The data aspect is exactly what it sounds like: this is where you plug in those valuable data points that are the driving force behind the entirety of the template.
To your question, NLG can absolutely be useful in situations where more than one template is needed for a single dataset as well. One such use case revolves around highly targeted communications, both internal and external. Externally, you can leverage user data to provide personalized landing pages, email content, and even web content. By doing so, you deliver a superior user experience and highlight the information a user needs to know quickly. We’ve had great success with clients that leverage the technology, and the adaptability of having multiple templates, for this purpose, as the personalized content leads to higher engagement and retention rates.
Internally, a similarly strategic tactic can revolve around business intelligence—think of graphs and dashboards, even uniquely written reports. A user might be using the same dataset about regional sales, but need vastly different reports generated for the “Northeast Director of Sales” and an individual “Branch Sales Manager,” based on what’s more important to highlight for each of those specific roles. This is a solution unique to Wordsmith, as it is the only platform capable of delivering these role-based written analytics. While NLG is best used for a high volume of output, it can still be beneficial for reporting tasks that may be less frequent but that also consume a lot of manual time to write.
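The template pieces Zionts describes (data points, branches, and synonyms) and the role-based reports built from a single dataset can be sketched in a few lines of Python. This is a minimal illustration only, not Wordsmith's implementation or API: every function name, synonym list, and sales row below is hypothetical.

```python
import random

# Hypothetical synonym slots: interchangeable phrasings that add variability.
SYNONYMS = {
    "hit": ["hit", "met", "reached"],
    "miss": ["missed", "fell short of"],
    "period": ["last week", "over the past week"],
}

def quota_branch(quota_pct):
    """Branch from the interview: 100% or greater means the rep hit quota."""
    return "hit" if quota_pct >= 100 else "miss"

def rep_sentence(row, rng):
    """Branch-manager view: one sentence per sales rep."""
    verb = rng.choice(SYNONYMS[quota_branch(row["quota_pct"])])
    period = rng.choice(SYNONYMS["period"])
    return f"{row['rep']} {verb} quota {period} at {row['quota_pct']}% of target."

def region_sentence(rows, region):
    """Regional-director view: the same rows, summarized instead of listed."""
    in_region = [r for r in rows if r["region"] == region]
    hits = sum(1 for r in in_region if quota_branch(r["quota_pct"]) == "hit")
    return f"{hits} of {len(in_region)} reps in the {region} hit quota."

def count_variations(slots):
    """Number of distinct wordings when each slot varies independently."""
    total = 1
    for slot in slots:
        total *= len(SYNONYMS[slot])
    return total

# Made-up data points that drive the narrative.
rows = [
    {"rep": "Ana", "region": "Northeast", "quota_pct": 112},
    {"rep": "Ben", "region": "Northeast", "quota_pct": 87},
]

rng = random.Random(0)  # seeded so repeated runs choose the same wording
for row in rows:
    print(rep_sentence(row, rng))
print(region_sentence(rows, "Northeast"))
print(count_variations(["hit", "period"]))  # 3 verbs * 2 periods = 6
```

Because the synonym slots in this sketch vary independently, their counts multiply, which suggests how even a short template can yield a very large number of distinct wordings.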
Manis: Computer scientists are fond of saying “garbage in, garbage out.” Can NLG diagnose problems with users’ datasets?
Zionts: If the data going into any NLG platform is “garbage” in the sense that it is inaccurate, then the conditional logic in the platform that’s driving the narrative will reflect that. A way that NLG could “diagnose” a problem with the user’s dataset ultimately comes down to being able to more easily spot a data issue due to the clear, written format of output. For example, it may be easier to notice something is awry in a dataset if the output states “your top performing rep was John Smith, with $900.2 million in sales last week,” when the narrative consumer knows that’s not possible.
Manis: Smart home company digitalSTROM uses Wordsmith to deliver its customers reports on their utility usage. What benefit does this offer customers beyond that provided by a typical utility bill with a chart showing electricity or water usage over time?
Zionts: A typical utility bill with a chart showing usage over time tells you what is happening, but it doesn’t really provide the “why” that a narrative can for connected homes like those of digitalSTROM customers. In this specific use case, the narrative output doesn’t simply tell digitalSTROM’s customers “your electricity usage last month was higher than it was for the same time period last year.” It can also include suggestions on how to fix that, such as “replacing the bulbs in your living room light fixtures with a more energy-efficient model could help reduce costs next month.”