The Center for Data Innovation spoke with JR Helmig, chief analytics officer for federal programs at SAS, an analytics software firm headquartered in Cary, North Carolina. Helmig discussed the changing trends in how the federal government approaches analytics, as well as the biggest obstacles to the government using data effectively.
Joshua New: Your job involves identifying trends in federal analytics. What trends do you see in how the federal government uses analytics?
JR Helmig: We’ve all been following big data trends over the past several years: cloud, Hadoop, in-memory processing, pattern recognition, and so on. Right now, cognitive computing, the Internet of Things, and machine learning are the buzz. A common trend is that the government still struggles to understand what analytics truly is, how powerful it can be, and how to measure its effectiveness. Many times, “analytics” is mistakenly used to describe query or search capabilities, investigations, case management, or visualization. These capabilities are critical to most organizations’ mission success and should not be undervalued, but an executive reporting dashboard is limited to just that: reporting. Analytics is so much more than reporting.
We’re seeing analytics units being challenged to support spending requests, especially during these past few years of fiscal belt-tightening, and analytics leaders requesting a budget from their chief financial officers (CFOs). Now that systems are deployed, the CFOs are asking to see the savings. So we have this ongoing challenge of quantifying the return on investment (ROI) for analytics, but doing so hinges on how you define “return.”
Consider an anti-fraud operation that prevented an increase in fraud and kept its occurrence at a steady level. This could be considered a “win.” In a government environment, however, prevention alone makes it tough to justify increased spending requests for analytics-driven efforts.
Part of understanding analytics is becoming familiar with how its ROI can and cannot be demonstrated. Most people, regardless of their field or level, grasp the value of tangible cost-savings. It’s assigning value to intangible results that’s tricky. What is the value of successful compliance with federal reporting guidelines? Of protecting an agency’s reputation? Of avoiding a bad decision? These questions aren’t easy to answer, but they do exemplify beneficial results that leaders, of course, wish to capture and report.
This issue isn’t insurmountable; rather, it is symptomatic of the growing pains that coincide with any rapid change. As the analytics market has become flooded with large and niche players alike, innovation has often outpaced the government’s risk-averse approach to procurement. I think we’ll soon see more budgetary doors open, as the government trends toward a better understanding of analytics and its value.
New: What are some of the government’s biggest obstacles to using data and analytics effectively?
Helmig: Some of the more frequently mentioned obstacles include data silos that prevent successful data integration and management, which are critical to performing analytics; an analytics skills gap, which threatens the government’s ability to maximize its technology investment; and risk-averse cultures, which slow the adoption of technologies that could help with innumerable problems.
These are all real. However, I’d also like to bring up an equally important barrier that doesn’t get mentioned nearly as much: specialization. Let’s look at the evolution of national security or analytical homeland defense communities. Early practitioners, such as field agents, often had investigative mindsets. As data exploration, analytics, and visualization tools became more common, suddenly these investigators were expected to perform the job of data analysts. While many were very knowledgeable about a specific adversarial group, threat, or region, they didn’t necessarily have foundational training in the methodologies. These methodologies, such as social network analysis, advanced modeling, entity resolution, anomaly detection, and risk-based scoring, are often incorporated into software tools and programs to uncover organized criminals, terrorists, fraudsters, and cyber threats. But without training in the underlying statistical reasoning or the analytical methodologies themselves, experienced investigators have struggled to capitalize on new methods accurately and fully.
The government remains very specialized, which can work against it. For example, while I have represented the data, analytics, and technology communities in front of those writing regulations, that doesn’t magically give me the legal and legislative expertise to actually write a regulation and push it through Congress.
This issue with specialization is often illustrated when an agency attempts to modernize its business processes or technology systems. It’s a tall order to ask an investigator-turned-analyst to now define, at a granular, engineering level, business and technical requirements that an IT shop can deliver. I’ve seen this challenge play out many times since 9/11, as we have increased our analytical posture. Kudos to those who roll up their sleeves and make the best of it. But it is crucial that we educate and create awareness of the requisite methodologies and best practices among the analytics communities, as well as their supporting elements, such as human resources or procurement, so that more people understand and can seamlessly contribute to the procurement and requirements process for analytics projects.
New: What would be the best way to overcome these obstacles? Can better private sector partnerships alone solve these problems, or is there a need for a more systemic fix?
Helmig: We must take the time to standardize some basic terms, training, processes, and methods. Many times, executives will support or ask for the funding of a tool or methodology that, regardless of how well it is implemented, simply will never provide the speed, scalability, or accuracy necessary to achieve their mission or business goal. We must get better at articulating what is being asked for and why, including the desired outcomes and constraints, such as processing power or speed.
We also must closely examine the data types and sources, methods, and skills required to solve a given problem. This includes stopping efforts midstream if participants realize that the project, even if successfully deployed, is unlikely to make a measurable and positive impact. Resources, including money and talent, simply are too few to waste. It is far better to have exit ramps early and often, and we should support a federal leader who reacts midstream to changes in expectations or performance and reallocates resources accordingly.
Further, we must be more forward thinking when modernizing existing systems. Too often we focus on changing processes without really understanding how the mission or operational landscape might evolve. For instance, let’s say that you’re to begin a three-year enterprise-level modernization effort in January 2017 with a go-live date of January 2020. Let’s assume that this system will need to last from 2020 to 2030. Instead of just modernizing existing business processes, we should anticipate the operational demands to come during that decade. Current disruptive changes are driven by the Internet of Things, mobile devices, distributed processing, the challenges with governing open source code, and consumer demands for a better user experience, to name a few.
Public-private partnerships are critical to these efforts. Analytics industry partners are already adapting to these changes, scaling up to handle the massive data challenges of sensors, mobile devices, and more, and are developing software to analyze and manage business-to-consumer relationships or government-citizen engagements. By working with the private sector, especially companies that deliver consumer-facing applications and interfaces, government can better “predict” how humans will interact with analytics tools, visualizations, and systems. Granted, there isn’t a crystal ball here, but how many times have we seen a “high-tech” system deployed without the functionality of even a simple smartphone?
New: On the other hand, what does the federal government get right when it comes to data?
Helmig: Collection. The U.S. government has collected a mind-boggling amount of data. And I don’t just mean the Internal Revenue Service, Census Bureau, Social Security Administration, or Medicaid, which all collect terabytes of data. For decades, our government has been collecting imagery, photography, and video from space and other sources. Sensors from military equipment create an enormous amount of data. Newer sources, like social media and the Internet, only add to the massive pile.
Our government has invested so much in storing this data and, in many cases, does make good use of it. There are also plans to improve the use of data and reduce future costs. For example, the Census Bureau is using analytics to improve its understanding of prior censuses in order to drive down the cost of verifying data going forward. The Census Bureau is creatively designing the 2020 Census to be the most automated, modern, and dynamic decennial census in history, and is even leveraging administrative data from other agencies to ensure a more accurate count.
In addition, the Department of Commerce is crowdsourcing business development by publishing more government data. Innovative citizens can creatively merge data sources to generate new work products, thereby creating new businesses.
The data is there to do amazing things. But first, we must determine which data is important and discard what isn’t. The cloud isn’t an infinite repository. By only analyzing relevant data, agencies can save storage space and speed analyses. However, some argue that all data has the possibility of becoming useful. As problems change, we look to use historical data in novel ways. I agree with this, but that doesn’t mean we need instant recall on all data, all the time. Prioritizing data can add to storage savings.
I think the government can lead the way in this effort. The government, like many industries, is a data hoarder. So let’s capitalize on what’s been collected, while homing in on what’s truly useful.
New: I’d like to bookend this with another question about trends: What’s the “next big thing” in federal analytics?
Helmig: Hopefully we can shore up some of the non-technical aspects of analytics operations. When it comes to procurement, what if the government used more of a total-cost-of-ownership approach? Or what if it considered the impact of more accurate analysis on operations? For example, when an analytical or investigative organization is considering software or tools, why not look at analytics operations from a full ecosystem perspective? If an agency spends a little more money on a faster search or query tool that saves analysts’ time and enables more queries, then perhaps that additional cost is offset by the time saved on each query.
Similarly, analysts spend enormous amounts of time prepping data or reducing false positives, which translates into higher labor costs or missed opportunities. The labor savings from more robust data management and analytics capabilities should be considered from a cost-benefit perspective and should influence the total-cost-of-ownership view.
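The offset described above can be sketched as a simple break-even calculation. This is only a minimal illustration under stated assumptions, not a method described in the interview: the function names are hypothetical, and every figure (the extra tool cost, the minutes saved per query, the hourly labor cost, and the query volume) is invented for the example.

```python
def annual_labor_savings(queries_per_year, minutes_saved_per_query, hourly_labor_cost):
    """Dollar value of analyst time saved in a year by a faster query tool."""
    hours_saved = queries_per_year * minutes_saved_per_query / 60
    return hours_saved * hourly_labor_cost


def net_annual_cost(extra_tool_cost, queries_per_year,
                    minutes_saved_per_query, hourly_labor_cost):
    """Extra tool cost minus labor savings; a negative value means the tool pays for itself."""
    savings = annual_labor_savings(
        queries_per_year, minutes_saved_per_query, hourly_labor_cost
    )
    return extra_tool_cost - savings


def break_even_queries(extra_tool_cost, minutes_saved_per_query, hourly_labor_cost):
    """Number of queries per year at which the time saved offsets the extra cost."""
    savings_per_query = minutes_saved_per_query / 60 * hourly_labor_cost
    return extra_tool_cost / savings_per_query


if __name__ == "__main__":
    # All inputs below are hypothetical, chosen only to show the arithmetic.
    extra_cost = 50_000   # additional annual cost of the faster tool, in dollars
    minutes_saved = 2     # analyst minutes saved on each query
    hourly_cost = 60      # loaded analyst labor cost, in dollars per hour
    volume = 40_000       # queries run per year across the team

    print(f"Break-even volume: {break_even_queries(extra_cost, minutes_saved, hourly_cost):,.0f} queries")
    print(f"Net annual cost:   {net_annual_cost(extra_cost, volume, minutes_saved, hourly_cost):,.0f} dollars")
```

A fuller total-cost-of-ownership view would add terms that are harder to price, such as data preparation time, false-positive review, and missed opportunities, but the structure of the comparison stays the same.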
I also mentioned that public-private partnerships should be more prevalent. I am constantly dovetailing private sector best practices into the national security and homeland defense communities, and vice versa. There are many lessons learned and best practices to be shared between the two communities. Practitioner-level engagement on both sides would include non-technical topics, such as hiring and talent development, the proper governance of data collection and use, and forecasting the impact of future regulations or innovation.
As you mentioned, with the growth of open source, public-private partnerships will help agencies determine the best combination of open source and commercial technologies to pilot new projects, while ensuring that they can be fully scaled across the enterprise. These collaborations will become ever more critical to the success of government analytics projects.