5 Q’s on Data Innovation with Hudson Hollister
Hudson Hollister is founder and executive director of the Data Transparency Coalition, a trade association that is advocating for policies that will require federal agencies to publish their data online using standardized, machine-readable, non-proprietary identifiers and markup languages. I asked Hudson to give me his take on how data transparency is unfolding in the federal government.
Castro: You’ve been leading the charge in the call for more open data in government. How does data transparency improve government?
Hollister: For government, data transparency means that public information is both published online and electronically standardized in a way that makes it searchable and useful. Data transparency allows citizens to track what their government is doing. Data transparency also allows a government to better manage itself. Since there are so many separate silos within any government, the best way to make sure that public information is available to all managers and staff who need it is simply to publish it.
Data transparency isn’t merely good for government. In a democracy, data transparency is an obligation. Public information should be recognized as a public resource. The taxpayers who paid for its creation and collection should have full access to it.
And data transparency is good for the tech industry. The members of the Data Transparency Coalition, led by Teradata Corporation, understand that when more government data is published and standardized, they’ll be able to use it for all sorts of new business opportunities.
We need data transparency for just about every type of information that any government generates (or requires someone else to report): spending data, management and performance reports, regulatory rules and filings, legislative actions, and judicial documents. And unfortunately the U.S. government does not deliver data transparency in any of those five overlapping areas.
Castro: Can you give some examples, either in the United States or in other countries, of where the government has used data successfully?
Hollister: We’re nowhere near true data transparency in the United States, but we do have some examples from innovative agencies that give us a hint of what’s possible.
In the spending area, we got our first taste of data transparency from the Recovery Accountability and Transparency Board, the temporary agency that was created to oversee the U.S. federal stimulus spending starting in 2009. The Recovery Board decided to take all the reports that grantees and contractors who received stimulus money were submitting to 28 separate federal agencies, put them in a standardized XML format, and publish them online for everyone to see on Recovery.gov. This gave both the public and the government a more complete, accurate, and searchable view of spending – albeit only stimulus spending, not all spending – than ever before. And both the public and the government made good use of it. Activists used Recovery.gov to find local examples of both successful stimulus projects and wasteful ones and used them to call for change. Inspectors general at all the agencies used Recovery.gov to deploy sophisticated data analysis tools to find fraud. They recovered $40 million from questionable grantees and contractors and they prevented an additional $30 million from being paid out in the first place.
In the regulatory area, one good example of data transparency comes from the Securities and Exchange Commission. In 2009, the SEC started requiring public companies to submit their financial statements in the XBRL format as well as in plain text. This means that every number has an individual electronic tag, making it possible for investors to track companies’ performance across time and against competitors’ without having to enter the data into their own systems or spreadsheets (or pay someone else to do that). The SEC’s system isn’t anywhere near complete. It only applies to financial statements and doesn’t cover the other information that companies submit. But it’s a great start, and tech start-ups are inventing software that uses this data to make financial analysis faster, better, and cheaper.
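Editor's illustration: the point about individually tagged numbers can be sketched in a few lines of Python. This is only a minimal sketch under stated assumptions. The company names, figures, and the flat `<filing>` layout with a `period` attribute are invented for illustration, and real XBRL filings use namespaced taxonomies, contexts, and units that are left out here.

```python
import xml.etree.ElementTree as ET

# Hypothetical, heavily simplified XBRL-style fragments. Real SEC filings
# use namespaced taxonomy tags, contexts, and units that are omitted here.
FILINGS = {
    "ExampleCo": """<filing>
        <Revenues period="2011">1200</Revenues>
        <Revenues period="2012">1500</Revenues>
        <NetIncome period="2012">150</NetIncome>
    </filing>""",
    "SampleCorp": """<filing>
        <Revenues period="2011">900</Revenues>
        <Revenues period="2012">990</Revenues>
        <NetIncome period="2012">110</NetIncome>
    </filing>""",
}


def tagged_values(xml_text, tag):
    """Return {period: value} for every element carrying the given tag."""
    root = ET.fromstring(xml_text)
    return {el.get("period"): float(el.text) for el in root.iter(tag)}


def revenue_growth(xml_text, start, end):
    """Fractional revenue growth between two reporting periods."""
    revenues = tagged_values(xml_text, "Revenues")
    return (revenues[end] - revenues[start]) / revenues[start]


# Because every number is tagged the same way in every filing, the same
# function works across companies with no manual re-keying of data.
growth_by_company = {
    name: revenue_growth(xml_text, "2011", "2012")
    for name, xml_text in FILINGS.items()
}

for name, growth in growth_by_company.items():
    print(f"{name}: {growth:.1%}")
```

The design point is that the comparison logic never needs to know which company it is reading; the shared tags do that work.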
Castro: Both the House and the Senate have introduced versions of the DATA Act. What would this legislation do?
Hollister: The Digital Accountability and Transparency Act, or DATA Act, would essentially expand the Recovery Board’s approach to all U.S. federal spending. This proposal would require the executive branch to publish its budget actions, grants and contracts, and disbursements on one website. It would also require standardized identifiers and markup languages to make this information searchable and machine-readable.
All this information is already being reported and collected. But it’s managed by four different agencies; some of the systems are public and others are not; and nobody has even tried to come up with common data identifiers or formats.
The House of Representatives passed the DATA Act – unanimously! – last April. Then, last September, it was introduced in the Senate by a Democrat and a Republican. But there wasn’t time for it to go through committee in the Senate, so the bill died when the Congressional session ended. The Data Transparency Coalition is campaigning for the re-introduction of the DATA Act in the new 113th Congress.
The DATA Act would help the U.S. government move toward data transparency for spending. We are hoping to pursue similar proposals in the other four areas as well.
Castro: It seems like some steps are already being taken to create a more open government. Why is federal data transparency legislation necessary?
Hollister: Open government in the United States got a lot of attention when the Obama Administration announced in 2009 that agencies would be required to publish “high-value data sets” in machine-readable formats. The administration has indeed put a good deal of effort into building the electronic infrastructure that will be needed to publish standardized government data.
But despite all that attention, the most important data sets in all five of the areas I mentioned – spending, management/performance, regulation, legislation, judicial – are no more transparent than they were before.
Let’s take a look at spending data, for example. The flagship U.S. government spending website, USASpending.gov, provides nothing like real data transparency. First, USASpending.gov is incomplete – it only shows grants and contracts while ignoring internal expenditures. Plus it doesn’t show individual payments, just total amounts. Second, USASpending.gov isn’t fully searchable. There’s no way to view all the contracts that a particular company received, because without reliable identifiers the same company might have several different listings in the system. Third, its data is inaccurate because, without standardization, there’s no way to check the data against other systems for quality. The DATA Act would transform USASpending.gov into a complete, fully searchable, and reliable portal by applying the Recovery Board’s approach: publish everything, not just summaries or selections; and standardize it.
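Editor's illustration: the identifier problem described above can be shown with a small Python sketch. Everything in it is hypothetical, including the vendor names, the `vendor_id` field, and the amounts; it does not reflect the actual USASpending.gov schema.

```python
from collections import defaultdict

# Hypothetical contract records. Names, IDs, and amounts are invented and
# do not come from any real spending system.
CONTRACTS = [
    {"vendor_name": "Acme Corp", "vendor_id": "V-001", "amount": 500_000},
    {"vendor_name": "ACME Corporation", "vendor_id": "V-001", "amount": 250_000},
    {"vendor_name": "Acme Corp.", "vendor_id": "V-001", "amount": 100_000},
    {"vendor_name": "Widget LLC", "vendor_id": "V-002", "amount": 75_000},
]


def total_by(records, key):
    """Sum contract amounts, grouping records by the given field."""
    totals = defaultdict(int)
    for record in records:
        totals[record[key]] += record["amount"]
    return dict(totals)


# Grouping by free-text name splits one company into several listings.
totals_by_name = total_by(CONTRACTS, "vendor_name")

# Grouping by a standardized identifier recovers the full picture.
totals_by_id = total_by(CONTRACTS, "vendor_id")

print(len(totals_by_name), "listings when grouped by name")
print(totals_by_id)
```

With only free-text names, the same company appears as three separate listings; with a shared identifier, its contracts roll up into one total.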
In the other four areas, the situation is the same or worse. Why is this? Because the most important data sets – the ones that show what the government is doing and what regulated entities are doing – are usually managed by more than one agency, or by more than one office within an agency.
There is only one way to get multiple agencies and offices to move toward data transparency, and that’s a legislative mandate.
As everyone from Washington Post columnist Dana Milbank to the Government Accountability Office to the American Institute of CPAs to the former chairman of the Recovery Board has said, we need legislation to achieve transparency in federal spending. We need to pass the DATA Act.
And eventually, we’ll probably need similar mandates for other types of federal data.
Castro: State and local governments also produce a lot of data. What advice would you offer state and local government leaders who are thinking about data transparency?
Hollister: Eventually, I hope our Coalition will have the resources to work with state legislatures and agencies the way we’re already engaging Congress and executive branch leaders.
But many state and local governments are already making great strides in data transparency. I’d just encourage them to pursue both principles – publish everything, and also standardize it – and to recognize that legal mandates are sometimes necessary to achieve cross-agency and cross-office cooperation.