The Center for Data Innovation spoke with Jonathan Marks, cofounder of Quorum, an online legislative platform based in Washington, DC. Marks discussed how the Quorum platform automatically pulls information from a variety of nontraditional data sources to develop new political insights and explained some of the challenges of analyzing legislative data that can differ substantially from state to state.
Joshua New: Quorum’s tagline is “data driven politics.” Could you discuss some of the aspects of the Quorum platform that demonstrate this?
Jonathan Marks: The original idea behind Quorum was to make it easy for legislative professionals to use quantitative insights to help inform their decision-making about members of Congress. We started out with statistics such as which members work together frequently, which work across the aisle most often, and which are most active or effective in a given issue area.
Since our initial product launched this January, however, it has grown far beyond those initial analytics into a comprehensive platform that enables integrated legislative tracking, targeting, outreach, and project management. We continue to be a data-focused company, whether we are building tools to help users identify which members are talking about each other the most or creating interfaces that help our users identify which offices they have the strongest relationships with. Helping our clients make better decisions informed by data remains our main priority.
New: Quorum is a new entrant into the pretty well established field of political tracking and analysis, and services like Bloomberg Government and CQ Roll Call are widely used in the lobbying and public policy worlds. How does Quorum stand out?
Marks: Quorum is the next generation of these services. We make our product easy to search and use, and we include a lot of valuable data sources, such as Census Bureau data, tweets, Facebook posts, and press releases. Most importantly, rather than just presenting users with the data, our algorithms provide quantitative insights that help them figure out what the data actually means.
New: You have credited Quorum’s initial success, in part, to the use of Amazon Web Services (AWS), a cloud services provider. Why was this so important?
Marks: AWS has been a large part of why we were able to get up and running as quickly as we did. Through a partnership with the Harvard Innovation Lab (a Harvard University initiative to encourage entrepreneurship), AWS provided us with the server space and resources that allowed us to devote time to building a fully functional product without having to worry about raising money to cover huge development costs from the start. Had we wanted to build Quorum 10 years ago, we would have had to spend hundreds of thousands or even millions of dollars in server fees before we could even begin developing the platform.
New: Quorum bills itself as the world’s most comprehensive database of legislative information. Where does all this data come from? Are there any less traditional data sources that you find valuable, but that your users might not expect?
Marks: Much of the data in Quorum comes from publicly available sources. Anyone is able to go on a member’s website and find their press releases, for example. What we did was build scraping algorithms to continuously and automatically pull all of the bills, votes, press releases, “Dear Colleague” letters, tweets, Facebook posts, floor statements, and more, from every member of Congress and every state legislator. We aggregate all of that information and run a series of analytics and natural language processing tools over it to provide valuable insights.
Many of our users have found the ability to search through tweets and Facebook posts from every member of Congress and state legislator incredibly useful because they can easily see how members are reacting to a hot-button issue. With this information, our users can then develop more targeted strategies.
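To make the search idea concrete, here is a minimal sketch of a keyword search over aggregated posts and press releases, grouped by member. It is not Quorum's implementation: the records, the field names (`member`, `source`, `text`), and the `search` function are all assumptions made up for illustration.

```python
from collections import defaultdict

# Hypothetical, made-up records standing in for aggregated public statements.
DOCUMENTS = [
    {"member": "Rep. A", "source": "tweet",
     "text": "Proud to support the water infrastructure bill."},
    {"member": "Sen. B", "source": "press_release",
     "text": "Statement on water quality standards."},
    {"member": "Rep. A", "source": "facebook",
     "text": "Town hall on health care tonight."},
]


def search(documents, keyword):
    """Return {member: [source, ...]} for documents whose text mentions keyword.

    Matching is a case-insensitive substring test, which is far cruder than
    real natural language processing but shows the shape of the lookup.
    """
    needle = keyword.lower()
    hits = defaultdict(list)
    for doc in documents:
        if needle in doc["text"].lower():
            hits[doc["member"]].append(doc["source"])
    return dict(hits)


if __name__ == "__main__":
    print(search(DOCUMENTS, "water"))
```

A production system would more likely use a full-text index than a linear scan, but the grouping by member is what lets a user see at a glance who is talking about an issue.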
New: Are there any data sources that you’d like to include in Quorum, but cannot because the data is not readily available or usable?
Marks: One thing we have noticed in our work is the complete lack of uniformity across state legislatures in terms of the type of data they collect and how they collect it. This made building out our states product challenging, because we had to build a separate system for each state.
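One common way to handle this kind of per-state variation is to write a small parser for each state that converts its records into a single shared schema. The sketch below assumes two invented states, "A" and "B", with invented field names and date formats; it illustrates the pattern and is not a description of how Quorum actually built its system.

```python
from dataclasses import dataclass
from datetime import date, datetime


@dataclass(frozen=True)
class Bill:
    """Shared schema that every state-specific parser must produce."""
    state: str
    bill_id: str
    title: str
    introduced: date


def parse_state_a(raw: dict) -> Bill:
    # Hypothetical state "A": ISO dates, fields named "number" and "title".
    return Bill("A", raw["number"], raw["title"].strip(),
                date.fromisoformat(raw["introduced"]))


def parse_state_b(raw: dict) -> Bill:
    # Hypothetical state "B": US-style dates and different field names.
    return Bill("B", raw["bill_no"], raw["short_title"].strip(),
                datetime.strptime(raw["filed_on"], "%m/%d/%Y").date())


# One entry per state; adding a state means adding one parser here.
PARSERS = {"A": parse_state_a, "B": parse_state_b}


def normalize(state: str, raw: dict) -> Bill:
    """Dispatch a raw record to the parser registered for its state."""
    try:
        parser = PARSERS[state]
    except KeyError:
        raise ValueError(f"no parser registered for state {state!r}") from None
    return parser(raw)


if __name__ == "__main__":
    print(normalize("B", {"bill_no": "SB 7", "short_title": "Water Act",
                          "filed_on": "03/02/2015"}))
```

Everything downstream (search, analytics, alerts) can then work against `Bill` alone, so the state-specific mess stays confined to the parsers.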
Additionally, it would be great if the federal government released the data it provides to Congress.gov, which houses legislative data, through an application programming interface (API). Without an API, third parties like us have to go out of our way to scrape websites for this data.
Another data source we would like to look into is data from the Federal Election Commission (FEC), but we have had trouble getting started because it has been hard to match donors to donations accurately. FEC data quality is so poor that we would have to build a series of machine learning-based matching tools to build even the most basic database of campaign finance information. We’re looking forward to seeing what happens as the new openFEC API, which aims to make this data more usable, matures.
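The donor-matching problem can be illustrated with a simple fuzzy comparison of names. The sketch below swaps in plain string similarity from Python's standard `difflib` module rather than the machine learning tools Marks mentions, and the record fields (`name`, `zip`), the 0.85 threshold, and the sample names are all assumptions chosen for illustration.

```python
import re
from difflib import SequenceMatcher


def normalize_name(name: str) -> str:
    """Lowercase, drop punctuation, and sort tokens so word order is ignored."""
    tokens = re.sub(r"[^a-z\s]", " ", name.lower()).split()
    return " ".join(sorted(tokens))


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1] between two names after normalization."""
    return SequenceMatcher(None, normalize_name(a), normalize_name(b)).ratio()


def same_donor(a: dict, b: dict, threshold: float = 0.85) -> bool:
    """Guess whether two donation records refer to the same person.

    Requires an exact ZIP code match plus a name similarity at or above
    the threshold. Both rules are illustrative assumptions.
    """
    return a["zip"] == b["zip"] and similarity(a["name"], b["name"]) >= threshold


if __name__ == "__main__":
    first = {"name": "Smith, John A.", "zip": "20001"}
    second = {"name": "John A Smith", "zip": "20001"}
    print(same_donor(first, second))
```

A rule this simple will still produce false matches (two different people with the same name in one ZIP code) and missed matches (nicknames, moves), which is the gap more sophisticated matching tools are meant to close.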