The Center for Data Innovation spoke with Adam Bonica, cofounder of Crowdpac, a non-partisan company that focuses on educating voters with objective assessments of political candidates, based in Palo Alto, California. Bonica discussed the value of analyzing data outside of a candidate’s voting record, as well as how politics has become more reliant on data.
Joshua New: Crowdpac calls itself the definitive resource for “objective data” on U.S. politicians. Why does this make Crowdpac differ from, say, a news program?
Adam Bonica: I think the operative word here is “data” rather than “information.” We’re trying to take a numerical approach when we analyze candidates. There’s been this big revolution in campaigning where candidates and parties have tons of data at their disposal that allows them to learn a ton about voters and their preferences, mobilize supporters, and target their messages. We’re trying to flip that equation around and put the huge amount of data on politicians in the hands of voters. Voters want to know what candidates are about and how they would behave in office if elected, and we want to give them this resource.
New: Scoring candidates based on how liberal or conservative their records are is not exactly new. What makes Crowdpac stand out?
Bonica: My dissertation advisors were actually pioneers in the field of statistically ranking politicians based on their ideologies, and even before that you had interest group ratings. These are all very useful, and we have a lot of this style of approach in our model. But there are a couple of big differences with Crowdpac. The first is you can only really rank someone like this after they’ve been in office. You could only evaluate someone’s political positions after they’ve already been voting for a while. So if you’re voting in a primary and you’re looking at four different candidates and all you have is a name on a ballot, those rankings don’t do you much good if these aren’t incumbents.
We’re also bringing in a lot more information than just voting records. Legislators are pretty wise to the idea that people are watching how they vote, but another useful source of information on their political preferences is the research on the supporters who cut these candidates checks. Campaign finance data is hugely valuable data that complements voting records in an interesting way, so we include that in our model.
New: Speech is the other big category of data Crowdpac uses in its models, in addition to money and voting. Can speech be subjective and nuanced? How do you reliably quantify political rhetoric for these models?
Bonica: For speech, we actually focus on topic modeling. We look at all the legislation that has passed through Congress or state legislatures that’s coded by the Congressional Research Service based on topic. With this information, we can determine if a bill is, for example, 90 percent about guns and 10 percent about crime. And for candidates, we can determine things like if they talk about immigration substantially more than their peers. This helps us get an idea of what a candidate’s priorities are based on the things they say and the text they produce.
It also helps us a little bit in ranking people from left to right on the political spectrum. For example, if you say “undocumented worker” rather than “illegal immigrant,” there’s some data in those phrases that could indicate political leaning.
Someday we hope to have a high-powered natural language processing framework, but that’s probably not in the works for quite a while.
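To make the topic-share idea from this answer concrete, here is a minimal sketch in Python. It is not Crowdpac's model: the keyword lists in `TOPIC_TERMS` and the functions `topic_shares` and `emphasis_vs_peers` are hypothetical, and simple keyword counting stands in for a real topic model trained on coded legislation. It only illustrates how a text can be described as a mix of topics and how a candidate's emphasis can be compared with that of their peers.

```python
from collections import Counter

# Hypothetical keyword lists for illustration only. A real topic model would
# learn topic vocabularies from legislation that has already been coded by topic.
TOPIC_TERMS = {
    "guns": {"gun", "firearm", "rifle"},
    "crime": {"crime", "felony", "sentencing"},
    "immigration": {"immigration", "border", "visa"},
}


def topic_shares(text):
    """Return each topic's fraction of all topic-keyword hits in the text."""
    counts = Counter()
    for word in text.lower().split():
        word = word.strip(".,;:!?\"'")
        for topic, terms in TOPIC_TERMS.items():
            if word in terms:
                counts[topic] += 1
    total = sum(counts.values())
    # If no keywords matched at all, every topic gets a share of 0.0.
    return {t: (counts[t] / total if total else 0.0) for t in TOPIC_TERMS}


def emphasis_vs_peers(candidate_text, peer_texts, topic):
    """Ratio of the candidate's share on a topic to the peers' mean share."""
    own = topic_shares(candidate_text)[topic]
    peer_mean = sum(topic_shares(p)[topic] for p in peer_texts) / len(peer_texts)
    # When peers never mention the topic, the ratio is undefined; report infinity.
    return own / peer_mean if peer_mean else float("inf")
```

Under these assumptions, a text with nine "guns" keyword hits and one "crime" hit would come out as 90 percent guns and 10 percent crime, and a ratio well above 1 from `emphasis_vs_peers` would suggest a candidate talks about a topic more than their peers do.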
New: Are there any candidates that you have analyzed that have traditionally presented themselves as liberal or conservative, but that your models show otherwise?
Bonica: I’m probably slightly biased because I’m more focused on what’s presented in the data, rather than how candidates present themselves. But one good example that came out of the 2014 Senate elections was Greg Orman, who ran in Kansas. He portrayed himself as an independent who could have gone either way—Republican or Democrat—but all of his supporters and all of his past donation records pointed to him being quite liberal. We never got to observe him in Congress so maybe he would have done some pandering, but the best information you could get about how he was going to act indicated that he probably would have voted along party lines for all the major issues.
With Angus King, we saw the same thing: people were wondering if he was going to caucus with Republicans or Democrats. But when you looked at his personal donations it was pretty clear he was going to go with the Democrats. So there are these cases where independents are running and you can get a pretty good idea of where they’re going to end up.
New: In addition to helping run Crowdpac, you are an assistant professor of political science at Stanford University. Has there been any substantial shift in the role objective data plays in politics in the past several elections? Has it become more valued? Less?
Bonica: I think people are definitely paying more attention to data than they have been before, and it’s coalescing in journalism. It’s the 538 effect: people have a lot more interest in what the data can say about certain stories. In politics in particular, there are a lot of different perspectives that are spoken pretty convincingly, and it’s useful to have data that can objectively communicate what’s going on. I think this has been happening for a long time, but it has accelerated in recent years because there’s so much more data and so many more tools to analyze it all.
More generally, data has become a really big deal for campaigns. They are very interested in how they can get data from potential supporters, and this trend started back with Karl Rove when they started to do direct mail. Computers and data make direct mail strategies dramatically more efficient than they’ve been in the past. Now you see books written about Obama’s tech team and how they did such a good job using this data.
I don’t think there’s been quite as much effort in the other direction, in terms of voters getting data from candidates. We have a very robust sunshine and disclosure regime in the United States where a lot of what candidates do is put on public record with the intention of being useful to voters. But if you give a voter a list of donors, they probably can’t get anything out of it unless they’re super in tune with politics, and even then it’s pretty hard to decipher. The hope is that making this data accessible and digestible in an objective way will encourage voters to look at this data a little more deeply. That’s where we’re trying to push things. There’s a ton of great work happening on the disclosure side, and we want to take it that extra step. The data should be just as useful to voters as it is to politicians.