10 Bits: The Data News Hot List
This week’s list of data news highlights covers November 30-December 6 and includes articles on a new White House open government plan and a paper that models Hollywood films’ “zombie apocalypses” using epidemiological methods.
The White House released its second “Open Government National Action Plan” this week. The first, released in 2011, set goals for a variety of transparency and open data initiatives, and the new plan announces 23 new or expanded commitments to advance the efforts further. The new initiatives include improvements to the White House’s “We the People” online petition platform, modernization of the Freedom of Information Act and a pilot program for participatory budgeting.
A new study used data from horror movies to model the portrayal of zombie infections, finding that there were two distinct types of such epidemics. In one “strain,” all people who die become zombies, while in the other, only individuals who come into contact with a zombie succumb. These two types produce starkly different epidemiological dynamics, and the paper’s authors contend that their work sheds light on real-life disease models, even if their research subject is fictional.
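As a rough illustration of why the two “strains” behave so differently, the sketch below steps a toy population through time under each rule. It is not the paper’s model: the function name `simulate`, the three compartments and every rate value are assumptions chosen only for illustration.

```python
def simulate(all_dead_turn, steps=100, humans=1000.0, zombies=0.0,
             bite_rate=0.5, death_rate=0.01, kill_rate=0.3):
    """Toy discrete-time zombie outbreak (all parameter values are assumed).

    all_dead_turn=True  -> everyone who dies becomes a zombie.
    all_dead_turn=False -> only people bitten by a zombie turn.
    Returns a list of (humans, zombies, removed) tuples, one per step.
    """
    s, z, r = humans, zombies, 0.0
    n = humans + zombies  # total population, conserved across compartments
    history = []
    for _ in range(steps):
        contacts = s * z / n                     # human-zombie encounters
        bitten = bite_rate * contacts            # humans turned by a bite
        died = death_rate * s                    # deaths unrelated to zombies
        destroyed = min(z, kill_rate * contacts) # zombies destroyed by humans
        s -= bitten + died
        if all_dead_turn:
            z += bitten + died - destroyed       # every death feeds the horde
        else:
            z += bitten - destroyed              # only bites create zombies
            r += died                            # other deaths stay dead
        r += destroyed
        history.append((s, z, r))
    return history


if __name__ == "__main__":
    # Start with no zombies at all to expose the difference between the rules.
    for rule in (True, False):
        s, z, r = simulate(all_dead_turn=rule)[-1]
        print(f"all_dead_turn={rule}: humans={s:.1f} zombies={z:.1f} removed={r:.1f}")
```

Starting from zero zombies, the contact-only rule never produces an outbreak, while the all-dead-turn rule seeds one from ordinary deaths, which is one simple way the two dynamics can diverge.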
Provisions in the Health Insurance Portability and Accountability Act (HIPAA) make it excessively difficult to use data analytics to improve patient care, according to a report published this week by the Bipartisan Policy Center. Despite the growth in collection of health data, most of it is not being used to improve health outcomes or reduce costs. In addition, existing privacy rules affect how patient data is collected, resulting in biased datasets.
Two startups have recently entered the space of snow sports data mining. Liftopia, founded in 2006 as an online lift ticket seller, now offers analytics and dynamic pricing to ski resorts. AlpineRelay, launched in 2011, offers a smartphone app that tracks skiers’ speed, distance, airtime and other metrics. AlpineRelay will soon begin offering a mobile sensing unit that attaches to skis or snowboards to give even more granular readings. The ski resort industry has so far been slow to adopt these technologies, but both companies have secured venture funding that may help bring the resorts around.
Global pharmaceutical company Pfizer announced this week that it will broaden access to its clinical trial data to include independent researchers and patients who take part in its studies. The company, which joins British firm GlaxoSmithKline in recent data-sharing efforts, will provide anonymized information on its trial subjects, as well as offer individual patients access to their own data.
Data scientists working in areas such as sports analytics have developed sophisticated metrics to rank players, but the same tactics do not work as well when applied to social influence. Historical importance rankings, which are typically qualitative, do not lend themselves to easy quantification, not least because few can agree on objective measures of historical importance. Nevertheless, several efforts have recently emerged to quantify influence and distinguish it from mere fame; one applies Google’s famous PageRank algorithm to links in the English-language Wikipedia.
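For readers unfamiliar with PageRank, the following is a minimal sketch of the basic algorithm using power iteration. The four-page link graph is invented purely for illustration and is not drawn from Wikipedia or from the effort described above; the function name `pagerank` and the damping value of 0.85 are likewise assumptions.

```python
def pagerank(links, damping=0.85, iterations=100):
    """Basic PageRank by power iteration.

    links maps each page to the list of pages it links to.
    Returns a dict mapping each page to its score; scores sum to 1.
    """
    pages = sorted(set(links) | {t for targets in links.values() for t in targets})
    n = len(pages)
    rank = {p: 1.0 / n for p in pages}
    for _ in range(iterations):
        # Every page gets a small baseline share (the "random jump").
        new_rank = {p: (1.0 - damping) / n for p in pages}
        for p in pages:
            targets = links.get(p, [])
            if targets:
                # Split this page's score evenly among the pages it links to.
                share = damping * rank[p] / len(targets)
                for t in targets:
                    new_rank[t] += share
            else:
                # A page with no outgoing links spreads its score to everyone.
                share = damping * rank[p] / n
                for q in pages:
                    new_rank[q] += share
        rank = new_rank
    return rank


if __name__ == "__main__":
    # Hypothetical toy graph: three pages link to C, and C links back to A.
    toy_links = {"A": ["C"], "B": ["C"], "C": ["A"], "D": ["C"]}
    for page, score in sorted(pagerank(toy_links).items(), key=lambda kv: -kv[1]):
        print(page, round(score, 3))
```

In this toy graph the page with the most incoming links ends up with the highest score, and the page it links to inherits much of that score, which is the sense in which PageRank measures influence rather than a raw count of mentions.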
Email marketing utility MailChimp is known for offering users a variety of analytics to track their campaigns, but it may be less apparent that the company relies on data science internally as well. At a conference this week, a MailChimp data scientist recounted a story from within the company: resources were being wasted managing employees’ schedules, until he endeavored to treat the schedules as a mathematical optimization problem, ultimately finding an “optimal schedule” for the work. The company has also leveraged its enormous email address database to help identify and weed out spammers.
Transportation startup Lyft offers an app that lets users order rides from strangers, and it is banking on data analytics to ensure that it can do the job as quickly as possible. Using traffic and map data, Lyft’s data scientists can determine exactly how long it will take a driver to reach a given destination, which is useful in giving customers estimated trip times. Data science also helps the company choose among multiple available cars near the user: Lyft analyzes the cars’ positions and picks the one that is least likely to get called by another user.
The Knight Commission on Intercollegiate Athletics released an interactive database this week that compares spending on athletics with academic spending among public Division I institutions. The commission hopes the database will aid university administrators and policymakers in balancing academic and athletic spending. The underlying data is open and freely available for download.
Behavioral data startup RevolutionCredit offers a sort of “traffic school” for money management, in which individuals can watch educational videos and take short quizzes. Their commitment to these activities influences an algorithm that estimates their credit score. The company’s founder doesn’t see the tool as a replacement for traditional credit scores, but merely as a booster, one that could reward individuals who might not look so good on paper but who are willing to inconvenience themselves in the short term for a better long-term financial position.
Photo: Flickr user Rodolpho Reis