10 Bits: The Data News Hot List
This week’s list of data news highlights covers February 8-14 and includes articles on the New York Times’ first Chief Data Scientist and the profusion of sensing technologies being deployed at the 2014 Winter Olympics.
The New York Times has hired its first “chief data scientist” to help the publication maintain and expand its subscriber base. Chris Wiggins, the Columbia University applied mathematician who will fill the role, is tasked with mining the Times’ considerable stores of user data and creating predictive models to better understand which behaviors indicate that a user might soon subscribe or unsubscribe. The Times’ new position follows a similar one established at News Corp in October 2013.
Sensing technology is ubiquitous at the 2014 Winter Olympics in Sochi, Russia. One device, created by watchmaker and timekeeping company Omega, has been installed in all of the competition’s bobsleds, transmitting data on speed, acceleration, G-force, and vertical track position to Olympic databases in real time. Olympic organizers have almost fully automated the recording of competition results. The rise of sensors, along with increased mobile device use, has meant that organizers have had to bargain for a much higher-bandwidth environment than that of the 2010 games.
The U.S. Department of Transportation is moving ahead with its Connected Vehicle project, which aims to improve highway safety by allowing vehicles to exchange data in real time, but it has also begun researching an abundant source of in-vehicle sensors: smartphones. Crowdsourcing smartphone data could mitigate the expense of installing sensors in millions of vehicles, since the devices already carry accelerometers, GPS software, magnetic compasses, and other sensors. Signals from these sensors could be tapped and transmitted via Bluetooth to ease traffic and prevent accidents.
Insider threat detection for government agencies working with sensitive information can be addressed as a data problem. Predictive models have helped the Department of Defense determine the factors—such as a change in marital status or a trip abroad—that are the strongest indicators of possible threats. In addition, National Intelligence Director James Clapper recently described plans to put more intelligence community communications into a centralized cloud environment where data and individuals could be tagged to keep better track of interactions that could indicate an insider threat.
The French government recently released a new version of Data.gouv.fr, the nation’s open data portal. The new site encourages contributions by showing those who upload data how others are using it. The site is not limited to government data: anyone, whether or not they work for the government, can contribute a data set.
Geospatial systems corporation Esri has announced that it will soon enable government users of its widely used ArcGIS mapping software to release their data openly without additional cost. Esri’s software, which stores geospatial data in a proprietary file format, has presented obstacles to data openness in the past. The new feature will offer dedicated hosting and conversion to a variety of open formats.
The New York State Supreme Court ruled last week that the state’s Education Department was within its rights to submit identifiable student data to a third party to assist in creating a statewide database. The ruling came after a group of parents sued the department, objecting to the state’s participation in inBloom, a Gates Foundation-funded education data consolidation nonprofit. The court ruled that the data transfer in question was made with a legitimate purpose.
Researchers working with the U.S. Intelligence Advanced Research Projects Agency have demonstrated the ability to predict disease outbreaks in Latin America by tracking social media and web searches. The goal of the program is to inform U.S. policymakers about major events with enough lead time to make a difference with interventions.
Large-scale data analysis has yielded lifesaving and cost-cutting insights in a broad range of fields, but some topics are just beginning to be addressed with “big data” technologies. One such area is human resources. Boston-based startup Evolv has drawn a number of interesting conclusions from its millions of pieces of human resources data, including the finding that the choice of “nonstandard” browsers like Firefox and Chrome is a powerful predictor of employee performance, and that employees with criminal backgrounds are 1 to 1.5 percent more productive than people without criminal records. Evolv reasons that companies could save $10 million per year by hiring more people with criminal records.
This week, the Facebook Data Science Team released the results of some Valentine’s Day-themed research that looked at age differences in opposite-sex relationships. The researchers found that, in these relationships, male partners are an average of 2.4 years older than their female partners, and that males are older in 67 percent of opposite-sex relationships. Arab countries tend to have larger age gaps, while Australia was the country closest to age parity.
Photo: The U.S. Army