10 Bits: The Data News Hot List
This week’s data news hot list spans stories from data standardization efforts in “smart” airports to the digital music industry’s efforts to streamline data-driven accounting.
The United Nations’ Global Pulse initiative focuses on harnessing data and technology in new ways for economic development and humanitarian aid. Applying analytics techniques and technologies developed for online advertising, Global Pulse aims to bring real-time monitoring and prediction to development programs. Through content-aware text mining, the initiative has used Twitter data as a proxy for economic indicators such as poverty.
Sydney Airport, Australia’s largest, has experimented with using data to model passenger flows and plan new services. Using IBM tools, the airport is striving to move away from the Microsoft Excel-based data analysis methods of its past and leverage modern data science to improve its business. The airport, which once had “passenger” data collected using seven separate definitions, has had to encourage data standardization efforts to enable advanced analysis, but has already seen a sevenfold improvement in the speed of standard operations reporting.
Hospitals and health insurers are using large datasets to improve care, analyze the effectiveness of different treatments and reduce hospital readmissions. These initiatives are being spurred largely by Medicare, which will begin to impose penalties on hospitals that don’t improve outcomes in the areas of electronic health record management, hospital readmission rates and hospital-acquired conditions.
State Farm Mutual Automobile Insurance Co.’s “Drive Safe & Save” program lets drivers opt into providing the company with data streams of driving variables such as speed, miles driven, acceleration and hardness of turns. In return, drivers who exhibit safe behaviors are eligible for lower insurance rates; those who don’t drive as safely are not penalized. “Usage-based insurance,” as it has been dubbed, could enable insurance companies to conduct much more granular analysis for customizing their rates than traditional demographic data allows.
The Chinese government is working with IBM to develop a hyper-granular wind-prediction technology called Hybrid Renewable Energy Forecasting (HyRef). The technology relies on wind turbine sensor data paired with weather data and analyzed on small supercomputers, and its forecasts can be updated every fifteen minutes. The output of wind farms varies widely and has been difficult to forecast accurately, leading to inefficiencies in the sale of the resulting power.
Civilian police in the UK have begun deploying crime investigation software first developed for the Royal Military Police to process data on abuse by British soldiers in Iraq. The software, which has been applied in child abuse, hate crime and computer hacking cases, is reported to have cut the cost of investigations significantly. Because illicit material recurs across cases such as those involving child pornography, the software can often recognize known materials automatically, without even needing sophisticated analysis.
The Obama administration’s efforts to sign up millions of uninsured Americans have found sophisticated data-driven support in the work of data scientist Matt Saniie. Saniie, who worked in the “Data Cave” during President Obama’s 2012 campaign, has helped develop a model that predicts whether an individual has insurance with 99% accuracy. In hopes of getting 2.7 million healthy 18- to 34-year-olds to sign up (enough to offset the costs of the predicted enrollment by people with pre-existing conditions), the administration is relying on an approach driven by quantitative analysis and social media.
The Australian government’s Department of Finance and Deregulation has released a new report on “big data strategy,” in which it details a variety of best practices for other agencies in devising and deploying data science initiatives. The report is intended to help agencies make better use of their data assets, while making prominent commitments to preserving the privacy of individuals represented in the data. The report strives to establish the Australian government as “a world leader in the use of big data analytics to drive efficiency… in the public sector.”
A new study shows that crowdsourced non-experts performed as well as experts in a remote sensing effort to identify land usage patterns. Researchers at the International Institute for Applied Systems Analysis led a team of citizen scientists in examining satellite data to document areas where people live and farm; in particular, non-experts were as good as experts in identifying areas exhibiting “human impact.” Experts still had an edge in identifying specific land-cover types.
Digital music distributor Rebeat Digital has endeavored to apply data science to the business of royalty accounting, which suffers from inefficiencies brought about by the intersection of massive datasets with legacy software and a lack of data standardization. The company’s latest offering, Rebeat Business Edition, proposes to streamline the data munging part of the accounting process, thereby saving record companies considerable time and money.