This week’s list of data news highlights covers July 5-11 and includes articles about a startup hoping to help doctors customize care based on patients’ genomes and a Rio de Janeiro effort to use mobile app data to improve urban traffic.
The Centers for Medicare and Medicaid Services certified its first “qualified entity” to have full access to national Medicare claims data. The Health Care Cost Institute (HCCI), which was approved this week under the Affordable Care Act, will be able to combine claims data with other payer data in order to conduct performance evaluations of health care providers and suppliers. Several major insurers have agreed to provide their cost information to HCCI as well. The combined Medicare data set, which contains information on over 100 million individuals, will be available under tight access controls, and HCCI eventually plans to provide a public portal with accurate cost information that has been cleansed of any personally identifiable information.
Former Merrill Lynch banker Noah Bulkin is using data to help turn around hundreds of struggling British pubs. Although about 10,000 such outlets have closed in the past decade, Bulkin believes the trend is not irreversible and can be bucked with granular data on prices, sales fluctuations, and customer drink preferences. Colleagues have built analytical models to predict potential sales and earnings for each of the 363 pubs Bulkin has purchased this year. These models have already shown that adding food, repairing old pool tables, and widening beer selection can have a significant effect on sales.
Solar startup Sungevity is using laser beam data to help remotely determine what prices to offer its customers for solar panels. The company purchases the data, which comes from a radar-like device that uses laser beams instead of radio waves, from a third party and uses it to create a 3D model of properties and determine the optimal ways to place panels. “Soft costs” (that is, everything that is not hardware) account for about half the cost of solar systems, so Sungevity hopes that using this data to generate installation quotes will be cheaper than home visits.
This week, IBM released a 10-year plan for a project to improve China’s national energy systems to reduce air pollution. The company will begin the project by partnering with the government of Beijing, China’s most polluted city. The project, called “Green Horizon,” will integrate large-scale data analysis and connected sensors to model pollution levels, weather, and other environmental factors. In particular, IBM will use its artificial intelligence system Watson to predict pollutant quantities at the street level in Beijing 72 hours into the future.
Bay Area startup BaseHealth hopes to help doctors create personalized health plans based on their patients’ genomes. The company’s core product, Genophen, is a dashboard that integrates genomic data, personal health history, family history, and connected device data. In the future, the company plans to integrate electronic health records as well. Because BaseHealth provides its analysis to physicians first, and only gives patients access through their doctor, it steers clear of the Food and Drug Administration’s regulations governing how genomic data firms provide diagnosis information directly to patients.
Rio de Janeiro transportation officials are using data from the Waze and Moovit travel apps to help manage traffic. With automobile data from Waze and pedestrian data from Moovit, the city is trying to determine how people move through the city in as many modes as possible. The city hopes to soon integrate data from cycling app Strava to capture cyclists as well. The data will supplement road cameras and other traditional transportation department information with nearly 60,000 incidents reported each day. In return for sharing its data, Waze gets real-time sensor information from the city on highway usage.
Big data is becoming increasingly available to small businesses through software tools such as Desk.com’s customer feedback management system and mobile applications from database interface FileMaker, and operations from car washes to restaurants have reported benefitting from analytics. In 2013, 9.2 percent of small businesses were using business intelligence software, up from 1.7 percent in 2010. Part of the trend can be chalked up to the increasing number of software options, but growing smartphone and tablet penetration has helped enable many mobile applications as well.
San Diego transit officials are studying bus and trolley ridership to improve performance of the city’s Metropolitan Transit System. The city is working with Urban Insights, a big data services company that provides smart card and revenue management software, to predictively model transit flows using vehicle locations, payments, passenger counters, and route information. The city hopes to use the data to optimize routes, deploy police to curb fare beaters, and determine whether stops should offer more or fewer services.
Google has released findings on how to automatically spot online reputation builders, known as “crowdturfers.” These individuals, who promise customers that they can improve their social media reputations, undermine social networks’ ability to portray reputations accurately. The Google team built algorithmic models that could detect crowdturfing tasks in one dataset with an accuracy of over 97 percent. The researchers hope their work will one day help social networks like Twitter ban bots and fake followers.
10. Australia Cracks Down on Tax Cheats with Data
The Australian Taxation Office (ATO) has announced that it will step up its data mining initiative to target offshore tax evaders. To date, the program has claimed AU$13 million in tax liabilities. As part of the new efforts, the ATO will collect information from overseas tax authorities, information on fund flows between Australian and foreign banks, interest and account balances, and money transfers, among other information. The agency will also use information from sources such as eBay and international stock exchanges.