10 Bits: The Data News Hotlist
This week’s list of data news highlights covers July 27-August 2 and includes articles on using machine learning to fight gambling addiction and a data-driven football training regimen.
Featurespace, a UK firm based on a University of Cambridge engineering project, is using online gambling website data and machine learning techniques to identify the onset of gambling addiction. In addition to its social costs, gambling addiction has commercial costs, according to the founders; a long-term customer with healthy habits is preferable to a customer who might be spending a lot in a short period of time only to withdraw when they acknowledge they have a problem.
The New York City Fire Department (FDNY) has implemented the Risk Based Inspection System, which mines information from various city agency databases to prioritize building inspections. The effort, an offshoot of work done by the Mayor’s Office of Data Analytics, uses criteria such as building material, type of building, height, age and occupancy to determine which sites are at highest risk.
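To make the idea of criteria-based prioritization concrete, here is a minimal sketch of how a risk score could rank buildings for inspection. It is not FDNY's actual model: the `Building` fields, the lookup tables, the scaling constants and the sample buildings are all hypothetical, chosen only to mirror the criteria named above.

```python
from dataclasses import dataclass

@dataclass
class Building:
    name: str
    material: str
    building_type: str
    floors: int
    age_years: int
    occupancy: int

# Hypothetical risk contributions; a real system would derive these from data.
MATERIAL_RISK = {"wood": 1.0, "masonry": 0.5, "steel": 0.2}
TYPE_RISK = {"residential": 0.8, "warehouse": 0.6, "commercial": 0.5}

def risk_score(b: Building) -> float:
    """Sum five factors, each capped at 1.0 (assumed scaling, for illustration)."""
    return (
        MATERIAL_RISK.get(b.material, 0.5)
        + TYPE_RISK.get(b.building_type, 0.5)
        + min(b.floors / 50, 1.0)
        + min(b.age_years / 100, 1.0)
        + min(b.occupancy / 500, 1.0)
    )

# Made-up sample buildings.
buildings = [
    Building("C", "masonry", "warehouse", 2, 30, 10),
    Building("A", "wood", "residential", 5, 90, 100),
    Building("B", "steel", "commercial", 40, 10, 500),
]

# Highest-risk buildings go to the front of the inspection queue.
inspection_queue = sorted(buildings, key=risk_score, reverse=True)
for b in inspection_queue:
    print(b.name, round(risk_score(b), 2))
```

In this toy setup the old wooden residential building ranks first even though the steel tower is taller and more heavily occupied, which shows how the choice of weights drives the ordering.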
Pharma giant GlaxoSmithKline has announced that it will release anonymized data on clinical trial participants for use by independent researchers. The data, which is expected to cover 400 clinical trials by the end of the year, will not be available in a raw format, but will be available for querying in standard statistical software such as R and SAS. The effort responds to growing demand that external investigators be able to conduct independent reviews of drug safety.
Yehud, Israel-based TaKaDu provides analytics software for municipal water management, processing data from sensors and meters as well as water usage patterns to model the relative health of different parts of a water network. With water lost through leakage amounting to up to 50% of some water systems’ total supply, careful prioritizing of leak response can save utility companies hundreds of thousands of dollars per year. The system faces challenges in countries such as India, where water networks can be spread out over wide areas and data must be collected manually. Smart irrigation systems will need to rise to the challenge as global demand for water efficiency increases.
The Open Data Institute (ODI), a London-based company, helps public and private organizations make use of the growing pool of open data. Part incubator, part consulting firm, part research organization, ODI serves the dual purpose of creating social and economic value, its founders say. The firm hopes to create localized ODIs in various countries, and has even had exploratory talks with the White House about launching an ODI America.
DNA sequencing is rapidly becoming an everyday tool in biomedical research, thanks to its falling costs and its applications in the diagnosis and treatment of many conditions, but its computing burden is increasing. Gene sequencing technology is improving faster than computers are, though a spate of complex algorithms and cloud-based processing solutions offers some hope.
The Philadelphia Eagles football team’s new data-driven training strategy seeks to track players’ performance with a variety of sensors and other data collection techniques. The coaching staff will measure reaction times, recovery speed and other metrics to analyze player health. The coaches hope the insights generated will contribute to training regimens.
The Knight Foundation News Challenge will award over $2 million in funding for innovative uses for health data, the foundation announced. The challenge, which begins August 19, will be undertaken in partnership with several health-related foundations.
Patients can benefit from access to at-home measurements of health indicators such as blood pressure, glucose levels and weight. Boston’s Partners Healthcare recently launched a system that allows patients to upload this information from their devices into an electronic health record database for their doctors to consult in hopes of informing treatment decisions. One study found a significant decrease in blood pressure among participants who uploaded readings to a progress-monitoring web application.
The digital divide may be skewing datasets collected from the internet. People with lower levels of income and education tend to have less online presence, meaning that they generate less data online than wealthier, better-educated people do. As roughly two-thirds of people in the world are not online, this skew could be substantial and could produce significantly inaccurate results in large-scale data analysis, particularly in economic and social research.
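A small worked example can show how this kind of under-representation distorts an average, and how reweighting each group to its population share (post-stratification, a standard survey technique that the article itself does not mention) can correct it. Every number below is invented for illustration; the group names, shares and means are assumptions, not figures from any study.

```python
# Hypothetical shares of each group in the real population.
population_share = {"lower_income": 0.6, "higher_income": 0.4}
# Hypothetical shares of each group in a dataset scraped from the internet.
sample_share = {"lower_income": 0.3, "higher_income": 0.7}
# Hypothetical average of some measured quantity within each group.
group_mean = {"lower_income": 20.0, "higher_income": 50.0}

def mix(shares, means):
    """Average of the group means, weighted by the given shares."""
    return sum(shares[g] * means[g] for g in shares)

# Estimate taken straight from the online data: over-weights the higher-income group.
naive_estimate = mix(sample_share, group_mean)      # about 41
# What the average would be if every group were represented proportionally.
population_mean = mix(population_share, group_mean)  # about 32

# Post-stratification weight: how much each sampled group must be scaled
# up or down so that it matches its share of the population.
weights = {g: population_share[g] / sample_share[g] for g in sample_share}
reweighted_estimate = sum(
    sample_share[g] * weights[g] * group_mean[g] for g in sample_share
)

print(round(naive_estimate, 2), round(population_mean, 2), round(reweighted_estimate, 2))
```

The correction only works when each group's population share is known and the people observed online resemble the offline members of the same group, which is exactly what the digital divide makes hard to verify.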