10 Bits: the Data News Hotlist
This week’s list of data news highlights covers July 23-29, 2016, and includes articles about an algorithm that can detect online abuse and a new computer model that could help locate Malaysia Airlines Flight MH370.
The Australian government has announced it will develop analytics-driven programs to reduce welfare dependency for targeted groups, based on research finding that particular risk factors make it possible to predict small groups of people likely to become welfare dependent. The government will develop the programs with support from its new “Try, Test, and Learn” fund, which provides A$96.1 million (US$73 million) to fund innovative, evidence-based approaches to reducing welfare dependency.
Researchers at Yahoo have developed an algorithm capable of detecting abusive messages online with 90 percent accuracy, better than any similar system to date. The researchers created a training data set of abusive messages compiled from comments on Yahoo articles flagged as offensive, and applied traditional analytical techniques such as keyword detection, as well as more complicated semantic analysis techniques, to train their algorithm to detect complex abusive language.
Twelve hospitals in California are piloting a program to report cancer diagnoses in real time to the California Cancer Registry, a statewide database containing 4.5 million cancer patient records, using standardized electronic forms. Physicians and researchers use the registry to analyze treatment effectiveness and better serve patients, but the data can be up to two years old and nonstandardized, limiting the level of insight it can provide. With this data reported in real time, health-care providers will be able to make more informed decisions, and state and federal health agencies will be able to develop better disease surveillance statistics and improve cancer care.
The Bank of England is using algorithmic matching techniques similar to those used in online dating services to piece together detailed data about the economy. Online dating algorithms sort through large numbers of users to match potentially suitable partners based on compatible factors. Using this approach, the Bank of England is sorting through huge amounts of housing market data to match home loans with factors that indicate why people move, providing insight into the relationship between the housing market and the financial crisis of 2008.
The U.S. Office of Management and Budget (OMB) has updated Circular A-130, “Managing Information as a Strategic Resource,” a document that describes how federal agencies are required to manage data. This is the first update to Circular A-130 since 2000, and it includes a number of provisions to modernize government information management, mainly real-time monitoring, risk management, and accountability. Circular A-130 also adopts many of the open data policies implemented by the federal government since the last revision, particularly President Obama’s 2013 executive order making open and machine readable the default for government data.
After finding a crack in a critical component of 120 of its train cars, the Southeastern Pennsylvania Transportation Authority (SEPTA) is installing sensor hubs in one of the cars that can monitor the stress the car experiences during routine travel and help SEPTA make the necessary repairs. The sensors will monitor the component as the car makes round trips with thousands of pounds of sandbags on board to simulate the weight of a full car of passengers, and SEPTA will use this data to map out particular track conditions that cause the most stress to the component, such as frequent stops or high speeds, to determine the best way to repair it.
Researchers at the Euro-Mediterranean Centre on Climate Change, a non-profit research institution in Italy, have developed computer models with oceanographic data that could help reveal the location of Malaysia Airlines flight MH370, which disappeared in March 2014. The researchers used geographic data about where debris has been found and two years of data from the European Union’s Copernicus Marine Environment Monitoring Service, including ocean current data and wind patterns, to simulate where the debris could have originated to end up where it did. According to the model, the wreckage of the plane is likely between the coordinates of 28°S and 35°S in the Indian Ocean, but could also be several hundred miles further north than most search efforts have considered.
The U.S. National Weather Service (NWS) has selected the next technology that will serve as the core of its global forecast system (GFS), which analyzes large amounts of data on global weather conditions to create high-level weather models. The new core of GFS, developed by a National Oceanic and Atmospheric Administration research lab, will be able to extend the accuracy of its forecasts from 8 to 10 days, as well as improve hurricane forecasts and provide warnings of extreme weather events up to four weeks in advance. NWS will develop the next-generation GFS over the next three years.
The Oklahoma Department of Human Services (DHS) is piloting a program that uses predictive analytics software called Eckerd Rapid Safety Feedback (ERSF) to analyze data about a child’s well-being from disparate sources that could indicate the child is at high risk of being in danger. ERSF uses machine learning techniques to identify risk factors correlated with bad outcomes, such as having parents with a history of substance abuse or mental health problems, and flags cases with a high number of these factors for further investigation and potential intervention. DHS is testing the system in Oklahoma County and hopes to expand it to statewide use if the pilot is successful.
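The flagging step described above can be illustrated with a minimal sketch. It is a plain counting rule, not ERSF's actual machine learning model, whose details are not described here. The factor names, the `Case` type, the `flag_high_risk` function, and the threshold of two are all hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field

# Hypothetical risk-factor labels; the real system's factors are not public here.
RISK_FACTORS = {
    "parental_substance_abuse",
    "parental_mental_health_history",
    "prior_reports",  # assumed example, not taken from the article
}


@dataclass
class Case:
    case_id: str
    factors: set[str] = field(default_factory=set)


def flag_high_risk(cases, threshold=2):
    """Return IDs of cases with at least `threshold` known risk factors."""
    return [
        case.case_id
        for case in cases
        if len(case.factors & RISK_FACTORS) >= threshold
    ]


cases = [
    Case("A", {"parental_substance_abuse", "prior_reports"}),
    Case("B", {"parental_mental_health_history"}),
    Case("C"),
]
flagged = flag_high_risk(cases)
```

In this sketch only case "A" meets the assumed threshold. A production system would presumably weight factors by how strongly they correlate with outcomes rather than simply count them.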
Researchers at the University of Washington in Seattle have developed a technique for harvesting Bluetooth radio signals and rebroadcasting them as Wi-Fi signals, which could allow embedded connected devices, such as glucose-monitoring contact lenses for people with diabetes, to wirelessly and efficiently transmit data. To demonstrate their technique, the researchers developed a prototype smart contact lens as well as a prototype brainwave monitor that could be embedded under the skull, both of which were capable of communicating with a nearby smartphone. The prototypes can only transmit data at a rate of 160 kilobytes per second, and the corresponding smartphone must be within 24 inches, but the researchers believe both of these limitations could be significantly improved now that they have demonstrated the technique’s feasibility.
Image: Laurent ERRERA.