10 Bits: the Data News Hotlist
This week’s list of data news highlights covers February 1-17, 2017 and includes articles about a new database that can help recover stolen artwork and a machine learning system that can detect signs of autism in brain scans.
Los Angeles startup ZestFinance has developed a machine learning platform called ZAML that allows lenders to assess the credit risk of individuals who lack sufficient credit history to have traditional credit scores. ZestFinance trained ZAML on search data from Baidu to determine how a person’s Internet activity could reveal credit risk factors, such as whether a person regularly visits online gambling websites. The platform assesses 100,000 different data points to predict if a person is a good credit risk. In a test using ZAML, Baidu, which runs a lending business, was able to approve 150 percent more borrowers than it could using traditional methods, without an increase in losses. ZestFinance also uses a machine learning algorithm developed by the U.S. Consumer Financial Protection Bureau to analyze ZAML’s predictions and identify any signs of potential bias.
Penn Medicine in Philadelphia is piloting a method for analyzing health data from patients with lung cancer to predict the risk that a patient will have to return to the emergency room. The method involves analyzing historical data about a patient’s previous visits to the emergency room to identify a potential underlying cause for the repeated visits, then modeling the likelihood that this issue will cause a patient to return in the future. The pilot is focusing on lung cancer patients because they visit the emergency room more frequently than patients with other forms of cancer, and lung cancer patients often have other serious medical conditions that could result in complications, such as heart disease or diabetes.
A nongovernmental organization called the Bosnian Centre Against Trafficking in Works of Art has developed a database for stolen and lost artworks to help authorities crack down on illicit art trafficking. During the Bosnian War, from 1992 to 1995, thousands of works of art were stolen, but Interpol has only registered 12 of these cases. The database will consist of reports of stolen artworks, provided by galleries and private collectors that have been robbed, and will circulate information about the pieces to border officials to make it easier for them to identify when people are smuggling stolen art.
Neurologists at the University of North Carolina have developed a deep learning system capable of identifying indicators of autism spectrum disorder in the brain scans of infants as young as six months old, well before the age at which infants are typically diagnosed. The neurologists had their system analyze brain scans of 148 children, 106 of whom had a high risk of autism based on family history. It was able to detect signs of increased brain volume, which is an indicator that a child will likely develop autism, and predict which children would develop autism with 81 percent accuracy.
The University of Illinois at Urbana-Champaign’s National Center for Supercomputing Applications (NCSA) has partnered with the Illinois Department of Innovation and Technology (DoIT) to use NCSA’s supercomputing resources to develop methods for securing and applying state-collected data about citizens, such as health information and business licenses. After NCSA helps DoIT secure this data, it will focus on improving how DoIT curates and uses this information, and on helping DoIT protect critical infrastructure and respond to potential threats.
Researchers at Indiana University-Purdue University Indianapolis have developed a machine learning algorithm that can predict which patients with acute myelogenous leukemia (AML) will remain in remission after receiving treatment and which will relapse. The researchers trained the algorithm on historical medical data from AML patients, as well as blood data from healthy people. In a test, the algorithm predicted which patients would remain in remission with 100 percent accuracy and which patients would relapse with 90 percent accuracy.
A team of seismologists at Columbia University is attempting to develop a method to predict when earthquakes will occur by using machine learning to process large amounts of raw seismic data. Many researchers have tried to develop earthquake prediction methods in the past, but all have been unsuccessful. The seismologists believe that, thanks to increases in computing power and the sophistication of AI, a machine learning approach could detect patterns preceding earthquakes that previous methods could not. In simulations, the machine learning method was able to identify a reliable acoustic pattern that changes based on tectonic activity, which the seismologists could use to identify a narrow window of time during which an earthquake could strike.
IBM’s Weather Company has developed a new smartphone app capable of sharing emergency warnings about natural disasters and dangerous weather with other users even when cellular networks are down. The app relies on mesh networking to share weather data, using smartphones’ Bluetooth and WiFi antennas to automatically transfer warnings to nearby phones, so only one phone with a working data connection needs to download an alert for it to spread across a large network. IBM designed the app for developing countries where wireless infrastructure might be less resilient and plans to launch it initially in India.
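The mesh idea described above can be illustrated with a short sketch. This is not IBM's implementation: the proximity graph, the phone names, and the `flood_alert` function are all hypothetical, and the code simply assumes each phone forwards an alert once to every phone within radio range.

```python
from collections import deque


def flood_alert(neighbors, sources):
    """Return {phone: hop_count} for every phone the alert reaches.

    neighbors: dict mapping each phone to the phones within radio range.
    sources:   phones that downloaded the alert over a working connection.
    """
    hops = {s: 0 for s in sources}
    queue = deque(sources)
    while queue:
        phone = queue.popleft()
        for peer in neighbors.get(phone, ()):
            if peer not in hops:  # forward only to phones not yet reached
                hops[peer] = hops[phone] + 1
                queue.append(peer)
    return hops


# Hypothetical proximity graph: only phone "A" has a data connection.
neighbors = {
    "A": ["B"],
    "B": ["A", "C", "D"],
    "C": ["B"],
    "D": ["B", "E"],
    "E": ["D"],
    "F": [],  # out of range of every other phone
}
reached = flood_alert(neighbors, ["A"])
print(reached)
```

In this toy graph a single download on phone A reaches B, C, D, and E through relays, while the isolated phone F never receives the alert, which is the limit of any purely proximity-based scheme.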
Microsoft has released as open source the Aerial Informatics And Robotics Platform (AirSim), software that generates realistic environments for training autonomous drones and vehicles, allowing researchers to freely use it to train their own systems. AirSim is capable of generating traditionally difficult-to-simulate graphics, including reflections, shadows, and wet surfaces, to help train computer vision algorithms to interpret their surroundings in realistic environments in real time.
Researchers from the University of Pennsylvania and the nonprofit Center for Community Progress have developed an algorithmic system that can accurately predict the spread of gentrification in a city. The researchers used data from the U.S. Census on 29 cities from 1990 through 2000 and had the system analyze factors about census tracts, including average income, housing supply, and median housing costs, and compare these data points to those of adjacent census tracts to predict how housing costs would rise or fall by 2010. The system successfully predicted where and how quickly gentrification occurred, and the researchers were able to map how gentrification would spread through 2020.