10 Bits: the Data News Hotlist
This week’s list of data news highlights covers March 11-17, 2017, and includes articles about a machine learning tool that helps estimate a roof’s solar potential and a new algorithm that can encode JPEGs without reducing image quality.
The U.S. Defense Advanced Research Projects Agency (DARPA) has launched 13 research projects to study how to develop machine-learning systems that can explain the decisions they make. While machine learning can be very effective for identifying patterns in data, understanding how a system makes these decisions can be difficult, if not impossible, due to the complexity of these systems. The projects all take different approaches to making AI more explainable, including developing interfaces that can use natural language and data visualizations to explain different decisions.
Researchers at DeepMind and Imperial College London have developed an algorithm called elastic weight consolidation (EWC) that enables neural networks to retain information they learn and apply this knowledge in new contexts. DeepMind had developed machine learning systems that could play simple video games, but each system could only learn how to play one game. The algorithm allows these systems to learn to play one game and then apply what they learned in that training to play several other games. EWC works by analyzing what a machine learning system learned while training to play a game, identifying the most useful parts, and then applying these parts as the system learns to play a new game.
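The idea of protecting the "most useful parts" can be sketched as a penalty added to the loss for the new game. This is a minimal illustration, not DeepMind's implementation: it assumes each weight has an importance score (in the published EWC work this is estimated from the diagonal of the Fisher information, a detail this summary does not cover), and all function and variable names here are hypothetical.

```python
def ewc_penalty(params, old_params, importance, strength):
    """Quadratic penalty that anchors important weights near their old values.

    params      -- current weights while learning the new game
    old_params  -- weights saved after learning the first game
    importance  -- assumed per-weight importance scores (hypothetical input)
    strength    -- how strongly old knowledge is protected
    """
    return 0.5 * strength * sum(
        f * (p - p_old) ** 2
        for p, p_old, f in zip(params, old_params, importance)
    )


def ewc_loss(new_task_loss, params, old_params, importance, strength=1.0):
    """Total loss: the new game's loss plus the penalty for forgetting."""
    return new_task_loss + ewc_penalty(params, old_params, importance, strength)


# Toy example: the first weight mattered a lot for the old game, the second barely.
old_params = [1.0, -2.0]
importance = [10.0, 0.1]

moved_important = ewc_penalty([1.5, -2.0], old_params, importance, 1.0)
moved_unimportant = ewc_penalty([1.0, -1.5], old_params, importance, 1.0)
```

Moving the important weight by 0.5 costs far more than moving the unimportant one by the same amount, so training on the new game is steered toward changing the weights the old game did not depend on.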
A startup called Recall Masters has turned to machine learning to analyze car recall data and identify recalled cars’ current owners to help dealerships notify them. Though transportation authorities can issue recalls, the federal government does not track how many vehicles get recalled or how dealerships that sold recalled vehicles track down owners to inform them of the recall. Recall Masters compiles data from 50 different sources to determine if a vehicle has been recalled and analyzes transaction records to identify current owners, even when vehicles have been resold.
Google has updated Project Sunroof, which uses satellite imagery and weather data to calculate the amount of sunlight a house’s roof receives over a year. The tool now uses machine learning to construct 3D models of approximately 60 million buildings throughout the United States and predict their potential for solar power. Users can enter an address for a building and, if Project Sunroof has modeled it, view how viable solar panels would be and calculate how much panels would save on their energy bills.
Researchers at Mount Sinai Hospital have completed a study of asthma patients by collecting data through Apple’s ResearchKit platform, which allows researchers to gather user-generated health data from iPhones. The study required participants to periodically answer surveys about their asthma symptoms. Researchers linked this information with users’ geolocation data, captured by ResearchKit, and other factors, such as air quality data, to understand the relationship between daily asthma symptoms and changes in the environment. Over the six-month study, 2,317 participants contributed enough reliable data for the study to succeed and to demonstrate the viability of smartphone-based data collection for medical research.
Google has developed a new algorithm named Guetzli for encoding images in the JPEG file format that reduces their file size by 35 percent without reducing image quality. JPEG encoding involves reducing the unorganized data of an original image into easily compressible, ordered data, but this usually reduces the level of detail in the image. Guetzli uses a model for organizing this data based on the way humans process images, allowing it to sacrifice more details without significantly reducing the perceived quality of an image.
Researchers at New York University are developing an AI system that allows autonomous vehicles to pinpoint their locations on maps based on sensor data about their immediate surroundings. While autonomous vehicles can use GPS to learn their location, GPS’s lack of precision makes it challenging for them to automatically share their location with other autonomous vehicles nearby. The system will link data from onboard sensors with mapping data from a cloud-based mapping service called HERE HD Live Map in real time to determine a vehicle’s location on a map with an accuracy of 10 centimeters. This approach could help autonomous vehicles better avoid each other on roads, as well as accurately map road conditions in real time and share this information with other vehicles.
Binghamton University researchers have developed the Networked Pattern Recognition Framework (NEPAR), which uses data mining and network analytics to analyze historical data about suicide bombings and predict future attacks. The researchers had NEPAR analyze data about 150,000 terrorist attacks in Iraq from 1970 through 2015 and attempt to identify patterns in their characteristics, such as their timing, location, targets, and security measures around their targets. When exposed to new historical data, NEPAR was able to predict the characteristics of a future attack with over 90 percent accuracy.
A team at Florida International University has developed a system called Fairplay that analyzes apps on the Google Play store to identify signs of malicious behavior. Unlike other approaches to malware detection that analyze an app’s code, Fairplay identifies user accounts that post fraudulent reviews of malicious apps to boost their rankings, as these accounts tend to post reviews for multiple hostile apps. The team used machine learning to train Fairplay on fraudulent reviews of 200 apps known to contain malware. After analyzing 90,000 newly published apps, the team discovered hundreds of malicious apps that were able to escape Google’s screening.
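The observation that the same accounts tend to review multiple hostile apps can be illustrated with a simple counting heuristic. This is only a rough sketch of that one signal, not Fairplay's actual machine learning model: the threshold, the data layout, and every name below are assumptions made for illustration.

```python
from collections import defaultdict


def flag_suspicious_accounts(reviews, known_malicious, min_hits=2):
    """Return accounts that reviewed at least `min_hits` known-malicious apps.

    reviews         -- iterable of (account, app) pairs (hypothetical layout)
    known_malicious -- set of app names already known to contain malware
    min_hits        -- assumed threshold, chosen arbitrarily for this sketch
    """
    hits = defaultdict(set)
    for account, app in reviews:
        if app in known_malicious:
            hits[account].add(app)
    return {account for account, apps in hits.items() if len(apps) >= min_hits}


def suspicious_share(reviews, app, flagged):
    """Fraction of an app's reviewers that are flagged accounts."""
    reviewers = {account for account, reviewed in reviews if reviewed == app}
    if not reviewers:
        return 0.0
    return len(reviewers & flagged) / len(reviewers)


# Made-up toy data, not drawn from the study.
reviews = [
    ("acct1", "bad_app_a"), ("acct1", "bad_app_b"), ("acct1", "new_app"),
    ("acct2", "bad_app_a"), ("acct2", "new_app"),
    ("acct3", "other_app"),
]
flagged = flag_suspicious_accounts(reviews, {"bad_app_a", "bad_app_b"})
share = suspicious_share(reviews, "new_app", flagged)
```

In this toy data, one of the two accounts reviewing `new_app` has already reviewed two known-malicious apps, so half of its reviewers are flagged; a newly published app with a high share would be a candidate for closer inspection.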
Scientists from Rensselaer Polytechnic Institute have developed a blood test that could aid autism diagnoses by using an algorithm to identify chemical markers of the disorder. Children on the autism spectrum have subtle differences in their bodies’ metabolic pathways. The algorithm analyzes 24 metabolites in a blood sample to determine if they are altered in a manner indicating they came from a child with autism. In a test of 83 children already diagnosed with autism, the algorithm was able to correctly identify the condition with 98 percent accuracy.
Image: Nellis Air Force Base.