10 Bits: the Data News Hotlist
This week’s list of data news highlights covers August 13-19, 2016 and includes articles about researchers using big data to detect outbreaks of foodborne illness and the Nicaraguan government connecting an active volcano to the Internet of Things.
Uber has announced that by the end of the month it will launch a fleet of self-driving cars in Pittsburgh that users will be able to request through the Uber app. The self-driving cars, specially modified Volvos outfitted with an array of cameras and sensors, are fully autonomous but will have a human sitting in the driver’s seat to comply with the law. The Uber app will pair Pittsburgh users with a self-driving car at random when they request a ride, and self-driving trips will be free for the initial launch.
AI research nonprofit OpenAI is training its machine learning system to understand language by having it analyze comment threads from the popular social media site Reddit. The researchers are using deep learning techniques to build probabilistic models of language, which could help AI programs understand language well enough to interpret and participate in conversations. Developing AI that can reliably understand language is particularly challenging because language is so complicated and nuanced, but by using deep learning techniques and high-performance computing to analyze huge numbers of examples, the researchers expect to eventually teach their system to hold conversations.
Researchers at IBM have developed a method for analyzing retail scanner data from grocery stores that could be used after an outbreak of foodborne illness to rapidly identify potentially dangerous food items. The researchers tested their system on historical examples of outbreaks, such as the 2011 outbreak of E. coli in Europe, which killed 50 people and caused €150 million in damages during the 60 days it took to identify the contaminated food, and compared grocery store scanner data with the locations of reported illnesses. They found that they could narrow down the responsible food item to a small number of potential candidates after just a few hours. Additionally, the researchers’ technique could help authorities proactively identify likely sources of foodborne illness before an outbreak happens, so that they are better prepared.
The Australian Transaction Reports and Analysis Centre (AUSTRAC), a government agency that investigates suspicious financial activity, has partnered with the Royal Melbourne Institute of Technology to develop a machine learning system that can analyze millions of financial transactions and flag suspected criminal activity for further investigation. AUSTRAC is responsible for sorting through up to 100 million transactions per year to identify money laundering, fraud, and other crimes, a task that can be incredibly time- and resource-intensive. In a test, AUSTRAC was able to use the machine learning system to identify suspicious transaction patterns and reduce the number of transactions that warranted additional human investigation to just 750,000 cases. AUSTRAC plans to use the system for investigations starting next year.
The Nicaraguan government has partnered with General Electric (GE) to install a network of sensors on the Masaya volcano to collect data on the active volcano, which is just 12 miles from Managua. A team is installing over 80 connected sensors on the side of the volcano to monitor gas emissions, atmospheric pressure, temperature, and how magma is moving inside the volcano. These sensors will be able to power an early warning system for potentially dangerous volcanic activity, as well as provide researchers with a wealth of valuable scientific data.
Audi has begun equipping some of its models sold in the United States with technology that can communicate with traffic signals, known as vehicle-to-infrastructure (V2I) communication, marking the first commercial deployment of the technology in the country. When a V2I-equipped car approaches a traffic light, screens on the dashboard will warn drivers when they will be unable to make an upcoming green light and should begin braking, as well as display a countdown timer for red lights indicating when they will turn green. Audi plans to help five to seven undisclosed cities enable the infrastructure end of the V2I technology this year.
Researchers from Stanford University have developed a machine learning system that can accurately predict where people live in poverty by analyzing survey data and satellite imagery. The researchers first had their system analyze household survey data from five African nations and then distinguish between well-lit and poorly lit areas in nighttime satellite photos. Then, by analyzing daytime photos, the system was able to identify characteristics correlated with poverty, such as the concentration of roads and urban areas, and create accurate maps of impoverished areas. The researchers plan to use their system to develop a publicly available global poverty map to help inform development efforts.
The U.S. National Oceanic and Atmospheric Administration (NOAA) has developed a new forecasting model called the National Water Model that can help predict how rainfall and ocean forecasts will influence flooding in more than 2.7 million locations throughout the United States, 700 times more locations than previous models could forecast. NOAA developed the National Water Model with the help of its new Cray XC40 supercomputer, which can simulate water flow across the entire country’s streams and rivers every hour, allowing NOAA to generate accurate forecasts up to 30 days ahead of time.
NASA has announced that all NASA-funded research must be freely available to the public. Under the new rule, scientists publishing NASA-funded research must upload their papers to PubSpace, a portal managed by the National Institutes of Health, within a year of publication, and must commit in their funding proposals to providing long-term access to their underlying scientific data in a digital format.
Neuroscientists at Cold Spring Harbor Laboratory, a private research institute, have developed a method for mapping the individual connections between neurons in the brain, known as the connectome. Mapping the connectome usually entails relying on labor-intensive methods that are not well-suited for mapping many connections simultaneously. The new method, called MAP-seq, instead relies on analyzing strings of genetic code injected into the brain, each of which can serve as a unique identifier for a neuron. Using a DNA sequencer, researchers can then plot the connections between the identified neurons and analyze how these connections relate to the brain’s function. MAP-seq could be useful for studying disorders such as autism and schizophrenia, which researchers believe arise from dysfunctional connectivity in the brain.
Image: Ryan Ballantyne.