10 Bits: The Data News Hot List
This week’s list of data news highlights covers June 28-July 4 and includes articles about a startup hoping to transform public transit in Boston and beyond, and an Associated Press initiative to use algorithms to write certain financial articles.
Cambridge, Massachusetts-based startup Bridj is using big data to create on-demand bus routes. Drawing data from social media, municipal records, the Census, and other sources, Bridj uses a proprietary algorithm to infer where there might be high demand for transportation service. In principle, Bridj could also adjust these routes dynamically to address changes in riders’ locations. The company is testing routes in Boston and hopes to expand as demand grows, potentially even out into the city’s populous suburbs, where it could relieve traffic on congested highways.
Administrators of the U.S. men’s national soccer team equipped each player with a GPS tracking device to monitor movements and curb injuries during this summer’s World Cup run. The devices, each about the size of a matchbox, are attached to players’ uniforms. They provide data that can be combined with other information, such as heart rates and hydration levels, to detect when a player might be at risk of suffering an injury.
The Associated Press (AP) revealed this week that it would soon begin creating some of its corporate earnings stories algorithmically, transforming the data into sentences a human can read. The news service publishes around 1,200 of the formulaic stories annually, but hopes to expand to over 17,000 annually by the end of the year. For now, editors will read all the pieces before publication, but eventually the AP hopes to automate the entire process.
Researchers from the University of California, Irvine, have published a paper showing that deep learning techniques can be effective at detecting rare subatomic particles such as Higgs bosons. Deep learning, a branch of machine learning loosely based on the action of neurons in the human brain, is commonly used to model complex, nonlinear data. The researchers found the deep learning models to be up to eight percent more accurate than traditional detection methods. The researchers hope the methods could be applied to experiments at the Large Hadron Collider as early as 2015.
For the first time, artificial intelligence systems have performed on par with monkeys in a benchmark image recognition test. In the test, administered at the Massachusetts Institute of Technology, monkeys wired with electrodes watched a series of images while researchers measured how neurons fired in a particular brain region, indicating recognition. Afterwards, a computer was tasked with identifying the same set of images. Computers, which have traditionally performed far worse than monkeys on this test, made great improvements after the introduction of deep learning algorithms. Researchers hope the work is a harbinger of progress in other machine learning tasks.
Wearable biometric tracker company Lightwave wants to use data to improve audience engagement at concerts and other performances. The company’s eponymous device tracks body temperatures, motion, and decibel levels. The company piloted the device during an electronic dance music concert earlier this year, where it enabled attendees to participate in a dance contest for prizes. The company released a visualization of that show this week, showing how the audience’s physical state changed in response to various events during the concert.
San Diego Gas & Electric is investing in data analytics to manage its grid and ensure it responds effectively to weather threats such as fires. For example, the utility is using weather data analytics to help plan for outages and manage its workforce. The company has also developed a mobile application that transmits real-time weather data to firefighters and emergency responders.
IBM has created a system to predict where contaminated foods that cause illness outbreaks originate. Investigating such outbreaks is time-sensitive, and public health officials, distributors, and food producers alike have a stake in knowing where a contaminant came from as quickly as possible. IBM’s system integrates retail inventory data with public health data containing details about victims, models the likely origin, and adaptively refines its estimates as more victim data comes in. The company is working with retailers and public health groups to scale up the project in the United States.
SurveyLA is a database created to house information on Los Angeles’s historic architecture and sites. The technology underlying the database was originally developed to catalogue heritage sites in Iraq and Jordan for the World Monuments Fund. The featured sites include venerable downtown office buildings and Hollywood movie theaters alongside civil rights demonstration locations and World War II air raid sirens. The database also includes homes of famous Angelenos. Survey teams inspecting the city street-by-street expect to be finished by late 2015.
The United Nations (UN) Global Pulse initiative and social data provider DataSift announced a partnership this week to use data for social good. DataSift will provide the UN group access to its data and manipulation tools for use in humanitarian projects such as disease outbreak prediction and food security management. For example, the Global Pulse team will attempt to monitor childhood vaccination rates by comparing social media data with official statistics from aid organizations that track the number of people vaccinated.