10 Bits: the Data News Hotlist
This week’s list of data news highlights covers July 16-22, 2016 and includes articles about using artificial intelligence to reduce energy consumption and Yelp’s new machine learning system that can identify the contents of users’ photos.
Researchers at Chinese search company Baidu’s Big Data Lab have developed a method for studying economic activity by analyzing millions of data points on users’ locations. The researchers identified thousands of areas of economic activity in the country, including offices, shopping centers, and industrial areas, and examined location data provided by Baidu users to measure changes in the number of people in these areas from the end of 2014 through mid-2016. The analysis suggests that location data could be a valuable predictor of economic success and failure, as the data showed distinct declines in attendance at manufacturing plants months before they closed. The data could also serve as an employment index for China by revealing how many people visit different kinds of economic areas over time.
Researchers at Delft University of Technology in the Netherlands have developed a technique for storing and rewriting large amounts of data at the atomic scale, at a density of 62.5 terabytes per square inch. The technique involves manipulating a grid of chlorine atoms on a copper surface into binary arrangements to express the data. While researchers had previously demonstrated that atomic-scale data storage is possible, this new technique is substantially faster and can store and rewrite more data than earlier methods. Though this new approach requires demanding laboratory conditions, the researchers believe their technique will serve as a crucial proof of concept for making atomic-scale storage viable for everyday use.
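To give a sense of scale for the density figure above, here is a rough unit conversion in Python. It is only a sketch: it assumes decimal terabytes (10^12 bytes), and the function name `bits_per_square_nm` is made up for this illustration rather than taken from the researchers’ work.

```python
# Convert a storage density in terabytes per square inch
# to bits per square nanometer.
TERABYTE_BYTES = 10**12   # assumption: decimal terabytes
BITS_PER_BYTE = 8
NM_PER_INCH = 2.54e7      # 1 inch = 2.54 cm = 2.54e7 nm


def bits_per_square_nm(terabytes_per_sq_inch):
    """Return the density in bits per square nanometer."""
    bits_per_sq_inch = terabytes_per_sq_inch * TERABYTE_BYTES * BITS_PER_BYTE
    return bits_per_sq_inch / NM_PER_INCH**2


density = bits_per_square_nm(62.5)
print(f"{density:.3f} bits per square nanometer")
print(f"{1 / density:.2f} square nanometers per bit")
```

Under these assumptions, each bit occupies a little more than one square nanometer.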
Google has implemented an artificial intelligence system developed by fellow Alphabet subsidiary DeepMind to manage portions of its data centers, resulting in a 15 percent improvement in energy efficiency. The system controls 120 variables in Google’s data centers that relate to energy use, including fans, cooling systems, and windows, and learns to automatically adjust these variables to maximize efficiency. Google reported it used 4,402,836 megawatt-hours of electricity in 2014, a large portion of which supported its data centers, so a 15 percent efficiency gain could translate to substantial cost and energy savings.
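A back-of-the-envelope calculation shows what the figures above could mean in absolute terms. This is a sketch, not Google’s own accounting: the share of electricity that goes to data centers is not given here, so `data_center_share` is a hypothetical parameter, and treating a 15 percent efficiency improvement as a 15 percent cut in consumption is a simplification.

```python
# Rough estimate of annual electricity savings from the reported figures.
TOTAL_MWH_2014 = 4_402_836   # Google's reported 2014 electricity use
EFFICIENCY_GAIN = 0.15       # reported improvement in energy efficiency


def estimated_savings_mwh(total_mwh, data_center_share, gain=EFFICIENCY_GAIN):
    """Estimate MWh saved, given the fraction of use attributed to data centers.

    data_center_share is an assumed value between 0 and 1; the source only
    says it is "a large portion".
    """
    if not 0.0 <= data_center_share <= 1.0:
        raise ValueError("data_center_share must be between 0 and 1")
    return total_mwh * data_center_share * gain


# Upper bound: pretend all of the electricity went to data centers.
upper_bound = estimated_savings_mwh(TOTAL_MWH_2014, 1.0)
print(f"Upper bound: {upper_bound:,.0f} MWh per year")
```

Any realistic share below 100 percent scales the estimate down proportionally.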
Researchers at Tufts University have developed conductive thread that can be used to create smart sutures that collect data about how a wound is healing. The researchers covered cotton threads with conductive material and then dipped them into chemical and physical sensing compounds that can monitor temperature, strain, pressure, glucose levels, and pH. These threads connect to circuitry that wirelessly transmits the data to a computer, which can analyze it to determine how well a wound is healing and whether an infection has developed, so doctors can take corrective action.
Neuroscientists at the Washington University School of Medicine working on the Human Connectome Project, an initiative to improve how brain connectivity data is recorded and shared, have developed a machine-learning system that can create more accurate, granular maps of human brains. The scientists trained a machine-learning algorithm to identify physical characteristics that distinguish different brain regions and their functions, such as a region’s connectivity to other regions and its thickness, and to map out these regions, which can vary slightly from person to person, on brain scans it has not seen before. Making it easier to map out a brain and study its regions could allow doctors to develop better diagnostics for brain disorders and better understand brain function.
Yelp has implemented a deep learning system that can identify the contents of users’ photos of restaurants to improve how it provides search results and recommendations. Yelp normally filters search results based on description text and metadata, such as whether the restaurant is tagged as “Mexican food” or “good for kids,” but this approach hinges on whether adequate descriptions and metadata exist for a restaurant. Yelp’s new system analyzes user-submitted photos from different establishments to determine these attributes with 83 percent accuracy. For example, if it detects a user-submitted photo with a burrito in it, it can predict that a restaurant serves Mexican food and generate metadata for that restaurant.
Computer graphics company Nvidia has developed a technique to improve the realism of virtual reality experiences while reducing the computational demand of virtual reality software by using eye-tracking technology to mimic how humans focus in the real world. When humans look at something, the subject of their focus is sharp while their peripheral vision is blurrier—a phenomenon called foveal vision. Rather than render an entire virtual reality scene at maximum resolution, Nvidia’s technique uses eye-tracking software to replicate foveal vision and only fully render where a user is looking, leaving the rest of the scene at a lower resolution and significantly reducing the amount of computation needed to create a realistic scene. Nvidia’s approach was made possible thanks to advances in eye-tracking technology that can record eye movement with very low-latency sensors.
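The savings from rendering only the focal area at full resolution can be illustrated with a toy calculation. The sketch below is not Nvidia’s implementation: the function `foveated_cost`, the circular focal region, and the `periphery_scale` parameter are all assumptions made for illustration, and rendering cost is simplified to a count of shaded pixels.

```python
# Toy model of foveated rendering cost, measured in shaded pixels.
def foveated_cost(width, height, gaze_x, gaze_y, fovea_radius,
                  periphery_scale=0.25):
    """Return the approximate number of pixels shaded for one frame.

    Pixels within fovea_radius of the gaze point are rendered at full
    resolution. The rest is rendered at periphery_scale times the linear
    resolution, so its cost scales with periphery_scale squared.
    """
    full_res_pixels = 0
    for y in range(height):
        for x in range(width):
            if (x - gaze_x) ** 2 + (y - gaze_y) ** 2 <= fovea_radius ** 2:
                full_res_pixels += 1
    periphery_pixels = width * height - full_res_pixels
    return full_res_pixels + periphery_pixels * periphery_scale ** 2


# Compare a foveated frame with a frame rendered entirely at full resolution.
width, height = 640, 360
full_cost = width * height
cost = foveated_cost(width, height, gaze_x=320, gaze_y=180, fovea_radius=60)
print(f"Foveated frame shades {cost / full_cost:.1%} of the full-resolution pixels")
```

In this simplified model, the smaller the focal region and the coarser the periphery, the fewer pixels need to be shaded per frame.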
Researchers at Internet.org, a Facebook-led initiative to expand Internet access, have developed light-detection technology that can be used to wirelessly transmit and receive data through free-space optical communication, which sends information through beams of light over the air rather than through fiber-optic cable. Previous attempts to develop free-space optical communication have been limited by the need for light detectors that are large enough and maneuverable enough to capture incoming light without sacrificing the volume of data they can handle. The researchers developed a method that can transmit data at a rate of two gigabits per second, substantially faster than radio transmission, without needing to maneuver to intercept incoming light. This technique could be used to provide Internet coverage to large areas through relays of these detectors, potentially mounted on drones.
A team of researchers from the University of Georgia, Massey University, and the University of California has developed a machine learning model that can predict which species of bats are likely to transmit filoviruses, the family of viruses that includes Ebola, to aid efforts to eradicate the diseases they cause. Bats have long been suspected of transmitting Ebola viruses, but identifying which bats carry them has been difficult because bats carrying filoviruses generally don’t exhibit symptoms. The researchers used a machine learning algorithm to analyze 57 variables on bat species, such as diet and migratory pattern, to identify traits that make a species a likely carrier. Based on this analysis, the system was able to sort through the 1,116 known bat species in the world and map out where the territories of likely carriers overlap, indicating hotspots with the highest concentration of carrier species. The researchers’ model indicates several areas in Southeast Asia where 26 such species overlap.
U.S. Representative Justin Amash (R-MI) has introduced a bill, H.R. 5760, to require federal legislation and congressional records to use standardized, machine-readable formats. Currently, most congressional activity is recorded in documents that are not machine-readable, which makes it difficult for the public and congressional staff alike to search and analyze this data. H.R. 5760 would also direct the Clerk of the House and the Secretary of the Senate to lead a task force to establish and implement a standardized data format for legislative activity.