10 Bits: the Data News Hotlist
This week’s list of data news highlights covers July 15 – 21, 2017, and includes articles about China’s new plan to lead in AI and a bill that aims to close the LGBT data gap.
The Chinese State Council has published a plan to make the country the world leader in artificial intelligence by 2030. The plan details strategies to invest billions of dollars in ambitious moonshot projects, academic research, and startups so that the country’s private sector and research institutions are on par with the United States in AI by 2020. The plan also calls for China to push ahead of other countries by 2030 by focusing on breakthroughs in AI key to economic growth.
Researchers at the Massachusetts Institute of Technology have developed an AI system called Pic2Recipe capable of analyzing images of food items and predicting the ingredients and recipes used to make them. The researchers first compiled a database of over 1 million annotated recipes taken from cooking websites and trained a neural network to make connections between pictures associated with each recipe and the ingredients used.
Health technology company Proteus Digital Health and pharmaceutical company Otsuka have developed a system that uses a body-worn sensor and a smartphone app to automatically track whether patients are adhering to their medication regimen for the antipsychotic drug Abilify. The sensor, which adheres to a patient’s body like a bandage, can detect when patients swallow Abilify pills, which contain small microchips. The sensor shares this data with a smartphone app that can automatically notify doctors when patients fail to take their medicine, a particularly common occurrence in patients with schizophrenia, which Abilify treats.
Ninety-six members of Congress have introduced the LGBT Data Inclusion Act in both the House and Senate to direct federal agencies that collect demographic data through surveys to include questions about sexual orientation and gender identity. Policymakers and administrators use federal survey data to inform decisions about public health, housing, and other important government services, yet most federal surveys do not include questions about sexual orientation or gender identity, limiting the government’s ability to understand and address the needs of the LGBT community.
Researchers at the University of Illinois have created a 1.2-microsecond simulation of the 64 million atoms that make up an HIV capsid, a protein shell that transports the virus into human cells, with the help of the Department of Energy’s Titan supercomputer. The researchers spent two years modelling the behavior of those atoms, and the assembled simulation revealed interesting characteristics, such as how different parts of the capsid oscillate at different frequencies. By better understanding how HIV capsids function at the atomic level, researchers can identify vulnerabilities that could potentially lead to the development of new treatments.
Google’s machine learning research branch Google Brain has launched a challenge on Kaggle in which researchers develop machine learning systems that try to confuse one another, as well as systems that defend against those attacks. This approach, known as adversarial machine learning, can help researchers develop machine learning cybersecurity systems that are harder to bypass. For example, it could be used to anticipate how spammers will try to evade email spam filters by learning and exploiting the parameters of a filtering algorithm.
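The spam example above can be sketched as a toy, assuming a hypothetical keyword-weight filter whose parameters the spammer has already learned. The weights, threshold, character swaps, and function names below are all invented for illustration and do not come from the Kaggle challenge or any real filter. The attacker disguises the heaviest-weighted words until the message slips through, and a hardened filter undoes the known disguises before scoring.

```python
# Toy illustration only: a hypothetical keyword-weight spam filter,
# not any real product's algorithm.
WEIGHTS = {"free": 2.0, "winner": 2.5, "prize": 1.5, "click": 1.0}  # assumed learned parameters
THRESHOLD = 3.0  # assumed decision threshold

# Hypothetical character swaps a spammer might use to disguise words.
SWAPS = str.maketrans({"e": "3", "i": "1", "o": "0"})
UNSWAPS = str.maketrans({"3": "e", "1": "i", "0": "o"})


def score(words):
    """Sum the weights of known spam words in the message."""
    return sum(WEIGHTS.get(w, 0.0) for w in words)


def is_spam(words):
    """Flag the message when its score reaches the threshold."""
    return score(words) >= THRESHOLD


def evade(words):
    """Attacker: disguise the heaviest-weighted words until the filter passes the message."""
    words = list(words)
    # Indices of flagged words, heaviest first.
    targets = sorted(
        (i for i, w in enumerate(words) if w in WEIGHTS),
        key=lambda i: WEIGHTS[words[i]],
        reverse=True,
    )
    for i in targets:
        if not is_spam(words):
            break
        words[i] = words[i].translate(SWAPS)
    return words


def is_spam_hardened(words):
    """Defender: undo the known character swaps before scoring."""
    return is_spam([w.translate(UNSWAPS) for w in words])


message = "click now free prize for every winner".split()
disguised = evade(message)
print(is_spam(message), is_spam(disguised), is_spam_hardened(disguised))
```

In this sketch the original filter catches the plain message but misses the disguised one, while the hardened filter catches both. Anticipating the attack is what lets the defender close the gap.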
Boston-based startup Experfy is developing a platform to make it easier for employers to find workers with data science expertise. Experfy uses an algorithm to score the data scientists on its platform and match companies with ideal candidates. Companies can also post projects on Experfy and hire interested data scientists on an on-demand basis.
UK Biobank, a medical research charity, has partnered with the European Genome-phenome Archive (EGA), a public repository of genetic information run by research labs in Europe, to distribute its genomic data on 500,000 individuals. UK Biobank collected the data, along with participants’ electronic health records, during a study that ran from 2006 to 2010, and it has been distributing phenotype data from the study. Now, EGA will share the genetic portion of the data, enabling approved researchers to download and match data from both sources to aid their experiments.
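Matching data from the two sources amounts to joining records on a shared participant identifier. The sketch below assumes made-up field names and values (`participant_id`, `age_at_recruitment`, `variant_count`); the actual UK Biobank and EGA file formats are not described here and will differ.

```python
# Made-up placeholder records; real UK Biobank and EGA files use their own formats.
phenotypes = [
    {"participant_id": "P001", "age_at_recruitment": 54},
    {"participant_id": "P002", "age_at_recruitment": 61},
    {"participant_id": "P003", "age_at_recruitment": 47},
]
genotypes = [
    {"participant_id": "P002", "variant_count": 12},
    {"participant_id": "P003", "variant_count": 7},
    {"participant_id": "P004", "variant_count": 9},
]


def match_records(left, right, key="participant_id"):
    """Inner join: keep only participants present in both sources."""
    right_by_id = {row[key]: row for row in right}
    merged = []
    for row in left:
        other = right_by_id.get(row[key])
        if other is not None:
            # Combine the fields from both records into one dictionary.
            merged.append({**row, **other})
    return merged


matched = match_records(phenotypes, genotypes)
print(matched)
```

Only participants who appear in both sources survive the join, so a researcher would want to check how many records were dropped before running an analysis.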
Computer scientists at the Montreal Institute for Learning Algorithms have developed a machine learning system that can generate realistic landscape models after training on satellite imagery of Earth. In virtual environments such as video games, developers either use simple algorithms to generate a landscape, which is inexpensive but produces low-quality results, or manually design every environment to produce higher-quality results but at greater cost. The scientists had one deep learning algorithm generate random environments and trained a second on high-resolution satellite imagery from NASA. The second algorithm evaluated the randomly generated environments and provided feedback to the first, which eventually learned to produce realistic artificial environments.
Pittsburgh has deployed a system called Surtrac, developed by transportation and robotics researchers at Carnegie Mellon University, that uses AI to manage traffic light patterns. In areas where the system is in use, it reduces travel time by 25 percent, braking by 30 percent, and idling by 40 percent. Surtrac uses cameras and radar at intersections to detect traffic, develops predictive models to determine how signal changes might affect the flow of traffic, and adjusts lights accordingly without humans needing to reprogram them. Surtrac is in use at 50 intersections, and Pittsburgh is in the process of deploying it at 150 more.
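The general idea of predicting how a signal change affects traffic and then adjusting the lights can be sketched for a single intersection. This is not Surtrac's actual scheduling algorithm. The cycle length, discharge rate, candidate green times, and detector counts are all assumptions made up for illustration.

```python
CYCLE_SECONDS = 60          # assumed fixed cycle length
DISCHARGE_PER_SECOND = 0.5  # assumed vehicles cleared per second of green


def predicted_waiting(queues, green_ns):
    """Predict vehicles still queued after one cycle for a given north-south green time."""
    green_ew = CYCLE_SECONDS - green_ns
    left_ns = max(0.0, queues["ns"] - DISCHARGE_PER_SECOND * green_ns)
    left_ew = max(0.0, queues["ew"] - DISCHARGE_PER_SECOND * green_ew)
    return left_ns + left_ew


def choose_green_split(queues, candidates=(15, 30, 45)):
    """Pick the candidate north-south green time with the least predicted waiting."""
    return min(candidates, key=lambda g: predicted_waiting(queues, g))


detected = {"ns": 25, "ew": 5}  # hypothetical detector counts per approach
best = choose_green_split(detected)
print(best)
```

With a long north-south queue, the sketch gives that direction the longest candidate green time. Reversing the counts shifts the green time to the east-west approach, with no manual reprogramming.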