This week’s list of data news highlights covers June 30 – July 6, 2018, and includes articles about home security systems using AI and an effort to sequence the koala genome.
Researchers at DeepMind have created an AI system capable of beating humans at the video game Quake III with minimal human input. The researchers had AI agents repeatedly play Quake III’s “capture the flag” game mode against each other without any instructions, and introduced 30 agents trained to represent different play styles to the mix. Unlike other efforts to achieve human-level performance in video games, DeepMind did not give its bots access to raw data about the game, limiting them to training on the same visual information that a human player would see. After approximately half a million 5-minute games, the AI system was able to learn basic rules as well as strategies, such as following teammates to gang up on enemies.
Home security companies are increasingly using AI to better protect customers’ homes. For example, Pittsburgh firm Edgeworth Security uses AI to analyze security camera video and automatically detect when a person comes onto the property or loiters nearby, and uses facial recognition to differentiate between regular visitors, such as mail carriers or gardeners, and strangers. AI-powered security systems could help make home security more reliable, as a 2011 study from the U.S. Department of Justice found that burglar alarms, while cheaper than human security guards, generated mostly false alarms.
Researchers at the University of California, Merced have developed a machine learning system that can generate an approximate representation of the ground-level view of an image taken via satellite. The researchers used a generative adversarial network, in which one neural network generates a ground-level image based on a satellite image and another neural network evaluates this image and provides feedback so that, over time, the generated images become more accurate. This approach could allow geographers to more easily classify land by its use, such as whether it is farmland or an urban area, using satellite imagery alone, a task that typically requires ground-level images.
Researchers at the University of Sydney have for the first time sequenced the koala genome in hopes that it will improve conservation efforts. Koalas face unique threats, including their exclusive diet of a particular species of poisonous eucalyptus with limited nutritional value, habitat destruction causing populations to be isolated, and a widespread chlamydia epidemic. With this genetic data, scientists can begin experimenting with activating or deactivating genes that could improve koalas’ resilience, such as genes that influence digestive enzyme production to enable koalas to have a more diverse diet. They can also begin developing vaccines to improve koalas’ resistance to chlamydia, which affects a large share of the population and causes blindness, leaving koalas unable to forage or escape predators.
Scientists at the California Institute of Technology have created an artificial neural network that uses engineered strands of DNA to process data. The researchers created molecular sequences of DNA strands representing individual pixels in a 10 by 10 grid, so that combined they would translate to a small image of a handwritten digit from one to nine. Then, rather than train an algorithm to recognize these patterns, the scientists engineered strands of DNA that could carry out chemical reactions to classify these DNA sequences based on the digit they represent and exhibit fluorescence corresponding to the digit they identify.
A startup called Tagwalk is using AI to power a search engine for fashion, allowing users to search for fashion based on color, fabric, season, brand, and other parameters. Tagwalk, like other search engines for clothes, initially used humans to tag items based on these factors, but can now automate most of this process, relying on humans only for confirmation. Tagwalk also uses AI to analyze what its users search for, allowing it to develop insights into emerging fashion trends.
England’s National Health Service (NHS) will begin routinely offering genomic medicine on October 1, 2018. NHS hospitals will offer patients the opportunity to share their genetic data with specialists to help diagnose rare diseases, identify effective treatments, and avoid adverse drug reactions. The NHS will also routinely sequence cancer patients’ tumor DNA to help develop new treatments and connect doctors and patients with clinical trials or experimental therapies that could be beneficial.
Researchers at the Massachusetts Institute of Technology have developed music editing software called PixelPlayer that uses AI to analyze video of a musical performance and isolate the sounds coming from different instruments so users can edit them. Unlike traditional music mixing, which requires multiple channels of audio data corresponding to different instruments, PixelPlayer allows users to adjust the volume of different instruments just from video footage. PixelPlayer can identify the sounds of 20 common instruments, but its developers say it could learn to identify more given more training data.
Researchers from universities in India, China, and Malaysia have developed a machine learning system that can analyze handwritten English text and determine if the writer is from Bangladesh, China, India, Iran, or Malaysia. The researchers created a dataset of 100 people from each country who could write in English and trained an algorithm to analyze the different properties of the text. The system learned that writers from different countries exhibit unique traits in their writing, such as how Bangladeshi and Indian writers use curvy script while Chinese writers use straighter lines.
Researchers at the European University Institute in Florence have developed a prototype AI tool called Claudette that can analyze technology companies’ terms and conditions and identify language that potentially runs afoul of the EU’s General Data Protection Regulation (GDPR). Claudette flags sentences in terms and conditions documents that contain unclear or insufficient language to satisfy GDPR requirements, such as sentences that do not fully inform users about how companies share data with third parties.
Image: Mathias Appel.