This week’s list of data news highlights covers April 1 – 7, 2017, and includes articles about a new bill to codify U.S. federal open data requirements and a machine learning system that can help diagnose depression.
The U.S. Department of Veterans Affairs Assistive Technology (AT) program is helping physically disabled veterans live more autonomously in their homes by setting them up with connected smart home devices. If a veteran’s doctor believes he or she could benefit from smart home technology, such as voice-controlled lighting for veterans with mobility impairment, the AT program identifies different technologies that would best meet the veteran’s needs. The AT program makes a wide array of smart home products available to veterans, including the Amazon Echo and Nest Learning Thermostat.
FarmView, a Carnegie Mellon University initiative, run in partnership with agricultural organizations and other universities, is developing AI systems that can help automate agricultural processes and make them more efficient. For one project, FarmView developed a robotic system that uses sensors and software similar to those of self-driving cars to navigate itself through a field of sorghum, uses computer vision to identify signs of disease on the plants, and uses lasers to measure plant height and volume. In another project, FarmView developed a computer vision system that could analyze images of grape vines to automatically count the number of grapes and leaves, which can reveal if the vine is getting enough water.
Google has deployed a suite of machine learning tools that analyze YouTube videos and identify potentially objectionable content, so that companies’ advertisements are not inadvertently paired with videos that promote content such as violence or racism. Because so much content is added to YouTube on an ongoing basis, having humans review even a small percentage of videos would be incredibly resource intensive. Google is training the tools on human-verified examples of both safe and objectionable content, having them analyze imagery as well as language and video metadata, so that they can flag videos as inappropriate for advertising as they are uploaded.
The “Ahead 300,” a handheld electroencephalography (EEG) device that uses a disposable sensor-laden headset, has completed a clinical trial showing that it can rapidly detect whether a person with a head injury has a brain bleed with 97 percent accuracy. The device, developed in partnership with the U.S. Department of Defense and medical device firm BrainScope Company, could be used to help avoid subjecting patients with head injuries to computerized tomography (CT) scans, which can be costly. Since brain bleeds can be quite serious, many patients are directed to get CT scans after a head injury, though studies have shown that over 90 percent of scans reveal that patients are not suffering from brain bleeds.
Australia’s National Health and Medical Research Council has launched a 20,000-person genomics study that will attempt to identify genetic markers related to depression. Participants’ DNA will be sequenced, and this data will contribute to an international study of 200,000 individuals with depression. With this data, researchers hope to identify why some people have a greater risk of developing depression, as well as why some patients respond positively to antidepressant medications while others do not.
IBM, along with researchers at Harvard University and Princeton University, has developed a miniature sensing device that can continuously monitor the air for the presence of methane molecules based on how they distort a beam of laser light. The device is just five square millimeters and can also detect the concentration of methane in the air. The device could help gas companies remotely monitor large portions of oil and gas wells to detect methane leaks substantially more efficiently than traditional methods, which rely on infrared cameras and human oversight.
The U.S. Food and Drug Administration (FDA) has decided that consumer genomics company 23andMe can provide customers with information about their risk levels for diseases based on their genetic data. The FDA has prevented consumer genomics companies from offering disease risk services for years due to potential concerns about the accuracy or risk of such analysis, but it has since decided that it is allowable as long as the genetic tests are not designed to diagnose conditions or inform treatment decisions. 23andMe can now inform customers about their genetic risk of 10 different diseases, including Parkinson’s disease and late-onset Alzheimer’s disease.
Google has developed a new method for training machine learning models, called federated learning, that allows it to use data stored locally on smartphones. Android smartphone users’ activity will train a copy of the model on their own devices, and each device will transmit only encrypted updates back to Google. Google combines these updates to improve the shared model without forcing consumers to share their data. Google is initially experimenting with federated learning for text prediction on Gboard, its keyboard for Android smartphones.
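To make the idea concrete, here is a minimal sketch of federated averaging on a toy problem, written under loud assumptions: the one-dimensional linear model, the function names (`local_update`, `federated_average`, `train`), the learning rate, and the sample data are all hypothetical and are not taken from Google's system. The sketch also leaves out the encryption step mentioned above. It only illustrates the general pattern in which each device trains on its own data and the server sees model updates rather than the data itself.

```python
def local_update(weights, data, lr=0.1, epochs=5):
    """Train on one device's private data (hypothetical toy model y = w*x + b).

    Only the resulting weights leave this function, never the data.
    """
    w, b = weights
    n = len(data)
    for _ in range(epochs):
        grad_w = 0.0
        grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x
            grad_b += 2 * err
        # Full-batch gradient descent step on the mean squared error.
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return (w, b)


def federated_average(client_weights, client_sizes):
    """Combine client models, weighting each by how much data it trained on."""
    total = sum(client_sizes)
    w = sum(cw[0] * n for cw, n in zip(client_weights, client_sizes)) / total
    b = sum(cw[1] * n for cw, n in zip(client_weights, client_sizes)) / total
    return (w, b)


def train(clients, rounds=500):
    """Server loop: send the shared model out, collect updates, average them."""
    global_weights = (0.0, 0.0)
    sizes = [len(data) for data in clients]
    for _ in range(rounds):
        updates = [local_update(global_weights, data) for data in clients]
        global_weights = federated_average(updates, sizes)
    return global_weights


# Made-up example: three "devices", each holding a different slice of
# points from the line y = 2x + 1. No device sees another device's points.
CLIENTS = [
    [(x, 2 * x + 1) for x in (0.0, 0.1, 0.2, 0.3)],
    [(x, 2 * x + 1) for x in (0.4, 0.5, 0.6)],
    [(x, 2 * x + 1) for x in (0.7, 0.8, 0.9, 1.0)],
]
```

Calling `train(CLIENTS)` returns a `(w, b)` pair close to the underlying line, even though the server-side loop only ever handles model weights. A production system would add safeguards this sketch omits, such as encrypting the updates in transit.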
A group of journal publishers and scientific organizations, including the Wikimedia Foundation and SAGE Publishing, has launched a project called the Initiative for Open Citations (I4OC), which aims to make citation data from scientific articles freely available to the public. I4OC has partnered with 29 publishers to release citation data from 14 million papers indexed by nonprofit scholarly information collaboration Crossref, or 40 percent of all the papers it indexes. Prior to I4OC, publishers were making citation data freely available for just one percent of the papers on Crossref. Open citation data can help scientists stay better informed about research in their field as well as support the development of new scholarly tools and services.
Facebook has implemented a photo-matching algorithm designed to prevent the sharing of sexual images without the consent of their subjects, also known as revenge porn. If someone shares a sexual image and another user flags it as being shared without consent, the system will remove the image, and it can then automatically detect and block any attempt to reupload that same image. The system has been deployed to both Facebook and Instagram, which Facebook owns.
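The article does not say how Facebook's matching works, so the following is only an illustrative sketch of one common family of techniques, perceptual hashing, and should not be read as Facebook's method. It assumes images have already been reduced to small grayscale grids of equal size, and the names `average_hash`, `hamming`, and `Blocklist`, along with the `max_distance` threshold, are hypothetical.

```python
def average_hash(pixels):
    """Return one bit per pixel: 1 if the pixel is brighter than the image mean.

    `pixels` is a 2-D list of grayscale values (assumed already downscaled).
    """
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if p > mean else 0 for p in flat)


def hamming(a, b):
    """Count the positions at which two equal-length hashes differ."""
    return sum(x != y for x, y in zip(a, b))


class Blocklist:
    """Stores hashes of removed images and checks new uploads against them."""

    def __init__(self, max_distance=2):
        # Hypothetical tolerance: how many differing bits still count as a match.
        self.max_distance = max_distance
        self.hashes = set()

    def add(self, pixels):
        """Record the hash of an image that was flagged and removed."""
        self.hashes.add(average_hash(pixels))

    def is_blocked(self, pixels):
        """Return True if the upload is a near-duplicate of a removed image."""
        h = average_hash(pixels)
        return any(hamming(h, known) <= self.max_distance for known in self.hashes)
```

Because the hash depends on each pixel's brightness relative to the image mean, a copy that has been uniformly brightened or lightly re-encoded tends to produce the same or a nearby hash, which is why a small Hamming-distance threshold is used rather than an exact comparison. Storing only hashes also means the blocklist does not need to retain the removed images themselves.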