10 Bits: the Data News Hotlist
This week’s list of data news highlights covers July 30 – August 5, 2016 and includes articles about how Facebook is cracking down on clickbait and a robotic onesie that could fight cerebral palsy in infants.
Researchers from the University of Pennsylvania and Massachusetts General Hospital have analyzed genetic data from consumer genomics company 23andMe to reveal 15 genetic variations linked with depression. The researchers analyzed anonymized data from 75,607 23andMe customers of European descent who self-reported having depression and compared the genetic variations against the genetic data of 231,747 healthy people, making this the first large-scale study on depression for people with European ancestry. The study revealed that the genetic regions associated with depression are related to neuron development in the brain and are linked with psychiatric disorders such as schizophrenia and bipolar disorder.
Baidu has developed an augmented reality platform called DuSee that uses computer vision and deep learning algorithms to overlay virtual content over users’ smartphone videos. Unlike other systems such as Pokémon Go, which just insert an animation over a video, or more complex systems that use advanced 3D mapping technology, DuSee interprets smartphone camera data to add dynamic content that responds as the image changes or the camera moves. Baidu will roll out DuSee into its collection of smartphone apps, including its popular Mobile Baidu search app.
Facebook has updated its algorithm that determines what appears on users’ news feeds to automatically identify and reduce the visibility of links users share with clickbait headlines. Facebook analyzed tens of thousands of headlines to identify phrases that indicate a headline is clickbait, based on whether the headline withholds information necessary for a user to understand an article’s content or creates misleading expectations by exaggerating an article’s content. The updated news feed algorithm will analyze headlines and filter out clickbait similar to how an email spam filter hides unwanted messages, as well as identify pages or domains that consistently post clickbait and reduce their visibility.
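As a rough illustration of the phrase-matching idea described above, here is a minimal Python sketch. It is not Facebook's method: the phrase list, the function names `clickbait_score` and `domain_clickbait_rates`, and the scoring rule are all assumptions made for the example, and a real system would likely use a trained classifier rather than a fixed list.

```python
from collections import defaultdict
from urllib.parse import urlparse

# Hypothetical phrase list; the actual phrases Facebook identified are not given.
CLICKBAIT_PHRASES = (
    "you won't believe",
    "what happened next",
    "will shock you",
)


def clickbait_score(headline: str) -> float:
    """Return the fraction of listed phrases found in the headline (0.0 to 1.0)."""
    text = headline.lower()
    hits = sum(phrase in text for phrase in CLICKBAIT_PHRASES)
    return hits / len(CLICKBAIT_PHRASES)


def domain_clickbait_rates(posts):
    """Map each domain to the share of its headlines that match any listed phrase.

    `posts` is an iterable of (url, headline) pairs.
    """
    totals = defaultdict(int)
    flagged = defaultdict(int)
    for url, headline in posts:
        domain = urlparse(url).netloc
        totals[domain] += 1
        if clickbait_score(headline) > 0:
            flagged[domain] += 1
    return {domain: flagged[domain] / totals[domain] for domain in totals}
```

A feed ranker could then lower the visibility of individual links with a high score and of domains whose rate stays high over time, which mirrors the two levels of filtering mentioned in the item.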
New Mexico-based startup Descartes Labs has developed a machine learning system that can analyze satellite imagery of farmland to predict corn yields more accurately than the U.S. Department of Agriculture’s (USDA’s) official estimates. Descartes’ algorithm consistently beats USDA’s predictions for every period in the growing season with just a 2.5 percent margin of error by analyzing daily satellite data of every farm in the United States and updating its yield prediction every other day. The entire agricultural supply chain, including farmers, insurance companies, and commodities traders, relies on USDA’s estimates for planning purposes, and with better data from Descartes, they could make smarter decisions.
The U.S. Office of Management and Budget has finalized its Data Center Optimization Initiative (DCOI) to improve how the federal government manages its data centers. DCOI forbids federal agencies from allocating funds to create or expand data centers, with a few exceptions, to encourage agencies to make more efficient use of their existing data center resources. Under the new policy, federal agencies must complete a review of their data center inventories by the end of August 2016, close 52 percent of federal data centers, and reduce the real estate footprint of data centers by 31 percent, which is expected to result in $2.7 billion in yearly savings by 2018.
Researchers at the Massachusetts Institute of Technology have developed a technique called Interactive Dynamic Video (IDV) that uses algorithms to analyze how objects in a video could move and allow a viewer to push and pull objects on the screen to see how they react. When an object moves, it creates subtle, almost imperceptible vibrations in response to its environment. Normally, modelling movement requires costly 3D modelling techniques, but with IDV, algorithms analyze these vibrations in video captured with a traditional camera to extrapolate how an object would move under different circumstances.
The 2016 Rio Olympics will feature a large number of athletes who rely on a variety of data-driven approaches to improve their performance. For example, the boxing team from Great Britain has been using analytics software developed by Sheffield Hallam University called iBoxer that analyzes large amounts of data on boxers and potential opponents to identify the best tactics to use in a particular matchup. The U.S. track cycling team has been training with augmented reality glasses that display each cyclist’s performance data in real time so the athletes can better monitor their training. And the German sailing team has partnered with analytics firm SAP to develop virtual models that can help the team better understand how to respond to changing wind and ocean current conditions.
Researchers at the University of Oklahoma have developed a robotic, sensor-laden onesie that captures granular data about how infants move and could help slow the progression of cerebral palsy. The onesie captures 50 data points per second about an infant’s arm and leg positions and a connected robotic apparatus helps the infant crawl in his or her intended direction. Cerebral palsy, which affects muscle strength and motor skills, is incurable, but if detected early enough, physical therapy can slow the progression of the disease. The researchers are carrying out a study to more accurately measure the onesie’s effectiveness after an initial prototype markedly improved infants’ motor skills.
The U.S. Defense Advanced Research Projects Agency (DARPA) concluded its Cyber Grand Challenge, a contest to develop fully automated cybersecurity software, declaring startup ForAllSecure the preliminary winner for its autonomous bot named Mayhem. The challenge pitted teams against each other to have their software race to identify and exploit bugs in a set of machines while responding to attacks and fixing their own vulnerabilities. Normally, identifying and repairing bugs can be incredibly difficult and require extensive human oversight, but Mayhem and other bots were able to rapidly find and analyze bugs, including ones DARPA was not even aware of. Mayhem beat out the competition by using statistical analysis and game theory to prioritize which of its own bugs to patch without compromising its performance.
Scientists at IBM have successfully developed artificial neurons that can store and process data in a similar way to neurons in the human brain. The artificial neurons use phase-change materials (materials capable of storing and releasing a large amount of energy) that respond to electrical impulses the same way that a neuron fires to store data in an analog, rather than digital, format. The artificial neurons, like real neurons, only require very small amounts of energy to perform complicated computations, making them adept at tasks like high-speed unsupervised learning. IBM’s breakthrough could allow for more powerful nano-scale computing and advanced machine learning applications.
Image: Caroline Granycome.