This week’s list of data news highlights covers July 7-13, 2018, and includes articles about a machine learning system that can generate molecules that could be used for drugs and an analytics system that can predict employees’ future performance.
Researchers at Carnegie Mellon University have developed a method for optimizing traffic flows through intersections using vehicle-to-vehicle communication. Modern vehicles increasingly come equipped with short-range radio systems that can transmit data about their speed, direction, and location, as well as receive this data from other vehicles. The researchers developed a virtual traffic light protocol that gives drivers in different vehicles green or red lights based on this data to inform them whether they have the right of way at an intersection. In a simulation, the system reduced commute times by over 20 percent for unsignaled intersections.
A company called Celmatix has developed algorithms that can create personalized predictions of a woman’s likelihood of becoming pregnant. Celmatix, which provides this service at 10 fertility clinics in the United States, developed its program, called MyFertility Compass, using data from nearly 1 million fertility treatment outcomes. The program uses a 14-question survey about a woman’s physical attributes and lifestyle to predict how likely she is to become pregnant within 12 months. Celmatix has also partnered with 23andMe to study how genetic data, combined with clinical, environmental, lifestyle, and other data, influences fertility chances.
Researchers at the Massachusetts Institute of Technology have developed a machine learning model that can generate and improve upon simulated molecules that could lead to new drugs. Developing new medicines is labor intensive because building and adjusting new molecules takes time, and the resulting compounds may not perform as intended. The researchers trained a machine learning system on 250,000 molecular structures and had it generate new molecules based on certain desired properties. In tests, the system produced molecules that function as intended 100 percent of the time, whereas other automated approaches did so only 44 percent of the time. It also generated ideal base molecules and improved upon base molecules more reliably than existing methods.
Researchers at Siemens and the University of California, Berkeley have launched a program called Dex-Net as a Service that lets users help teach an AI system how a robot should best grasp an object. Robotic grasping is challenging, as it requires delicate physical manipulation and an understanding of how to actually pick something up, which comes naturally to humans but is difficult to program. Dex-Net as a Service lets users upload 3D renderings of objects, predicts which parts of each object are the most viable to grasp with a robotic hand, and gives users the opportunity to provide feedback.
The U.S. Food and Drug Administration (FDA) has announced plans to use electronic health record (EHR) data from 10 million people to improve its post-market monitoring system, which evaluates the efficacy and safety of medical products already in use. The FDA currently relies on healthcare payer claims data for post-market monitoring, which is limited in its usefulness because it is not timely. The FDA hopes its new system could allow it to use de-identified EHR data in near real-time.
Researchers from Nvidia, Aalto University, and the Massachusetts Institute of Technology have developed an AI system that can enhance a grainy image without information about what the clear image looks like. AI systems that can remove image grain, common in low-light images, already exist, but they require a grain-free version of each photo in addition to the original to learn how to enhance it. The researchers’ system can substantially reduce grain in low-light photos and medical imagery in milliseconds.
IBM has developed an analytics system that can analyze information about an employee’s experiences and progression to help managers make decisions about bonuses, pay, and promotions. The system can assess projects employees have completed as well as analyze IBM’s internal training system to determine how employees are developing different skills and predict how well an employee is likely to perform in the future. After using the system internally to help determine employee advancement, IBM spot-checked its predictions against employee performance and found it to be 96 percent accurate.
Facebook has agreed to share data with an independent research group called Social Science One to help researchers study the spread of misinformation and its impact on elections. Facebook will share an initial one-petabyte dataset of public Facebook posts, along with anonymized information about the demographics of people who interacted with the posts and how they engaged with them. Social Science One will allow researchers to apply for funding and access to the data and will control the approval process.
Researchers at the University of Minnesota have developed a machine learning system that can analyze EHR data from seriously ill patients and accurately predict their mortality risk over the course of a year. The system analyzes data such as demographic information, blood work, and other factors to predict mortality risk for up to a year after a patient’s last day in the hospital. This system could allow doctors and patients to make better informed decisions about end-of-life care, such as whether to perform invasive procedures that are not likely to reduce mortality risk.
Researchers at Northwestern University analyzed data about the careers of 3,480 artists, 6,233 film directors, and 20,040 scientists and determined that a person is most likely to create their most successful works over a four- to five-year period. The researchers used field-specific metrics for success, such as auction prices of artwork, movie review scores, and citations garnered for scientific research. They found that 91 percent of artists, 82 percent of directors, and 90 percent of scientists experience at least one hot streak in their careers, in which they produce several high-impact works in a sequence, despite no changes in productivity during these periods.