This week’s list of data news highlights covers June 17-23, 2017, and includes articles about New Zealand’s new commitments to Open Data and an AI system protecting the Large Hadron Collider.
Researchers at New York University and scientific research organization the Kavli Foundation have begun to recruit participants for the Human Project, a 20-year research initiative to learn as much as possible about the relationships between human biology, behavior, and the environment. The Human Project will recruit 10,000 participants in New York to donate data about every aspect of their lives, ranging from financial records to fitness tracker data and urine samples. The Human Project will use this data to develop a research platform scientists can use to advance a wide variety of disciplines.
The U.S. Department of Homeland Security (DHS) and Google have launched a contest on data science challenge platform Kaggle to develop neural networks that can improve the security screening process at airports. DHS will provide challenge participants with over 1,000 3D body scans from security checkpoint body scanners and will award $1.5 million to the team that develops the best algorithm for automatically identifying concealed items.
The European Organization for Nuclear Research (CERN) is developing machine learning software to protect the computer grid that manages the data from the Large Hadron Collider (LHC) from cyber attacks. The LHC will generate 50 petabytes of data in 2017 alone, and this computer grid allows over 8,000 researchers at 170 institutions across 40 countries working with the LHC to access, share, and analyze this data. The scope of this system poses significant challenges for standard cybersecurity approaches, which could limit the accessibility of this data. CERN is training its software to detect potentially hostile intrusions into this grid and automatically shut down suspicious activity without affecting the performance of the grid as a whole.
Researchers at the Massachusetts Institute of Technology have developed a method for improving the resolution of magnetic resonance imaging (MRI) scans that makes them more useful for diagnostics. MRI machines produce 2D scans of one-millimeter cross-sections of a patient’s brain that are typically spaced five to seven millimeters apart, which software then reconstructs into a 3D model of the brain. Doctors are forced to keep the resolution of these scans low due to the limited availability of MRI machines. The researchers’ new method uses algorithms that extrapolate from a set of scans and can generate the missing portions of anatomical structures in low-resolution areas to make a more representative 3D model.
New Zealand Minister of Statistics Scott Simpson has announced that the government will invest NZ $7.2 million (US $5.25 million) over the next three years to accelerate the publication of government data as open data. New Zealand has prioritized the publication of open data to help fulfill its Business Growth Agenda by 2025, which aims to encourage business confidence and spur private sector investment.
Nashville-based AI firm Skopos Labs has developed an AI system that can analyze the language in bills proposed in Congress and determine the likelihood they will pass. Skopos Labs trained its system on legislative data from the 103rd Congress in 1993 through the 113th Congress, which ended in 2015, to teach it to interpret language and identify relationships between outcomes and contextual variables in a bill’s language. Bills fail 96 percent of the time, so it is a safe bet to simply assume a bill will fail, but Skopos Labs’ system was able to predict the likelihood of a bill’s passage 65 percent more accurately than just predicting a bill would not pass by default.
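The "safe bet" baseline described above can be made concrete with a little arithmetic. The following is a minimal sketch, not Skopos Labs' method: it assumes a hypothetical sample of 10,000 bills with the 96 percent failure rate cited above, and shows why always predicting failure scores well on accuracy while never identifying a bill that passes. How Skopos Labs measured its 65 percent improvement is not stated here, so the sketch covers only the baseline.

```python
def always_fail_baseline(n_bills: int, fail_rate: float) -> dict:
    """Score a baseline that predicts 'fail' for every bill.

    n_bills is a hypothetical sample size chosen for illustration;
    fail_rate is the share of bills that fail (0.96 in the article).
    """
    n_fail = round(n_bills * fail_rate)
    n_pass = n_bills - n_fail

    # Every failing bill is predicted correctly; every passing bill is missed.
    accuracy = n_fail / n_bills
    passing_bills_identified = 0
    recall_on_passing = passing_bills_identified / n_pass if n_pass else 0.0

    return {
        "failing_bills": n_fail,
        "passing_bills": n_pass,
        "accuracy": accuracy,
        "recall_on_passing": recall_on_passing,
    }


metrics = always_fail_baseline(n_bills=10_000, fail_rate=0.96)
for name, value in metrics.items():
    print(f"{name}: {value}")
```

Under these assumptions the baseline is right 96 percent of the time yet flags none of the 400 passing bills, which is why a useful predictor has to be judged against this default rather than against chance.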
NASA’s Curiosity rover has been using autonomous technology called Autonomous Exploration for Gathering Increased Science (AEGIS) since May 2016 to automate and accelerate its data collection as it travels across Mars. AEGIS allows Curiosity to operate an analysis tool called ChemCam without human input, substantially increasing its productivity as it eliminates the need to rely on signals that can take up to 20 minutes to reach Mars from Earth.
Researchers from Pennsylvania State University and weather forecasting company AccuWeather have developed an algorithm that can identify subtle patterns in radar images associated with potentially dangerous winds. In radar imagery, when one portion of a line of thunderstorms moves faster than another, a bow-shaped pattern called a bow echo appears, which can indicate where the most serious damage from a storm will occur. The researchers’ algorithm can automatically spot the early formation of bow echoes, which could help meteorologists more quickly identify whether a storm will become dangerous and improve preparation efforts.
Researchers at Google have developed a machine learning framework called MultiModel capable of accomplishing multiple kinds of tasks. Unlike most machine learning systems, which focus on single specific tasks such as image recognition, MultiModel was trained on a variety of tasks, including image recognition, translation, and speech recognition. Though MultiModel is not necessarily more effective than a machine learning system with a single focus, its accuracy on individual tasks improved the more it trained on others. As a result, this approach could lead to broadly applicable machine learning systems that do not need substantial amounts of training data for each type of task they perform.
Laparoscopic surgery, which involves a fiber-optic camera inserted into a patient to guide surgery, can generate huge amounts of video data that largely go unused for training due to the effort involved in sifting through hours of video to identify the relevant stages of an operation. Researchers at the Massachusetts Institute of Technology have developed a machine learning system that can analyze these video files and identify specific stages of the surgical process, making it easier for doctors to jump to specific portions of videos without having to review the whole file.
Image: NASA Jet Propulsion Laboratory.