10 Bits: the Data News Hotlist
This week’s list of data news highlights covers May 14-20, 2016, and includes articles about the first artificially intelligent lawyer and a new database that makes Ireland’s nonprofit sector more transparent.
Rutgers University and IBM’s World Community Grid, an initiative to allow volunteers to contribute unused computing power for scientific research, have launched OpenZika, a project to allow researchers developing a treatment for the Zika virus to use the spare processing power from the smartphones and computers of volunteers. Developing a treatment for a disease involves analyzing and testing millions of drug compounds and protein structures, which requires massive amounts of computing power beyond the reach of most scientists. OpenZika will allow scientists working on a cure for the Zika virus to quickly analyze drug databases, conduct simulated experiments, and share results with the help of the processing power of the three million computers and mobile devices World Community Grid volunteers have made available, which combined are equivalent to one of the world’s most powerful supercomputers.
The Western Australian government has launched WA Emergency Waiting Times, a smartphone app that uses open data about hospital wait times to give residents in the city of Perth an accurate estimate of how long they should expect to wait for non-life-threatening emergencies at nearby hospitals. The app also combines the wait time with geolocation and mapping data to give users an estimate of the total amount of time they would spend, including travel time, before they could get treatment at these hospitals.
U.S. law firm BakerHostetler has partnered with startup ROSS Intelligence to use its artificial intelligence software ROSS, powered by IBM’s Watson cognitive computing platform, to conduct legal research. ROSS is capable of rapidly analyzing traditional legal resources, such as case law libraries and legislation, interpreting court decisions and other unstructured data, and automatically flagging new, related court decisions as they are issued to provide lawyers with the most relevant and current information about their cases. By making the legal research process more efficient, ROSS could help law firms substantially reduce overhead costs and potentially make legal services more accessible.
DeCode Genetics, an Icelandic genomics company owned by pharmaceutical firm Amgen, analyzed the DNA of 300,000 Icelanders to identify a genetic variant that reduces heart attack risk by 35 percent. DeCode’s analysis revealed that only 1 in 120 Icelanders has the particular variant of a gene named ASGR1 and, by analyzing their medical records, demonstrated that the variant is linked to lower levels of bad cholesterol and a life expectancy one-and-a-half years longer than average. Amgen is now developing a drug that can mimic the effects of the gene and expects that its strategy of mining genetic databanks to advance drug discovery could reduce the average 14-year process of bringing a new drug to market by 18 months.
Irish nongovernmental organization Benefacts has launched the Benefacts database, a publicly accessible database of information on how organizations that make up Ireland’s $7.8 billion nonprofit sector spend their money. The government already publishes data on how nonprofits, such as trade associations, sports teams, and charities, spend their money, but does so in a fragmented manner through nine different regulatory agencies. Benefacts pulls all of this data together in an easy-to-interpret format and makes it available through its website as well as an application programming interface so developers can pull this data into their own applications.
Australian startup Nura has developed headphones that analyze data about a wearer’s inner ear to develop a personalized profile of music levels designed to make music sound better for the wearer. Nura learns about a wearer’s inner ear by playing a series of tones and using a microphone designed to record otoacoustic emissions—subtle sounds produced in the cochlea in response to stimulation, which vary from person to person. By analyzing this data, Nura can tell different wearers apart and switch between sound profiles optimized to make music sound better based on the unique physiology of wearers’ ears.
A coalition of researchers from Lawrence Berkeley National Laboratory and six universities is using the Department of Energy’s Titan supercomputer to devise ways to clean up the remaining radioactive waste from World War II, which requires careful analysis to remove and store safely. Sites that manufactured nuclear material during the war disposed of large amounts of waste in storage tanks, a third of which are now leaking, and it can be difficult to tell the specific type of waste in each tank. The researchers are using Titan to carry out complex simulations of the chemical reactions involved in various potential decontamination and removal methods and to identify any adverse effects.
Strayer University has piloted a predictive analytics system that can learn student study habits and intervene when a student in an online course is at risk of performing poorly or failing a class. The system analyzes how much time students spend on online learning activities, the frequency and type of students’ posts on online class forums, and when students perform these activities, and sends carefully timed, personalized messages to students when this data indicates they are at risk of underperforming. After the 11-week pilot, course drop rates fell by 7 percent, student performance increased by 12 percent, and the number of students considered at-risk dropped by 17 percent.
The U.S. Federal Communications Commission (FCC) has launched the Consumer Complaint Data Center, a publicly accessible portal that catalogues complaints received by the FCC and allows users to analyze these complaints with data visualization tools. The FCC does not make the text of each complaint available, but tags each complaint with metadata about when and where it was made, the type of complaint, and other criteria. The Consumer Complaint Data Center reveals trends in the kinds of complaints the FCC receives and can help policymakers identify problematic areas, such as robocalling or billing issues, that might require regulatory action.
The U.S. Food and Drug Administration (FDA) has released draft guidance on using data from electronic health records (EHRs) in clinical trials. The guidance includes recommendations for health-care organizations on best practices for using EHR data in trials, including how to ensure data integrity, address interoperability issues between health IT systems, and protect against improper use of patient data. By promoting best practices for EHR data use, the FDA aims to reduce the length and cost of the clinical trial process, such as by streamlining the patient recruitment process.