
Published on February 28th, 2014 | by Travis Korte


10 Bits: The Data News Hot List

This week’s list of data news highlights covers February 22-28 and includes articles on a TV network’s plan to leverage data mining to find news stories and an Italian initiative that uses data to track down dog owners who fail to clean up after their pets.

1. Call for Industry to Use Oceanographic and Atmospheric Data

The National Oceanic and Atmospheric Administration (NOAA) collects over 20 terabytes of data per day, but only a small percentage of this is easily accessible. This week, U.S. Secretary of Commerce Penny Pritzker called on private companies to partner with NOAA to unlock and make use of this data, with an eye toward creating new products and driving economic growth. The Department of Commerce has released a Request for Information, asking companies to weigh in on the feasibility of the project.

2. Data Science and Dog Waste

Officials in Naples, Italy, are piloting a data-driven program to identify dog owners who fail to pick up after their pets. The program would collect blood samples from every dog in the city to build a database of DNA profiles, allowing the city to match waste to individual dogs. Once the offending canine is identified, the city could send a fine to its registered owner.

3. TV Network Partners with Data Mining Startup

MSNBC is partnering with news technology startup Vocativ to mine the Internet for stories it will use in the new Ronan Farrow Daily program. Vocativ, whose founder also launched a global security firm that helps governments and corporations manage and analyze information, was originally developed to help companies identify business threats. The software that underlies Vocativ searches social media, chat rooms and other online data to predict stories’ growth potential.

4. The State of Patient Matching Technology

The Office of the National Coordinator for Health Information Technology issued a report this week on the present and future of the technology used to match patients’ electronic health records across different organizations. The report recommends that organizations making any changes to patient data attributes coordinate with other groups working on related topics, and that organizations not be required to use a specific type of patient matching algorithm.

5. Qualitative Data Repository Launches

This week saw the launch of the Qualitative Data Repository at Syracuse University’s Center for Qualitative and Multi-Method Inquiry. The repository, which was designed in response to a perceived tendency in the social sciences to collect qualitative data once and never reuse it, will allow researchers to reuse qualitative data more efficiently. The repository assigns the data persistent, unique identifiers and provides a library of guidance and resources to help scholars manage the data.

6. Cinema Analytics Examines Films’ Gender Gap

Cinemetrics, the statistical study of films, has been around for decades, but it has recently been applied to a new topic: the on-screen gender gap. Looking at screen time, analysts found that this year’s Oscar nominees for best actor averaged 85 minutes on screen, while nominees for best actress averaged only 59 minutes. However, in several cases women had greater screen time per shot than men, lending credence to the idea that women in films are sometimes put on display for male audiences.

7. ProPublica Launches Data Store

Investigative journalism site ProPublica released its “Data Store” platform this week, offering data the site has used in its reporting. The site will offer free downloads of information obtained through Freedom of Information Act (FOIA) requests and charge one-time fees for data that the organization has cleaned or devoted significant effort to modifying. So far, the site charges approximately $200 for journalists and $2,000 for academics. The datasets are grouped into three categories: Health, Business and Transportation.

8. Opening Flood Data in the UK

The UK’s Environment Agency will soon release a large quantity of flood data, including real-time river levels and flood maps. The agency, which has long contended that it needed to keep the data closed in order to collect licensing fees, recently relented in the wake of accusations that it prevented people from finding out how harsh winter weather would affect their homes. The data could enable businesses to develop local flood warning systems and other products.

9. DARPA’s Plan for a Next-Generation Decision Support System

The Defense Advanced Research Projects Agency (DARPA), part of the U.S. Department of Defense, announced this week that it is launching the data-driven Distributed Battle Management program. The program would address the increasing complexity of tech-enabled battlefield conditions with decision aid software and control algorithms. The tools would be integrated into airborne systems used by pilots and battle managers.

10. Researchers Borrow Supercomputer to Sequence Genomes

Researchers at the University of Chicago published a paper in the journal Bioinformatics this week, detailing how they used the university’s Beagle supercomputer to accelerate genomic data analysis. The researchers were able to analyze 240 whole genomes in 50 hours, a high throughput given the complex operations genomic analysis involves. The researchers hope their approach will save money, bringing the field closer to the widely cited milestone of sequencing a genome for $1,000.



About the Author

Travis Korte is a research analyst at the Center for Data Innovation specializing in data science applications and open data. He has a background in journalism, computer science and statistics. Prior to joining the Center for Data Innovation, he launched the Science vertical of The Huffington Post and served as its Associate Editor, covering a wide range of science and technology topics. He has worked on data science projects with HuffPost and other organizations. Before this, he graduated with highest honors from the University of California, Berkeley, having studied critical theory and completed coursework in computer science and economics. His research interests are in computational social science and using data to engage with complex social systems. You can follow him on Twitter @traviskorte.


