
10 Bits: The Data News Hot List

by Travis Korte
Image: RAW, a new user-friendly data visualization application

This week’s list of data news highlights covers October 5-11 and includes articles on GE’s ‘industrial internet’ offerings and a new method for remotely detecting human rights violations.

1. GE Bets Big on ‘Industrial Internet’

General Electric announced this week that it would expand its push into the industrial internet with 14 more products equipped with internet-linked sensors and performance management software. The products span aviation, energy, transportation and healthcare; one offering is a power generation product designed to optimize turbine usage. GE has already deployed 10 industrial devices equipped with sensors and performance tracking tools, and the company hopes these enhancements will help it better predict product failures and design future products more cheaply.

2. Microsoft Advisor: Data Misuse Should Be a Felony

Craig Mundie, Microsoft’s senior advisor to the CEO, argued this week that it is impossible to control the collection and retention of data in the current environment. Instead, he suggested attaching additional metadata and cryptographic wrappers to personal data to limit how it can be used, and combining these technological measures with strong legal penalties, such as making willful misuse of data a felony, to deter abuse.
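Mundie did not spell out an implementation, but a minimal sketch of what a "wrapped" record might look like, assuming a symmetric key held by a policy-enforcing service and an illustrative (not standardized) policy vocabulary, pairs encrypted personal data with cleartext usage metadata:

```python
import json
from datetime import datetime, timedelta, timezone
from cryptography.fernet import Fernet  # pip install cryptography

# In practice the key would be held by a policy-enforcing service,
# not bundled with the data; it is generated inline here for illustration.
key = Fernet.generate_key()
fernet = Fernet(key)

# Personal data is encrypted; the usage metadata travels with it in the clear
# so downstream systems can check permitted uses before requesting decryption.
personal_data = json.dumps({"name": "Jane Doe", "location": "Burlington, VT"})
wrapped_record = {
    "metadata": {
        "permitted_uses": ["billing"],  # hypothetical policy terms
        "expires": (datetime.now(timezone.utc) + timedelta(days=30)).isoformat(),
        "provenance": "example.com/signup-form",
    },
    "ciphertext": fernet.encrypt(personal_data.encode()).decode(),
}

def request_use(record, purpose):
    """Decrypt the payload only if the requested purpose is permitted."""
    if purpose not in record["metadata"]["permitted_uses"]:
        raise PermissionError(f"use '{purpose}' not permitted")
    return json.loads(fernet.decrypt(record["ciphertext"].encode()))

print(request_use(wrapped_record, "billing"))  # succeeds
# request_use(wrapped_record, "marketing")     # raises PermissionError
```

The point of the pattern is that the permitted uses travel with the data itself, so enforcement, whether technical or legal, has something concrete to check against.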

3. Remote Sensing to Detect Human Rights Violations

The U.S. Holocaust Memorial Museum (USHMM) and the U.S. Department of State have proposed a new approach to detecting mass human rights violations with satellite imagery. Using publicly available NASA imagery data, researchers tested the hypothesis that smoke from burning villages would change the surrounding area’s reflectivity in a way that could be detected automatically in satellite data. The methodology was applied to a database of destroyed villages around Darfur, Sudan, and offered the most accurate picture to date of when those villages were destroyed.
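The core signal-processing idea, flagging abrupt drops in a location's reflectance time series, can be sketched roughly as follows; this is not the USHMM and State Department methodology itself, and the threshold, window size and data are invented for illustration:

```python
import numpy as np

def flag_burn_dates(reflectance, dates, drop_threshold=0.15, baseline_window=5):
    """Flag dates where reflectance drops sharply below its recent baseline.

    reflectance: 1-D array of surface reflectance for one location over time
    dates:       acquisition dates, same length as reflectance
    The threshold and window here are illustrative, not calibrated values.
    """
    flagged = []
    for t in range(baseline_window, len(reflectance)):
        baseline = reflectance[t - baseline_window:t].mean()
        if baseline - reflectance[t] > drop_threshold:
            flagged.append(dates[t])
    return flagged

# Toy example: a stable series with an abrupt drop midway through.
series = np.array([0.42, 0.41, 0.43, 0.40, 0.42, 0.41, 0.22, 0.23, 0.24])
dates = [f"2004-{m:02d}-01" for m in range(1, 10)]
print(flag_burn_dates(series, dates))  # -> ['2004-07-01']
```

Run per pixel over an archive of imagery, a rule of this kind narrows years of data down to candidate destruction dates that analysts can then verify.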

4. Biologists Struggle to Keep Up With Flood of Data

Advances in genomics, biomedical imaging and other technologies have transformed the biological data landscape, and biologists are sprinting to keep up with the large amounts of newly available data. Big Science initiatives like the Human Genome Project and the Human Microbiome Project, an effort to map the influence of bacteria on human growth, development and disease, have forced biologists to adopt new hardware and learn new skills for storing, wrangling and analyzing large datasets. Moreover, funding for the data storage and analysis these projects require is often overlooked. Although this creates costs for the biomedical community in the short term, it also presents considerable opportunity: understanding the microbial environment in human bodies could lead to new insights into treating obesity, allergies, Crohn’s disease and other disorders.

5. Repurposing Cell Phone Sensors for Weather Forecasting

Earlier this year, the data scientists at OpenSignal, an open mapping project that documents global cell phone signal coverage, launched an initiative called WeatherSignal to collect weather data from smartphones. The creators hope that smartphones will be a major source of weather forecasting data in the future, and their initiative has the potential to decrease forecasting costs through the use of pre-existing sensors. The initiative repurposes sensors built into Android devices that collect data on humidity, barometric pressure, temperature and light intensity. The project, while still in the proof-of-concept stage, has already spurred the creation of a number of new applied data science techniques, such as converting battery temperature readings to ambient temperature.
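As an illustration of the kind of calibration involved, and not WeatherSignal's actual model, one simple approach is to fit a linear mapping from battery temperature to outdoor air temperature using paired readings from nearby weather stations; the numbers below are invented:

```python
import numpy as np

# Hypothetical paired readings: phone battery temperature vs. co-located
# weather station air temperature, in degrees Celsius. Values are made up.
battery_temp = np.array([30.1, 28.4, 33.2, 26.7, 31.5, 29.0])
station_temp = np.array([18.2, 16.9, 21.0, 14.8, 19.4, 17.3])

# Fit a simple linear mapping: T_ambient ≈ a * T_battery + b.
a, b = np.polyfit(battery_temp, station_temp, 1)

def estimate_ambient(battery_reading):
    """Estimate outdoor air temperature from a phone's battery sensor."""
    return a * battery_reading + b

print(round(estimate_ambient(32.0), 1))
```

Averaging such estimates over the many phones in an area is what makes an individually noisy sensor useful as a crowdsourced weather instrument.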

6. Building Better Nonprofits with Big Data

Nonprofit organizations could increase their effectiveness considerably by making better use of data science, but doing so will require a meaningful, sustained effort. Rayid Ghani, former chief scientist at Obama for America, is helping lead the charge with the University of Chicago’s Data Science for Social Good fellowship and a startup called Edgeflip that helps nonprofits improve their donor targeting. One way nonprofits can get started, Ghani says, is by sharing data with organizations working on similar issues; a large foundation or a consortium of large nonprofits could accelerate this process by pooling resources on a common data-sharing platform.

7. Complex Data Calls for New Mathematics

With all the technology that has emerged in recent years to help analyze big data, it is easy to overlook the fundamental work mathematicians are doing to simplify large datasets before they ever reach a processor. Stanford researchers are developing two techniques in particular that simplify big data so it can be analyzed with less computing power. One takes a geometric approach, representing large datasets as networks and then reducing them to smaller, more tractable networks that preserve most of the same geometric properties. The other treats the data as a signal, compressing it the way a digital audio file is compressed so that most of the information can be recovered while taking up considerably less space.
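The Stanford techniques themselves are more sophisticated, but a truncated singular value decomposition is a familiar example of the same compression idea: keep only the dominant structure of a dataset and discard the rest, much as an audio codec discards inaudible detail. The sketch below uses synthetic data and is illustrative only:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "big data" matrix with low intrinsic rank plus a little noise,
# standing in for a dataset whose structure is simpler than its size suggests.
true_rank = 5
data = rng.normal(size=(1000, true_rank)) @ rng.normal(size=(true_rank, 200))
data += 0.01 * rng.normal(size=data.shape)

# Keep only the k largest singular values and vectors: a lossy compression.
k = 5
U, s, Vt = np.linalg.svd(data, full_matrices=False)
compressed = (U[:, :k], s[:k], Vt[:k, :])
reconstructed = U[:, :k] * s[:k] @ Vt[:k, :]

stored_original = data.size
stored_compressed = sum(part.size for part in compressed)
error = np.linalg.norm(data - reconstructed) / np.linalg.norm(data)

print(f"values stored: {stored_original} -> {stored_compressed}")
print(f"relative reconstruction error: {error:.4f}")
```

In this toy example a few thousand stored numbers reconstruct the 200,000 original values almost exactly, because the data's intrinsic structure is far simpler than its raw size suggests.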

8. The Science Case for Open Data

A study published last week in the open-access journal PeerJ showed that papers with publicly available data are 9 percent more likely to be cited than papers that do not make their data available. The study’s authors looked at more than 10,000 genomics papers and controlled for factors such as publication date, journal impact factor and open access status. The researchers also found that around 20 percent of datasets released between 2003 and 2007 had been reused at least once by other researchers.

9. Vermont’s Open Data Pilot Project

Vermont’s director of web services announced this week that his state would begin pursuing an open data initiative, beginning with the release of 10 datasets on a web-based portal. The announcement came during the Vermont Open Data Summit and marks a welcome change for the state; Vermont earned a D+ for public access to information in a 2012 nationwide survey of state transparency. Another presenter at the summit noted that the State of Massachusetts estimated it saved $2 million in 2012 by allowing the public to access information without submitting FOIA requests; although Vermont is only one-tenth as populous as Massachusetts, advocates expect open data efforts to drive savings there as well.

10. Open-Source Visualization Tool for the Masses

Designers at the Polytechnic University of Milan have created RAW, a user-friendly web tool for creating attractive, scalable data visualizations. The tool democratizes visualization, allowing users to import data from popular applications such as Microsoft Excel and create charts and cluster analyses with a minimum of setup. The project, still in alpha release, is open source, and its creators invite modifications and extensions.
