Ebola Data, Machine-Readable at Last
Doctors and aid organizations have complained that poor-quality data on the West African Ebola outbreak has made their jobs more difficult. Much of the detailed case data they would expect to work with in a Western country is indeed unavailable, but the affected countries do release some useful data, including confirmed cases by administrative district. This data has not been useful for analysis because the countries in question (Liberia, Sierra Leone, and Guinea) publish it in Portable Document Format (PDF), which is not readily machine-readable. Caitlin Rivers, a PhD student in computational epidemiology at Virginia Tech, has set out to digitize these records by hand; she has posted the data from Liberia and Sierra Leone on her GitHub account and promises to post the Guinea data soon. Rivers is also writing a blog series that analyzes and visualizes the data, and she hopes the data will help other researchers do the same.