
Published on February 16th, 2017


Studying How to Make Wikipedia Less Toxic

Researchers working on Wikimedia’s Wikipedia Detox project, which aims to reduce the impact of harassment and attacks on the Wikipedia editor community, have published a dataset of more than 100,000 comments from English-language Wikipedia pages. Each comment is annotated to indicate whether or not it included a personal attack. The researchers collected the data to help develop methods that combine crowdsourced analysis and machine learning to automatically detect personal attacks on the site.
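To give a sense of how crowdsourced judgments could be turned into labels for a machine learning model, here is a minimal sketch in Python. It is not the researchers' method: the rows, the column layout, the function name aggregate_labels, and the strict-majority rule are all assumptions made for illustration.

```python
from collections import defaultdict

# Hypothetical toy annotations: (comment_id, annotator_id, is_attack).
# These rows are invented for illustration and are not taken from the dataset.
annotations = [
    (1, "a", 0), (1, "b", 0), (1, "c", 1),
    (2, "a", 1), (2, "b", 1), (2, "c", 1),
    (3, "a", 0), (3, "b", 1),
]


def aggregate_labels(rows, threshold=0.5):
    """Return {comment_id: (attack_fraction, label)} using a majority-vote rule."""
    votes = defaultdict(list)
    for comment_id, _annotator, is_attack in rows:
        votes[comment_id].append(is_attack)

    result = {}
    for comment_id, comment_votes in votes.items():
        fraction = sum(comment_votes) / len(comment_votes)
        # Strict majority (an assumption): ties count as "not an attack".
        result[comment_id] = (fraction, fraction > threshold)
    return result


labels = aggregate_labels(annotations)
for comment_id, (fraction, is_attack) in sorted(labels.items()):
    print(comment_id, round(fraction, 2), is_attack)
```

The per-comment labels, or the raw fractions if a softer target is preferred, could then be paired with the comment text to train a classifier.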

Get the data.

Image: Wikimedia
