
Published on July 16th, 2015 | by Joshua New


Every Reddit Comment from the Last Eight Years

Reddit user Stuck_In_The_Matrix has compiled a dataset of every comment made on the popular forum from October 2007 through May 2015, roughly 1.65 billion comments in all. The data on each comment includes its score, author, timestamp, location on the site, and other information available through Reddit’s application programming interface (API). Fellow Reddit users say the dataset is valuable fodder for a wide variety of research, such as analyzing how the site’s users discuss topics over time and modeling conversations.
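As a rough illustration of the "topics over time" kind of analysis, here is a minimal Python sketch that counts how many comments mention a term in each calendar month. It assumes the comments are stored one JSON object per line and carry Reddit API-style fields named `body` and `created_utc`; the storage format, the function name `monthly_mentions`, and the sample records are assumptions made for this example, not details confirmed by the dataset's description.

```python
import json
from collections import Counter
from datetime import datetime, timezone


def monthly_mentions(lines, term):
    """Count comments per calendar month (UTC) whose body mentions `term`.

    `lines` is any iterable of JSON strings, one comment per line
    (an assumed layout; adjust the parsing if the files differ).
    """
    counts = Counter()
    needle = term.lower()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        comment = json.loads(line)
        if needle in comment.get("body", "").lower():
            # int() accepts the timestamp whether it is stored as a number or a string
            ts = int(comment["created_utc"])
            month = datetime.fromtimestamp(ts, tz=timezone.utc).strftime("%Y-%m")
            counts[month] += 1
    return counts


# Hypothetical sample records, made up for illustration only.
sample = [
    '{"author": "alice", "score": 5, "created_utc": "1193875200", '
    '"subreddit": "programming", "body": "Python is great"}',
    '{"author": "bob", "score": 2, "created_utc": 1194000000, '
    '"subreddit": "programming", "body": "I prefer Lisp"}',
    '{"author": "carol", "score": 9, "created_utc": 1430438400, '
    '"subreddit": "learnpython", "body": "Learning python this year"}',
]

result = monthly_mentions(sample, "python")
for month in sorted(result):
    print(month, result[month])
```

Because the function reads its input one line at a time, it could be handed an open file object instead of a list, which matters for a dataset far too large to load into memory at once.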

Get the data.


About the Author

Joshua New is a policy analyst at the Center for Data Innovation. He has a background in government affairs, policy, and communication. Prior to joining the Center for Data Innovation, Joshua graduated from American University with degrees in C.L.E.G. (Communication, Legal Institutions, Economics, and Government) and Public Communication. His research focuses on methods of promoting innovative and emerging technologies as a means of improving the economy and quality of life. Follow Joshua on Twitter @Josh_A_New.
