Reddit user Stuck_In_The_Matrix has compiled a dataset of every comment made on the popular forum from October 2007 through May 2015, 1.65 billion in total. The data on each comment includes its score, author, timestamp, location on the site, and other information available through Reddit’s application programming interface (API). Fellow Reddit users say the dataset is valuable for a wide variety of research, such as analyzing how the site’s users discuss topics over time and modeling conversations.
Every Reddit Comment from the Last Eight Years
Joshua New is a senior policy analyst at the Center for Data Innovation. He has a background in government affairs, policy, and communication. Prior to joining the Center for Data Innovation, Joshua graduated from American University with degrees in C.L.E.G. (Communication, Legal Institutions, Economics, and Government) and Public Communication. His research focuses on methods of promoting innovative and emerging technologies as a means of improving the economy and quality of life. Follow Joshua on Twitter @Josh_A_New.