Google has released YouTube-8M, a dataset of links to 8 million YouTube videos, to help advance machine learning research. Google annotated each video in YouTube-8M by its subject matter, using 4,800 different labels, which can serve as useful metadata for video interpretation programs. The videos in YouTube-8M make up over 500,000 hours of video, making the dataset substantially larger than Sports-1M, a previous dataset Google released that contained one million videos, and likely the largest annotated video dataset available to researchers.
Helping Computers Understand Videos
Joshua New is a policy analyst at the Center for Data Innovation. He has a background in government affairs, policy, and communication. Prior to joining the Center for Data Innovation, Joshua graduated from American University with degrees in C.L.E.G. (Communication, Legal Institutions, Economics, and Government) and Public Communication. His research focuses on methods of promoting innovative and emerging technologies as a means of improving the economy and quality of life. Follow Joshua on Twitter @Josh_A_New.