University of Texas professor Danna Gurari and her colleagues have published a dataset of approximately 31,000 images, plus questions and answers about the contents of each image. The dataset is intended to serve as training data for computer vision applications that could help people who are blind or visually impaired interpret images. The data comes from an app called VizWiz that allows users to take pictures with their smartphones and ask volunteer interpreters questions about them, such as the cost of an item in a store. Each image in the dataset includes a transcription of the question a VizWiz user asked about it and 10 crowdsourced answers from Amazon Mechanical Turk workers.
Training Virtual Assistants for People Who Are Blind
Michael McLaughlin is a research assistant at the Center for Data Innovation. He previously worked at Oracle and held internships at USA TODAY and in local government. Prior to joining the Center for Data Innovation, Michael graduated from Wake Forest University, where he majored in Communication with minors in Politics and International Affairs and in Journalism. He is currently pursuing his Master’s in Communication at Stanford University, specializing in Data Journalism.