University of Texas professor Danna Gurari and her colleagues have published a dataset of approximately 31,000 images, plus questions and answers about the contents of each image. The dataset is intended to serve as training data for computer vision applications that could help people blind or visually impaired interpret images. The data comes from an app called VizWiz that allows users to take pictures with their smartphones and ask volunteer interpreters questions about the image, such as the cost of an item in a store. Each image in the dataset includes a transcription of the question a VizWiz user asked about it and 10 crowdsourced answers from Amazon Mechanical Turk workers.
Training Virtual Assistants for People Who Are Blind
Michael McLaughlin is a research assistant at the Center for Data Innovation. He researches and writes about a variety of issues related to information technology and Internet policy, including digital platforms, e-government, and artificial intelligence. Michael graduated from Wake Forest University, where he majored in Communication with Minors in Politics and International Affairs and Journalism. He received his Master’s in Communication at Stanford University, specializing in Data Journalism.
View all posts by Michael McLaughlin