Stanford University researchers have released the Conversational Question Answering (CoQA) dataset to help machines better gather and provide information in conversations with humans. The dataset includes 127,000 questions from 8,000 different conversations. These conversations are based on seven different types of text, including children’s stories, high school English exams, and Reddit. AI models often struggle to answer questions across different domains (e.g., news stories versus English exams), and the researchers found that humans significantly outperformed reading comprehension models in answering the questions.
Helping Machines Be Conversational
Michael McLaughlin is a research assistant at the Center for Data Innovation. He previously worked at Oracle and held internships at USA TODAY and in local government. Prior to joining the Center for Data Innovation, Michael graduated from Wake Forest University, where he majored in Communication with minors in Politics and International Affairs and in Journalism. He is currently pursuing his Master’s in Communication at Stanford University, specializing in Data Journalism.