Katie Stasaski will be giving a hybrid talk on Friday, May 6, from 11am to noon PT. The talk will be held in person in South Hall 202, and Zoom information will be distributed via the Berkeley NLP Seminar listserv for those wishing to attend remotely.
Title: Diversity in Dialogue Generation
Abstract: Conversational dialogue models struggle to produce diverse results, often over-producing typical utterances like “Yes” or “I don’t know.” This dissertation analyzes the diversity problem and proposes ways to improve dialogue agents in both the single- and multi-response settings. In the single-response setting, I propose a novel dataset collection algorithm that uses dynamically computed corpus statistics to determine which crowdworkers to collect more data from. This process results in significantly more diverse datasets and improves the diversity of downstream dialogue agents trained on the more diverse corpora.
In the multi-response setting, I propose a new way of measuring semantic diversity using a natural language inference model, which is highly correlated with human judgments of diversity. I also propose a decoding procedure that iteratively improves the diversity of a set of model responses, achieving higher diversity with minimal loss in relevancy. Finally, I examine the extent to which speech acts constrain the diversity of human-generated dialogue responses. I propose a new task in which creative writers rate the extent to which a conversation inspires the creation of multiple diverse responses, finding that judgments align with speech act hypotheses.