Ethan Perez will be giving a hybrid talk at the NLP seminar on Friday, April 29, from 11am-noon PST. This talk will be held in person in South Hall 202, and Zoom information will be distributed via the Berkeley NLP Seminar listserv for those wishing to attend remotely.
Title: Aligning Language Models with Human Preferences
Abstract: Self-supervised learning objectives are highly effective at pretraining language models (LMs) for various tasks. In this talk, we first show that self-supervised objectives are misaligned with human preferences in many important ways; LMs trained on internet text generate misinformation, offensive jokes, and personal contact information, and are highly sensitive to the conditioning text (“prompt”). Next, we show that LM-based classifiers are effective at predicting which texts humans prefer. As a result, it is possible to use such classifiers as a learning signal to automatically correct the LM. We showcase this approach to train a high-quality retrieval system, obtaining strong performance across a variety of tasks using Retrieval-Augmented Generation (RAG). Even after such training schemes, some undesirable behaviors may remain undetected during training. We thus go a step further and use other LMs to generate inputs that elicit undesirable behaviors from the LM, to preemptively catch and fix such behaviors. Overall, we find that some of the most powerful tools for aligning LMs with human preferences are LMs themselves.
Bio: Ethan Perez is a fourth-year Ph.D. student in Natural Language Processing at New York University. He is advised by Kyunghyun Cho and Douwe Kiela and funded by NSF and Open Philanthropy. His research aims to develop learning algorithms that overcome human shortcomings, such as social biases, cognitive biases, and misconceptions. Previously, he has spent time at DeepMind, Facebook AI Research, the Montreal Institute for Learning Algorithms, and Google. He earned a Bachelor’s degree from Rice University as the Engineering department’s Outstanding Senior.
Katie Stasaski will be giving a hybrid talk on Friday, May 6, from 11am-noon PST. This talk will be held in person in South Hall 202, and Zoom information will be distributed via the Berkeley NLP Seminar listserv for those wishing to attend remotely.
Title: Diversity in Dialogue Generation
Abstract: Conversational dialogue models struggle to produce diverse results, often over-producing typical utterances like “Yes” or “I don’t know.” This dissertation analyzes the diversity problem and proposes ways to improve dialogue agents in both the single- and multi-response setting. In the single-response setting, I propose a novel dataset collection algorithm which uses dynamically-computed corpus statistics to determine which crowdworkers to collect more data from. This process results in significantly more diverse datasets and improves the diversity of downstream dialogue agents trained on the more diverse corpora.
In the multi-response setting, I propose a new way of measuring semantic diversity using a natural language inference model, which is highly correlated with human judgments of diversity. I also propose a decoding procedure which iteratively improves the diversity of a set of model responses, achieving higher diversity with minimal loss in relevancy. Finally, I examine the extent to which speech acts constrain the diversity of human-generated dialogue responses. I propose a new task in which creative writers rate the extent to which a conversation inspires the creation of multiple diverse responses, finding that their judgments align with speech act hypotheses.
Divyansh Kaushik will be giving a virtual talk on Friday, April 8, from 11am-noon PST. Zoom information will be distributed via the Berkeley NLP Seminar listserv for those wishing to attend remotely.
Title: Robustifying NLP with Humans in the Loop
Abstract: Most machine learning methods address prediction problems under restrictive assumptions, but they are often applied to drive decisions in environments where those assumptions are violated. This disconnect between what the methodological framework offers and the desired applications has caused confusion among researchers (who often lack the right formalism to tackle these problems coherently), practitioners (who have developed a folk tradition of ad hoc practices for deploying and monitoring systems), and regulators (who have applied frameworks designed for biomedical ethics to machine learning). In this talk I’ll discuss some of these issues affecting the application of machine learning and our fledgling efforts to bridge some of these gaps by injecting causal knowledge via humans in the loop, along with some critical disconnects between how humans are employed in ML research to perform various tasks and the regulatory framework around research ethics, and their implications.
Bio: Divyansh Kaushik is a PhD Candidate at the Language Technologies Institute in the School of Computer Science at Carnegie Mellon University, and a Science and Technology Policy Fellow at the Federation of American Scientists. He is advised by Dr. Eduard Hovy and Dr. Zachary Lipton and is a member of the Approximately Correct Machine Intelligence (ACMI) Lab. An Amazon Graduate Research Fellow, Divyansh’s interests lie in exploring human-AI interaction. Over the years, his work has been supported by Amazon AI, PricewaterhouseCoopers, and Facebook AI. He is also the President of CMU’s Graduate Student Assembly and has written on several science policy issues (recently appearing in Forbes, Institute for Progress, Issues in Science and Technology, and PublicSource).