Author: David Bamman

Ethan Perez will be giving a hybrid talk at the NLP seminar on Friday, April 29, from 11am-noon PDT. This talk will be held in person in South Hall 202, and Zoom information will be distributed via the Berkeley NLP Seminar listserv for those wishing to attend remotely.

Title: Aligning Language Models with Human Preferences

Abstract: Self-supervised learning objectives are highly effective at pretraining language models (LMs) for various tasks. In this talk, we first show that self-supervised objectives are misaligned with human preferences in many important ways: LMs trained on internet text generate misinformation, offensive jokes, and personal contact information, and are highly sensitive to the conditioning text (“prompt”). Next, we show that LM-based classifiers are effective at predicting which texts humans prefer. As a result, it is possible to use such classifiers as a learning signal to automatically correct the LM. We showcase this approach by training a high-quality retrieval system, obtaining strong performance across a variety of tasks using Retrieval-Augmented Generation (RAG). Even after such training schemes, some undesirable behaviors may remain undetected during training. We thus go a step further and use other LMs to generate inputs that elicit undesirable behaviors from the LM, to preemptively catch and fix such behaviors. Overall, we find that some of the most powerful tools for aligning LMs with human preferences are LMs themselves.
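As a rough illustration of the classifier-as-signal idea, the sketch below uses a preference classifier only to rerank LM samples (best-of-n), a simpler inference-time variant of the approach rather than the training method from the talk. The checkpoint names are illustrative placeholders; a real setup would load a classifier actually trained on human preference labels.

```python
# A minimal sketch, not the talk's training method: a preference
# classifier reranks LM samples (best-of-n) instead of serving as a
# training signal. Checkpoint names are illustrative placeholders.
import torch
from transformers import (AutoModelForCausalLM,
                          AutoModelForSequenceClassification, AutoTokenizer)

lm_tok = AutoTokenizer.from_pretrained("gpt2")
lm = AutoModelForCausalLM.from_pretrained("gpt2")
sc_tok = AutoTokenizer.from_pretrained("distilbert-base-uncased")
scorer = AutoModelForSequenceClassification.from_pretrained(
    "distilbert-base-uncased", num_labels=2)  # placeholder, untrained head

prompt = "Tell me about the moon landing."
inputs = lm_tok(prompt, return_tensors="pt")
# Sample n candidate continuations from the LM.
outs = lm.generate(**inputs, do_sample=True, num_return_sequences=8,
                   max_new_tokens=40, pad_token_id=lm_tok.eos_token_id)
candidates = [lm_tok.decode(o, skip_special_tokens=True) for o in outs]

# Score each candidate with the preference classifier; keep the best one.
with torch.no_grad():
    batch = sc_tok(candidates, return_tensors="pt",
                   padding=True, truncation=True)
    scores = scorer(**batch).logits[:, 1]  # logit of the "preferred" class
print(candidates[scores.argmax().item()])
```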

Bio: Ethan Perez is a fourth-year Ph.D. student in Natural Language Processing at New York University. He is advised by Kyunghyun Cho and Douwe Kiela and funded by NSF and Open Philanthropy. His research aims to develop learning algorithms that overcome human shortcomings, such as social biases, cognitive biases, and misconceptions. Previously, he has spent time at DeepMind, Facebook AI Research, the Montreal Institute for Learning Algorithms, and Google. He earned a Bachelor’s from Rice University as the Engineering department’s Outstanding Senior.

Katie Stasaski will be giving a hybrid talk on Friday, May 6, from 11am-noon PDT. This talk will be held in person in South Hall 202, and Zoom information will be distributed via the Berkeley NLP Seminar listserv for those wishing to attend remotely.

Title: Diversity in Dialogue Generation

Abstract: Conversational dialogue models struggle to produce diverse results, often over-producing typical utterances like “Yes” or “I don’t know.” This dissertation analyzes the diversity problem and proposes ways to improve dialogue agents in both the single- and multi-response setting. In the single-response setting, I propose a novel dataset collection algorithm which uses dynamically-computed corpus statistics to determine which crowdworkers to collect more data from. This process results in significantly more diverse datasets and improves the diversity of downstream dialogue agents trained on the more diverse corpora.
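For readers unfamiliar with corpus diversity statistics, the sketch below computes distinct-n (the ratio of unique to total n-grams), one standard statistic of this kind; it is illustrative only and not necessarily the measure the collection algorithm tracks.

```python
# A sketch of one standard corpus-diversity statistic, distinct-n:
# the ratio of unique n-grams to total n-grams in a set of utterances.
def distinct_n(utterances, n=2):
    ngrams = []
    for utt in utterances:
        toks = utt.lower().split()
        ngrams.extend(zip(*(toks[i:] for i in range(n))))
    return len(set(ngrams)) / max(len(ngrams), 1)

corpus = ["I don't know.", "Yes.", "I visited the new aquarium downtown."]
print(distinct_n(corpus, n=1), distinct_n(corpus, n=2))
```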

In the multi-response setting, I propose a new way of measuring semantic diversity using a natural language inference model, which is highly correlated with human judgments of diversity. I also propose a decoding procedure which iteratively improves the diversity of a set of model responses, achieving higher diversity with minimal loss in relevancy. Finally, I examine the extent to which speech acts constrain the diversity of human-generated dialogue responses. I propose a new task in which creative writers rate the extent to which a conversation inspires the creation of multiple diverse responses, finding that their judgments align with speech act hypotheses.
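As a rough sketch of the NLI-based idea, the snippet below treats a response set as more diverse when fewer ordered pairs entail one another, using the off-the-shelf roberta-large-mnli checkpoint; the exact metric in the talk may differ.

```python
# A minimal sketch of NLI-based diversity: a response set counts as more
# diverse when fewer ordered pairs entail each other. Uses the public
# roberta-large-mnli checkpoint; this shows the general idea, not the
# exact metric from the talk.
import itertools
import torch
from transformers import AutoModelForSequenceClassification, AutoTokenizer

tok = AutoTokenizer.from_pretrained("roberta-large-mnli")
nli = AutoModelForSequenceClassification.from_pretrained("roberta-large-mnli")
ent_id = {v.lower(): k for k, v in nli.config.id2label.items()}["entailment"]

def nli_diversity(responses):
    pairs = list(itertools.permutations(responses, 2))
    with torch.no_grad():
        batch = tok([p for p, _ in pairs], [h for _, h in pairs],
                    return_tensors="pt", padding=True, truncation=True)
        preds = nli(**batch).logits.argmax(dim=-1)
    # Fraction of ordered pairs where neither response entails the other.
    return (preds != ent_id).float().mean().item()

print(nli_diversity(["I don't know.", "No idea.", "It opens at nine."]))
```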

Divyansh Kaushik will be giving a virtual talk on Friday, April 8, from 11am-noon PDT. Zoom information will be distributed via the Berkeley NLP Seminar listserv for those wishing to attend remotely.

Title: Robustifying NLP with Humans in the Loop

Abstract: Most machine learning methods address prediction problems under restrictive assumptions, but they are often applied to drive decisions in environments where those assumptions are violated. This disconnect between what the methodological framework offers and the desired applications has caused confusion among researchers (who often lack the right formalism to tackle these problems coherently), practitioners (who have developed a folk tradition of ad hoc practices for deploying and monitoring systems), and regulators (who have applied frameworks designed for biomedical ethics to machine learning). In this talk, I’ll discuss some of these issues affecting the application of machine learning and our fledgling efforts to bridge some of these gaps by injecting causal knowledge via humans in the loop. I’ll also discuss some critical disconnects between how humans are employed to perform various tasks in ML research and the regulatory framework around research ethics, and the implications of those disconnects.
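As a rough sketch of the humans-in-the-loop idea, drawing on this line of work on counterfactually augmented data, the toy example below trains on original examples together with human-style minimal edits that flip the label; all data and model choices here are illustrative assumptions, not the talk's setup.

```python
# A toy sketch of counterfactually augmented data: humans minimally edit
# each example so its label flips, and the model trains on originals plus
# revisions, nudging it toward the words that actually cause the label.
# The data below is invented for illustration.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline

originals = [("The acting was brilliant and moving.", 1),
             ("The acting was wooden and dull.", 0)]
# Minimal human edits that flip each label.
counterfactuals = [("The acting was clumsy and plodding.", 0),
                   ("The acting was lively and sharp.", 1)]

texts, labels = zip(*(originals + counterfactuals))
clf = make_pipeline(TfidfVectorizer(), LogisticRegression())
clf.fit(texts, labels)
print(clf.predict(["The plot was thin but the acting was brilliant."]))
```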

Bio: Divyansh Kaushik is a PhD Candidate at the Language Technologies Institute in the School of Computer Science at Carnegie Mellon University, and a Science and Technology Policy Fellow at the Federation of American Scientists. He is advised by Dr. Eduard Hovy and Dr. Zachary Lipton and is a member of the Approximately Correct Machine Intelligence (ACMI) Lab. An Amazon Graduate Research Fellow, Divyansh’s interests lie in exploring human-AI interaction. Over the years, his work has been supported by Amazon AI, PricewaterhouseCoopers, and Facebook AI. He is also the President of CMU’s Graduate Student Assembly and has written on several science policy issues (recently appearing in Forbes, Institute for Progress, Issues in Science and Technology, and PublicSource).

Gašper Beguš will be giving a hybrid talk on Friday, April 1, from 11am-noon PDT. This talk will be held in person in South Hall 202, and Zoom information will be distributed via the Berkeley NLP Seminar listserv for those wishing to attend remotely.

Title: Cognitive modeling, neural network interpretability, and GANs

Abstract: In this talk, I propose that language can be modeled from raw speech data in a fully unsupervised manner with Generative Adversarial Networks (GANs), and that such modeling has implications both for the understanding of language acquisition and for the understanding of how deep neural networks learn internal representations. I propose a technique that allows us to “wug-test” neural networks trained on raw speech, analyze intermediate convolutional layers, and test a causal relationship between meaningful units in the output and latent/intermediate representations. I further propose an extension of the GAN architecture in which the learning of meaningful linguistic units emerges from a requirement that the networks output informative data, incorporating both perception and production principles. With this model, we can test what the networks can and cannot learn, how their biases match human learning biases in behavioral experiments, how speech processing in the brain compares to intermediate representations in deep neural networks (by comparing acoustic properties in intermediate convolutional layers and the brainstem), how symbolic, rule-like computation emerges in internal representations, and what GANs’ innovative outputs can teach us about productivity in human language. This talk also makes a more general case for probing deep neural networks with raw speech data, because dependencies in speech are often better understood than those in the visual domain and because behavioral data on speech (especially on production) are relatively easy to access.
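For readers who want a concrete picture, the sketch below shows a minimal PyTorch GAN over raw waveforms in the spirit of WaveGAN, which this line of work builds on; the architectures in the talk (e.g., ciwGAN/fiwGAN) are deeper and add components that force latent codes to carry linguistic information, so treat this as an illustrative skeleton only.

```python
# A minimal PyTorch sketch of a WaveGAN-style setup: a generator maps
# latent noise to raw waveforms, and a discriminator judges them, both
# built from 1-D convolutions over the raw audio signal.
import torch
import torch.nn as nn

class Generator(nn.Module):
    def __init__(self, z_dim=100):
        super().__init__()
        self.fc = nn.Linear(z_dim, 256 * 16)
        self.net = nn.Sequential(
            # Each layer upsamples the signal 4x: 16 -> 64 -> 256 -> 1024.
            nn.ConvTranspose1d(256, 128, kernel_size=25, stride=4,
                               padding=11, output_padding=1), nn.ReLU(),
            nn.ConvTranspose1d(128, 64, 25, 4, 11, 1), nn.ReLU(),
            nn.ConvTranspose1d(64, 1, 25, 4, 11, 1), nn.Tanh(),
        )

    def forward(self, z):
        return self.net(self.fc(z).view(-1, 256, 16))  # (batch, 1, 1024)

class Discriminator(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            # Each layer downsamples 4x: 1024 -> 256 -> 64 -> 16.
            nn.Conv1d(1, 64, 25, 4, 11), nn.LeakyReLU(0.2),
            nn.Conv1d(64, 128, 25, 4, 11), nn.LeakyReLU(0.2),
            nn.Conv1d(128, 256, 25, 4, 11), nn.LeakyReLU(0.2),
        )
        self.fc = nn.Linear(256 * 16, 1)

    def forward(self, x):
        return self.fc(self.net(x).flatten(1))  # real-vs-fake score

z = torch.randn(4, 100)              # latent noise
audio = Generator()(z)               # fake waveforms
print(Discriminator()(audio).shape)  # torch.Size([4, 1])
```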

Bio: Gašper Beguš is an Assistant Professor in the Department of Linguistics at UC Berkeley, where he directs the Berkeley Speech and Computation Lab. Before coming to Berkeley, he was an Assistant Professor at the University of Washington, and before that he graduated with a Ph.D. from Harvard. His research focuses on developing deep learning models for speech data. More specifically, he trains models to learn representations of spoken words from raw audio inputs. He combines machine learning and statistical modeling with neuroimaging and behavioral experiments to better understand how neural networks learn internal representations of speech and how humans learn to speak.

Nasrin Mostafazadeh will be giving a hybrid talk on Friday, March 4, from 11am-noon PST. This talk will be held in person in South Hall 202, and Zoom information will be distributed via the Berkeley NLP Seminar listserv for those wishing to attend remotely.

Title: How far have we come in giving our NLU systems common sense?

Abstract: Commonsense reasoning has been a long-established area in AI for more than three decades. Although the area saw little sustained effort after the 1980s, the past few years have brought renewed interest in the AI community in giving machines common sense, acknowledging it as a holy grail of AI and one of the bottlenecks in deploying AI systems in the real world. With the tremendous recent progress in natural language understanding (NLU), the lack of commonsense reasoning capabilities of NLU systems is more evident than ever. In this talk, I’ll discuss the remarkable recent progress made in tackling commonsense reasoning benchmarks using pre-trained neural models. I’ll talk about the role of benchmarks in measuring our progress and how we can move the goalposts toward constructing coherent mental models of narratives.

Bio: Nasrin is Co-founder of Verneek, a deep-tech startup in NYC (in stealth). Verneek’s mission is to enable anyone to make better and faster decisions anywhere, using intuitive modalities of interaction powered by innovative AI technologies. Before Verneek, Nasrin held research positions at AI startups and big tech companies ranging from BenevolentAI to Microsoft Research. She received her PhD at the University of Rochester, working in the conversational interaction and dialogue research group, with her PhD work focused on commonsense reasoning through the lens of narratives. She has started lines of research that push AI toward a deeper understanding of the world, which are being further developed into the core technologies at Verneek. She has been a keynote speaker, chair, organizer, and program committee member at various AI events. Nasrin was named to Forbes’ 30 Under 30 in Science in 2019 for her work in AI.

Dora Demszky will be giving a virtual talk on Friday, January 14, from 11am-noon PST. Zoom information will be distributed via the Berkeley NLP Seminar listserv.

Title: Using Natural Language Processing to Support Equitable and Student-Centered Education

Abstract: Providing consistent, individualized feedback to teachers is essential for improving instruction but can be prohibitively resource-intensive in most educational contexts. I demonstrate ways in which natural language processing (NLP) can be used to address this gap and provide teachers with feedback in a scalable and effective way. As part of a case study, I introduce an automated tool based on NLP that provides teachers with feedback on their uptake of student contributions, a high-leverage teaching practice that supports dialogic instruction and makes students feel heard. This tool is based on our fully automated measure of uptake that we validate extensively by analyzing the linguistic phenomena it captures, such as questioning and elaboration, and by demonstrating its correlation with positive educational outcomes across three datasets of student-teacher interaction. We evaluate the effectiveness of our tool to improve teachers’ uptake of student contributions by conducting a randomized controlled trial in an online computer science course, Code in Place (n=1,136 instructors). We find that the tool improves instructors’ uptake of student contributions by 24% and present suggestive evidence that our tool also improves students’ satisfaction with the course. These results demonstrate the promise of our tool to complement existing efforts in teachers’ professional development.
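The uptake measure in the talk is a validated, fully automated model; as a deliberately simplified stand-in, the sketch below scores a teacher turn by its lexical overlap with the preceding student turn, one of the surface signals such a measure relates to. The stopword list and example turns are invented for illustration.

```python
# A crude proxy for "uptake of student contributions": the fraction of a
# student's content words that the teacher's next turn echoes. The
# measure in the talk is a trained model, not this heuristic.
def uptake_proxy(student_turn, teacher_turn,
                 stopwords={"the", "a", "is", "it", "so"}):
    s = {w for w in student_turn.lower().split() if w not in stopwords}
    t = {w for w in teacher_turn.lower().split() if w not in stopwords}
    return len(s & t) / max(len(s), 1)  # fraction of student words echoed

print(uptake_proxy("I think the fraction gets smaller",
                   "Say more about why the fraction gets smaller"))  # 0.6
```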

Bio: Dora is a PhD candidate in Linguistics at Stanford, advised by Dan Jurafsky. She works on developing natural language processing methods to support equitable and student-centered education. Her recent publications focus on analyzing the representation of historically marginalized groups in US history textbooks and on measuring and giving feedback to teachers on their uptake of student contributions in classrooms. Prior to her PhD, Dora received a BA summa cum laude from Princeton University in Linguistics with a minor in Computer Science.

Maria Antoniak will be giving a virtual talk on Friday, December 3, from 11am-noon PST. Zoom information will be distributed via the Berkeley NLP Seminar listserv.

Title: Modeling Personal Experiences Shared in Online Communities

Abstract: Written communications about personal experiences—and the emotions, narratives, and values that they contain—can be both rhetorically powerful and statistically difficult to model. My research uses natural language processing methods to represent complex personal experiences and self-disclosures communicated in online communities. Two fruitful sites for this research are online communities grounded in structured cultural experiences (books, games) and online communities grounded in healthcare experiences (childbirth, contraception, pain management). These communities situate personal opinions and stories in social contexts of reception, expectation, and judgment. In two case studies, I’ll show how quantifying textual patterns reveals community reframings and redefinitions of established narratives and hierarchies.
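As one concrete example of quantifying textual patterns across communities, the sketch below implements the log-odds ratio with an informative Dirichlet prior (Monroe et al., 2008), a standard technique for surfacing words one community uses distinctively; the word counts are invented for illustration, and this is not necessarily the exact method from the talk.

```python
# A sketch of the log-odds ratio with an informative Dirichlet prior
# (Monroe et al., 2008) for comparing word use in two subcorpora.
# The toy counts below are invented for illustration.
import math
from collections import Counter

def log_odds_with_prior(counts1, counts2, prior):
    n1, n2 = sum(counts1.values()), sum(counts2.values())
    a0 = sum(prior.values())
    scores = {}
    for w, a in prior.items():
        y1, y2 = counts1[w], counts2[w]
        delta = (math.log((y1 + a) / (n1 + a0 - y1 - a))
                 - math.log((y2 + a) / (n2 + a0 - y2 - a)))
        var = 1 / (y1 + a) + 1 / (y2 + a)
        scores[w] = delta / math.sqrt(var)  # z-score; |z| > 1.96 is notable
    return scores

community1 = Counter("calm peaceful home birth empowering".split())
community2 = Counter("hospital monitor epidural birth emergency".split())
prior = community1 + community2  # prior estimated from pooled counts
print(sorted(log_odds_with_prior(community1, community2, prior).items(),
             key=lambda kv: -kv[1])[:3])
```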

Bio: Maria Antoniak is a PhD candidate in Information Science at Cornell University. Her research focuses on unsupervised natural language processing methods and applications to computational social science and cultural analytics. Her work translates methods from natural language processing to insights about communities and self-disclosure by modeling personal experiences shared in online communities. She has a master’s degree in computational linguistics from the University of Washington and a bachelor’s degree in humanities from the University of Notre Dame, and she has completed research internships at Microsoft, Facebook, Twitter, and Pacific Northwest National Laboratory.