Maarten Sap is giving a talk on Friday, Feb 7 at 11am – 12pm in South Hall 210.
Title: Reasoning about Social Dynamics in Language
Abstract: Humans reason about social dynamics when navigating everyday situations. Due to the limited expressivity of existing NLP approaches, reasoning about the biased and harmful social dynamics in language remains a challenge, and can backfire against certain populations.
In the first part of the talk, I will analyze a failure case of NLP systems, namely, racial bias in automatic hate speech detection. We uncover severe racial skews in training corpora, and show that models trained on hate speech corpora acquire and propagate these racial biases. This results in tweets by self-identified African Americans being up to two times more likely to be labelled as offensive compared to others. We propose ways to reduce these biases, by making a tweet’s dialect more explicit during the annotation process.
Then, I will introduce Social Bias Frames, a conceptual formalism that models the pragmatic frames in which people project social biases and stereotypes onto others to reason about biased or harmful implications in language. Using a new corpus of 150k structured annotations, we show that models can learn to reason about high-level offensiveness of statements, but struggle to explain why a statement might be harmful. I will conclude with future directions for better reasoning about biased social dynamics.
Bio: Maarten Sap is a 5th year PhD student at the University of Washington advised by Noah Smith and Yejin Choi. He is interested in natural language processing (NLP) for social understanding; specifically in understanding how NLP can help us understand human behavior, and how we can endow NLP systems with social intelligence, social commonsense, or theory of mind. He’s interned on project Mosaic at AI2, working on social commonsense for artificial intelligence systems, and at Microsoft Research working on long-term memory and storytelling with Eric Horvitz.
Yonatan Belinkov is giving a talk on Friday, Jan 31 at 11am – 12pm in South Hall 210.
Title: Causal Mediation Analysis for Interpreting Neural NLP: The Case of Gender Bias
Abstract: The success of neural network models in various tasks, coupled with their opaque nature, has led to much interest in interpreting and analyzing such models. Common analysis methods for interpreting neural models in natural language processing typically examine either their structure (for example, probing classifiers) or their behavior (challenge sets, saliency methods), but not both. In this talk, I will propose a new methodology grounded in the theory of causal mediation analysis for interpreting which parts of a model are causally implicated in its behavior. This methodology enables us to analyze the mechanisms by which information flows from input to output through various model components, known as mediators. I will demonstrate an application of this methodology to analyzing gender bias in pre-trained Transformer language models. In particular, we study the role of individual neurons and attention heads in mediating gender bias across three datasets designed to gauge a model’s sensitivity to gender bias. Our mediation analysis reveals that gender bias effects are (i) sparse, concentrated in a small part of the network; (ii) synergistic, amplified or repressed by different components; and (iii) de-composable into effects flowing directly from the input and indirectly through the mediators. I will conclude by laying out a few ideas for future work on analyzing neural NLP models.
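To make the direct/indirect decomposition mentioned in the abstract concrete, here is a minimal sketch on a toy scalar model. It is not the analysis from the talk: the functions `mediator` and `outcome` and the coefficients `a`, `b`, `c` are hypothetical, chosen only so the arithmetic is easy to follow.

```python
# Toy causal mediation example (hypothetical model, not the one from the talk).
# The input x affects the output directly and also through a mediator m.

def mediator(x, c=2.0):
    """Mediator value produced by input x (assumed linear)."""
    return c * x

def outcome(x, m, a=1.0, b=3.0):
    """Output as a function of the input and the mediator (assumed linear)."""
    return a * x + b * m

def effects(x0, x1):
    """Return (total, direct, indirect) effects of changing the input from x0 to x1."""
    m0, m1 = mediator(x0), mediator(x1)
    baseline = outcome(x0, m0)
    total = outcome(x1, m1) - baseline      # change the input, let the mediator follow
    direct = outcome(x1, m0) - baseline     # change the input, hold the mediator at baseline
    indirect = outcome(x0, m1) - baseline   # keep the input, set the mediator as if x = x1
    return total, direct, indirect

total, direct, indirect = effects(0.0, 1.0)
print(total, direct, indirect)
```

In this linear toy the total effect equals the sum of the direct and indirect effects; in a nonlinear model such as a neural network that equality need not hold exactly.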
Bio: Yonatan Belinkov is a Postdoctoral Fellow at the Harvard School of Engineering and Applied Sciences (SEAS) and the MIT Computer Science and Artificial Intelligence Laboratory (CSAIL). His research focuses on interpretability and robustness of neural network models of human language. His research has been published at various NLP/ML venues. His PhD dissertation at MIT analyzed internal language representations in deep learning models, with applications to machine translation and speech recognition. He is a Harvard Mind, Brain, and Behavior Fellow. He will be joining the Technion Computer Science department in Fall 2020.
Professor Tal Linzen is giving a talk on Thursday, Jan 23 at 11am – 12pm in South Hall 202.
Title: Syntactic generalization in natural language inference
Abstract: Neural network models for natural language processing often perform very well on examples that are drawn from the same distribution as the training set. Do they accomplish such success by learning to solve the task as a human might solve it, or do they adopt heuristics that happen to work well on the data set in question, but do not reflect the normative definition of the task (how one “should” solve the task)? This question can be addressed effectively by testing how the system generalizes to examples constructed specifically to diagnose whether the system relies on such fallible heuristics. In my talk, I will discuss ongoing work applying this methodology to the natural language inference (NLI) task.
I will show that a standard neural model — BERT fine-tuned on the MNLI corpus — achieves high accuracy on the MNLI test set, but shows little sensitivity to syntactic structure when tested on our diagnostic data set (HANS); instead, the model relies on word overlap between the premise and the hypothesis, and concludes, for example, that “the doctor visited the lawyer” entails “the lawyer visited the doctor”. While accuracy on the test set is very stable across fine-tuning runs with different weight initializations, generalization behavior varies widely, with accuracy on some classes of examples ranging from 0% to 66%. Finally, augmenting the training set with a moderate number of examples that contradict the word overlap heuristic leads to a dramatic improvement in generalization accuracy. This improvement generalizes to constructions that were not included in the augmentation set. Overall, our results suggest that the syntactic deficiencies of the fine-tuned model do not arise primarily from poor abstract syntactic representations in the underlying BERT model; rather, because of its weak inductive bias, BERT requires a strong fine-tuning signal to favor those syntactic representations over simpler heuristics.
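The word overlap heuristic described above can be illustrated with a small sketch. This is only a hand-written caricature of the heuristic, not the BERT model or the HANS evaluation; the function name `overlap_heuristic` and the label strings are assumptions made for the example.

```python
# Hand-written caricature of the lexical overlap heuristic (illustrative only).

def overlap_heuristic(premise: str, hypothesis: str) -> str:
    """Predict 'entailment' whenever every hypothesis word also occurs in the premise."""
    premise_words = set(premise.lower().split())
    hypothesis_words = set(hypothesis.lower().split())
    if hypothesis_words <= premise_words:
        return "entailment"
    return "non-entailment"

premise = "the doctor visited the lawyer"
hypothesis = "the lawyer visited the doctor"

# The heuristic ignores word order, so it wrongly predicts entailment here.
print(overlap_heuristic(premise, hypothesis))
```

Because the heuristic compares bags of words, it cannot distinguish who visited whom, which is the kind of failure the diagnostic examples are constructed to expose.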
Bio: Tal Linzen is an Assistant Professor of Cognitive Science (with a joint appointment in Computer Science) at Johns Hopkins University, and affiliated faculty at the JHU Center for Language and Speech Processing. Before moving to Johns Hopkins, he was a postdoctoral researcher at the École Normale Supérieure in Paris, and before that he obtained his PhD from the Department of Linguistics at New York University. At Johns Hopkins, Tal directs the Computation and Psycholinguistics Lab, which develops computational models of human language comprehension and acquisition, as well as psycholinguistically-informed methods for interpreting, evaluating and extending neural network models for natural language processing. The lab’s work has appeared in venues such as ACL, CoNLL, EMNLP, ICLR, NAACL and TACL, as well as in journals such as Cognitive Science and Journal of Neuroscience. Tal co-organized the first two editions of the BlackboxNLP Workshop on Analyzing and Interpreting Neural Networks for NLP (EMNLP 2018, ACL 2019) and is a co-chair of CoNLL 2020.