Month: October 2022

Jeff Wu from OpenAI will be giving a talk at the Berkeley NLP Seminar.

Time: Friday, Oct 21 from 11am-12pm PDT

Location: South Hall 210

Title: Training models to critique themselves

Abstract: We study the setting of large language models critiquing themselves in natural language. We find that:
  1. Critiques help humans find flaws in summaries that they would have otherwise missed.
  2. Larger models write more helpful critiques, and on most tasks are better at self-critiquing.
  3. Larger models can use their own self-critiques to refine their summaries into better ones (see the sketch after this list).
  4. We propose a methodology for testing whether a model's critiques surface all of its relevant knowledge of flaws, and we find evidence that they may not.
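
For intuition, here is a minimal sketch of the summarize, critique, refine loop from point 3, assuming only a generic text-completion interface. The generate function is a hypothetical placeholder, and the prompts are invented for illustration rather than taken from the paper.

    # Minimal sketch of summarize -> self-critique -> refine.
    # generate() is a hypothetical stand-in for any LLM completion call;
    # wire up a real client before running.

    def generate(prompt: str) -> str:
        """Placeholder for a call to a large language model."""
        raise NotImplementedError("plug in an LLM client here")

    def refine_with_self_critique(passage: str) -> str:
        # 1. Draft a summary.
        summary = generate(f"Summarize the following passage:\n\n{passage}")
        # 2. Ask the same model to critique its own summary.
        critique = generate(
            f"Passage:\n{passage}\n\nSummary:\n{summary}\n\n"
            "List the most serious inaccuracies or omissions in the summary."
        )
        # 3. Refine the summary conditioned on the critique.
        return generate(
            f"Passage:\n{passage}\n\nSummary:\n{summary}\n\n"
            f"Critique:\n{critique}\n\n"
            "Rewrite the summary, fixing the issues raised in the critique."
        )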

Bio: Jeff Wu is a research engineer at OpenAI working on language modeling (e.g. GPT-2) and alignment (InstructGPT).

Alex Tamkin will be giving a hybrid talk at the NLP Seminar on Friday, Oct 14 from 11am-12pm PDT. This talk will be held in person in South Hall 210.

Title: Self-Supervised Learning for the Real World

Abstract: Spearheaded by advances in NLP, machine learning is undergoing a transformative shift towards large, generalist models trained with self-supervised learning (SSL). In this talk, I’ll discuss two challenges lying ahead for this paradigm, as well as some paths towards surmounting them. First, I’ll discuss the problem of task ambiguity. While the space of tasks that models can perform is expanding rapidly, the number of bits (e.g. examples) used to specify the task is shrinking. Given these two opposing forces, how do we ensure that models learn the tasks we intend? I’ll discuss how we can measure the effects of such task ambiguity on humans and language models, as well as work showing how two-way interaction between users and large models can make strides on this problem in NLP and computer vision. Second, I’ll discuss the challenge of domain-agnostic SSL, which is necessary for realizing the benefits of SSL in high-impact settings such as healthcare, the sciences, and engineering. I’ll present DABS, a novel kind of domain-agnostic benchmark for SSL algorithms, covering data from 12 different fields (e.g. text, genomics, wearable sensors, and particle physics). With DABS, we develop and present the first SSL methods that succeed on such a broad range of modalities.
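
To make task ambiguity concrete, here is a small invented example (not from the talk): a few-shot prompt whose demonstrations are consistent with more than one task, so the handful of specification bits cannot pin down the user's intent.

    # Invented illustration of few-shot task ambiguity (not from the talk).
    # Both hypotheses below fit the two demonstrations, so the model must
    # guess which task the user intends for the held-out input.
    prompt = (
        "The cat sat. -> A\n"
        "The committee deliberated for hours before reaching a verdict. -> B\n"
        "The dog ran across the long, winding meadow. ->"
    )

    # Hypothesis 1: label A iff the sentence mentions an animal -> predicts A
    # Hypothesis 2: label A iff the sentence is short           -> predicts B
    # Two-way interaction (e.g. the model asking a clarifying question, or
    # actively requesting a label for an informative example) can resolve
    # the ambiguity with far fewer examples than passive prompting.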

Relevant papers:
– Active Learning Helps Pretrained Models Learn the Intended Task
– DABS: A Domain-Agnostic Benchmark for Self-Supervised Learning
– DABS 2.0: Improved Datasets and Algorithms for Universal Self-Supervision
– Viewmaker Networks: Learning Views for Unsupervised Representation Learning

Bio: Alex Tamkin is a fifth-year PhD student in Computer Science at Stanford, advised by Noah Goodman and part of the Stanford NLP Group. His research focuses on self-supervised learning, especially in multimodal and domain-general settings. He is a recipient of the Open Philanthropy AI Fellowship.

Arya McCarthy gave a hybrid talk at the NLP Seminar on Friday, Sep 30 from 11am-12pm PDT. This talk was held in person in South Hall 202.

Title: Kilolanguage Learning, Projection, and Translation

Abstract: The breadth of information digitized in the world’s languages offers opportunities for linguistic insights and computational tools with a pan-lingual perspective. We can achieve this by projecting lexical information across languages, at either the type or token level. First, we project information between thousands of languages at the type level to investigate the classic color word hypotheses of Berlin and Kay. Applying fourteen computational linguistic measures of color word basicness/secondariness, we find cross-linguistic support for these hypotheses and add further nuance. Second, we project information between thousands of languages at the token level to create fine-grained morphological analyzers and generators. We show applications to pronoun clusivity and multilingual MT. Finally, we produce morphological tools grounded in UniMorph that improve on strong initial models and generalize across languages.
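
As background on the token-level approach, here is a minimal sketch of annotation projection over word-aligned bitext (the general technique; the talk's actual pipeline may differ). The tokens, tags, and alignment below are invented for illustration.

    # Sketch of token-level annotation projection over word-aligned bitext;
    # the example data and tag scheme are invented for illustration.

    def project_tags(src_tags, alignment, tgt_len):
        """Copy each source token's tag onto its aligned target token.

        src_tags: one tag per source token.
        alignment: (src_idx, tgt_idx) pairs from a word aligner.
        tgt_len: number of target tokens.
        """
        tgt_tags = [None] * tgt_len  # None marks unaligned target tokens
        for s, t in alignment:
            tgt_tags[t] = src_tags[s]
        return tgt_tags

    # English source tags in a UniMorph-style scheme, projected onto a
    # hypothetical two-token translation aligned 0->0 and 1->1.
    src_tags = ["PRO;NOM;SG;3", "V;PRS;3;SG"]
    alignment = [(0, 0), (1, 1)]
    print(project_tags(src_tags, alignment, tgt_len=2))
    # ['PRO;NOM;SG;3', 'V;PRS;3;SG']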

Bio: Arya McCarthy is a Ph.D. candidate at Johns Hopkins University, working on massively multilingual natural language processing. He is advised by David Yarowsky in the Center for Language and Speech Processing; his work is funded by DARPA LORELEI, the International Olympic Committee, and the American Political Science Association (APSA). His work focuses on improving translation and computational modeling of rare languages, primarily through weakly supervised natural language processing at the scale of thousands of languages. Previously, Arya spent time at Google, Duolingo, Facebook, Harvard University, and the University of Edinburgh. He is the PI for an APSA grant geared toward better integrating the computational and social sciences; in this effort, he is partnering with Tom Lippincott, Kathy McKeown, David Mimno, Philip Resnik, and Noah Smith.