Jason Wei from Google Brain will be giving a talk at the Berkeley NLP seminar.
Time: Dec 9 from 11am-12pm PST
Location: In person in Berkeley Way West 8th floor rest area (8006)
Talk title: Emergence and reasoning in large language models
Abstract: This talk will cover two ideas in large language models—emergence and reasoning. Emergent abilities in large language models are abilities that are not present in small models but are present in large models. The existence of emergent abilities implies that further scaling may lead to language models with even more new abilities. Reasoning is key to long-standing challenges in machine learning such as learning from only a few examples or from abstract instructions. Large language models have shown impressive reasoning abilities simply via chain-of-thought prompting, which encourages models to generate intermediate reasoning steps before giving the final answer.
Bio: Jason Wei is a senior research scientist at Google Brain. His work centers around three aspects of large language models: instruction finetuning, chain-of-thought prompting, and emergent abilities. He was previously in the AI residency program at Google, and before that he graduated from Dartmouth College.
Jeff Wu from OpenAI will be giving a talk at the Berkeley NLP seminar.
Time: Oct 21 from 11am-12pm PST
Location: South Hall 210
Title: Training models to critique themselves
Abstract: We study the setting of large language models critiquing themselves in natural language. We find that:
- Critiques help humans find flaws in summaries that they would have otherwise missed.
- Larger models write more helpful critiques, and on most tasks are better at self-critiquing.
- Larger models can use their own self-critiques, refining their own summaries into better ones.
- We propose a methodology for detecting, and find evidence that, our models’ critiques may not surface all of their relevant knowledge of flaws.
Bio: Jeff Wu is a research engineer at OpenAI working on language modeling (e.g. GPT-2) and alignment (InstructGPT).
Alex Tamkin will be giving a hybrid talk at the NLP Seminar on Friday, Oct 14 from 11am-12pm PST. This talk will be held in person in South Hall 210.
Title: Self-Supervised Learning for the Real World
Abstract: Spearheaded by advances in NLP, machine learning is undergoing a transformative shift towards large, generalist models trained with self-supervised learning (SSL). In this talk, I’ll discuss two challenges lying ahead for this paradigm, as well as some paths towards surmounting them. First, I’ll discuss the problem of task ambiguity. While the space of tasks that models can perform is expanding rapidly, the number of bits (e.g. examples) used to specify the task is shrinking. Given these two opposing forces, how do we ensure that models learn the tasks we intend? I’ll discuss how we can measure the effects of such task ambiguity on humans and language models, as well as work showing how two-way interaction between users and large models can make strides on this problem in NLP and computer vision. Second, I’ll discuss the challenge of domain-agnostic SSL, necessary for realizing the benefits of SSL in high-impact settings such as healthcare, the sciences, and engineering. I’ll present DABS, a Domain-Agnostic Benchmark for SSL algorithms, covering data from 12 different fields (e.g. text, genomics, wearable sensors, and particle physics). With DABS, we develop and present the first SSL methods that succeed on such a broad range of modalities.
Relevant papers:
– Active Learning Helps Pretrained Models Learn the Intended Task
– DABS: A Domain-Agnostic Benchmark for Self-Supervised Learning
– DABS 2.0: Improved Datasets and Algorithms for Universal Self-Supervision
– Viewmaker Networks: Learning Views for Unsupervised Representation Learning
Bio: Alex Tamkin is a fifth-year PhD student in Computer Science at Stanford, advised by Noah Goodman and part of the Stanford NLP Group. His research focuses on self-supervised learning, especially in multimodal and domain-general settings. He is a recipient of the Open Philanthropy AI Fellowship.
Arya McCarthy gave a hybrid talk at the NLP Seminar on Friday, Sep 30 from 11am-12pm PST. This talk was held in person in South Hall 202.
Title: Kilolanguage Learning, Projection, and Translation
Abstract: The breadth of information digitized in the world’s languages gives opportunities for linguistic insights and computational tools with a pan-lingual perspective. We can achieve this by projecting lexical information across languages, either at the type or token level. First, we project information between thousands of languages at the type level to investigate the classic color word hypotheses of Berlin and Kay. Applying fourteen computational linguistic measures of color word basicness/secondariness, we find cross-linguistic support for these hypotheses and add further nuance. Second, we project information between thousands of languages at the token level to create fine-grained morphological analyzers and generators. We show applications to pronoun clusivity and multilingual machine translation. Finally, we produce morphological tools grounded in UniMorph that improve on strong initial models and generalize across languages.
Bio: Arya McCarthy is a Ph.D. candidate at Johns Hopkins University, working on massively multilingual natural language processing. He is advised by David Yarowsky in the Center for Language and Speech Processing; his work is funded by DARPA LORELEI, the International Olympic Committee, and the American Political Science Association (APSA). His work focuses on improving translation and computational modeling of rare languages. Primarily, he approaches this through weakly supervised natural language processing at the scale of 1000s of languages. Previously, Arya has spent time at Google, Duolingo, Facebook, Harvard University, and the University of Edinburgh. Arya is the PI for an APSA grant geared toward better integrating computational and social sciences. In this effort, he is partnering with Tom Lippincott, Kathy McKeown, David Mimno, Philip Resnik, and Noah Smith.
Welcome back to campus!
Seminars are held on occasional Fridays from 11 am to 12 pm in South Hall Room 210. Throughout the semester, we will update this schedule as we invite additional speakers.
Here is the current schedule:
Sept 30: Arya McCarthy, Johns Hopkins University
Oct 14: Alex Tamkin, Stanford University
The Berkeley NLP seminar is organized by a small team of PhD students and faculty at the School of Information and EECS.