Jeff Wu from OpenAI will be giving a talk at the Berkeley NLP seminar.

Time: Oct 21 from 11am-12pm PT

Location: South Hall 210

Title: Training models to critique themselves

Abstract: We study the setting of large language models critiquing themselves in natural language. We find that:
  1. Critiques help humans find flaws in summaries that they would have otherwise missed.
  2. Larger models write more helpful critiques, and on most tasks are better at self-critiquing.
  3. Larger models can use their own self-critiques, refining their own summaries into better ones.
  4. We suggest methodology for testing, and find evidence, that our models’ critiques may not surface all of their relevant knowledge of flaws.

Bio: Jeff Wu is a research engineer at OpenAI working on language modeling (e.g. GPT-2) and alignment (InstructGPT).