Please join us for our final NLP Seminar of the spring semester on Monday, May 1, at 3:30pm in 202 South Hall.
Speaker: Pramod Viswanath, University of Illinois
Title: Geometries of Word Embeddings
Real-valued word vectors have transformed NLP applications; popular examples are word2vec and GloVe, recognized for their ability to capture linguistic regularities via simple geometrical operations. In this talk, we demonstrate further striking geometrical properties of the word vectors. First, we show that a very simple, and yet counter-intuitive, post-processing technique, which makes the vectors “more isotropic”, renders off-the-shelf vectors even stronger. Second, we show that a sentence containing a target word is well represented by a low-rank subspace; subspaces associated with a particular sense of the target word tend to intersect over a line (a one-dimensional subspace). We harness this Grassmannian geometry to disambiguate (in an unsupervised way) multiple senses of words, particularly the most promiscuously polysemous of all words: prepositions. A surprising finding is that rare senses, including idiomatic/sarcastic/metaphorical usages, are efficiently captured. Our algorithms are all unsupervised and rely on no linguistic resources; we validate them by presenting new state-of-the-art results on a variety of multilingual benchmark datasets.
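As a rough illustration of the kind of isotropy-inducing post-processing the abstract alludes to, one common recipe (the details of the talk's method may differ) is to subtract the mean embedding and then project away the top few principal directions, which tend to dominate off-the-shelf vectors. A minimal sketch, assuming that interpretation; the function name and the number of removed components are illustrative choices, not from the talk:

```python
import numpy as np

def isotropic_postprocess(vectors, n_components=2):
    """Make embeddings 'more isotropic' (hypothetical sketch).

    vectors: (vocab_size, dim) array of word embeddings.
    n_components: how many dominant directions to discard (a hyperparameter).
    """
    # Remove the common mean vector shared by all embeddings.
    centered = vectors - vectors.mean(axis=0)
    # SVD of the centered matrix: rows of Vt are the principal directions.
    _, _, Vt = np.linalg.svd(centered, full_matrices=False)
    top = Vt[:n_components]
    # Subtract each vector's projection onto the dominant directions.
    return centered - centered @ top.T @ top

# Example on random "embeddings": the processed vectors have zero mean
# and no energy along the removed directions.
emb = np.random.RandomState(0).randn(100, 50)
proc = isotropic_postprocess(emb, n_components=2)
```

After this step, similarity computations (e.g. cosine similarity) are run on the processed vectors in place of the originals.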
Please join us for the NLP Seminar Monday, April 24 at 3:30pm in 202 South Hall.
Speaker: Marta Recasens (Google)
Title: There’s Life Beyond Coreference
I’ll give a bird’s eye view of the coreference resolution task, discussing why after more than two decades of research on this task, state-of-the-art systems are still far from performing satisfactorily for real applications. Then, I’ll focus on the long tail of the problem, exemplifying how to cheaply learn common sense of the kind required by the Winograd Schema Challenge, and I’ll finish by undermining the traditional definition of the task, whose attempt at simplifying the problem may be making it even harder.
Please join us for the NLP Seminar Monday, April 10 at 3:30pm in 202 South Hall.
Speaker: Danqi Chen (Stanford)
Title: Towards the Machine Comprehension of Text
Enabling a computer to understand a document so that it can answer comprehension questions is a central, yet unsolved, goal of NLP. The task of reading comprehension (i.e., question answering over unstructured text) has received considerable attention recently, and a lot of progress has been made thanks to the creation of large-scale datasets and the development of attention-based neural networks.
In this talk, I’ll first present how we advance this line of research. I’ll show how simple models can achieve (nearly) state-of-the-art performance on recent benchmarks, including the CNN/Daily Mail datasets and the Stanford Question Answering Dataset. I’ll focus on explaining the logical structure behind these neural architectures and discuss the advantages as well as the limitations of current approaches.
Lastly, I’ll talk about how we leverage existing machine comprehension systems and enable them to answer open-domain questions using the full text of Wikipedia. We demonstrate the promise of our system, as well as set up new benchmarks, by evaluating on multiple existing QA datasets.
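Open-domain QA systems of this kind typically pair a document retriever with a reading-comprehension model: the retriever narrows Wikipedia down to a few candidate articles, and the reader extracts an answer span from them. A minimal, self-contained sketch of the retriever stage using TF-IDF term weighting; the function and its scoring details are illustrative assumptions, not the system presented in the talk:

```python
import math
from collections import Counter

def tfidf_rank(question, documents):
    """Rank documents by TF-IDF similarity to a question (illustrative sketch).

    Returns document indices, best match first. A reader model (not shown)
    would then extract the answer span from the top-ranked documents.
    """
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(documents)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for tokens in tokenized for term in set(tokens))

    def idf(term):
        # Smoothed inverse document frequency.
        return math.log((1 + n_docs) / (1 + df.get(term, 0))) + 1

    q_terms = question.lower().split()
    scores = []
    for tokens in tokenized:
        tf = Counter(tokens)
        # Sum of tf * idf^2 over question terms (idf applies to both sides).
        scores.append(sum(tf[t] * idf(t) ** 2 for t in q_terms))
    return sorted(range(n_docs), key=lambda i: -scores[i])

docs = [
    "paris is the capital of france",
    "berlin is the capital of germany",
]
ranking = tfidf_rank("what is the capital of france", docs)
# The France article outranks the Germany one for this question.
```

Real retrievers work at a much larger scale (hashed n-gram features, sparse matrix operations), but the ranking principle is the same.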
Danqi Chen is a Ph.D. candidate in Computer Science at Stanford University, advised by Prof. Christopher Manning. Her main research interests lie in deep learning for natural language processing and understanding, and she is particularly interested in the intersection between text understanding and knowledge reasoning. She has been working on machine comprehension, question answering, knowledge base population, and dependency parsing. She is a recipient of a Facebook Fellowship, a Microsoft Research Women’s Fellowship, and an outstanding paper award at ACL 2016. Prior to Stanford, she received her B.S. from Tsinghua University in 2012.