Please join us for the NLP Seminar on Monday, October 30, at 4:00pm in 202 South Hall. All are welcome!
Speaker: Christopher Potts (Stanford Linguistics)
Title: Enriching distributional linguistic representations with structured resources
One of the most powerful ideas in natural language processing is that we can represent words and phrases using dense vectors learned from co-occurrence patterns in text. Such representations have proven themselves in many settings, and one might even argue that they make good on a common intuition among linguists: that words tend to be incredibly complex and related to each other in all sorts of subtle ways. However, co-occurrence patterns alone tend to yield only a blurry picture of the rich relationships that exist between concepts, which raises the question of how best to incorporate additional information from more structured resources. This talk will explore methods for achieving this synthesis, with special emphasis on the retrofitting method pioneered by Faruqui et al. (2015), in which existing representations are updated based on their position in a knowledge graph. I’ll describe and motivate a generalization of Faruqui et al.’s framework that explicitly models graph relations as functions (Lengerich et al. 2017), and I’ll discuss some potential pitfalls of retrofitting (Cases et al. 2017). My overall goal is to stimulate discussion about how to obtain semantically nuanced distributed representations that are useful in diverse tasks.
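For readers unfamiliar with retrofitting, the following is a minimal sketch of the basic idea only, not the speaker's code or the generalization the talk will present. It assumes the iterative update usually associated with Faruqui et al. (2015): each word's vector is repeatedly replaced by a weighted average of its original vector and the current vectors of its graph neighbours, here with weight alpha = 1 on the original and weights of 1/degree on the neighbours. The function name `retrofit`, the toy words, and the two-dimensional vectors are all hypothetical.

```python
def retrofit(vectors, graph, iterations=10, alpha=1.0):
    """Illustrative helper (hypothetical name): pull each vector toward its graph neighbours.

    vectors: dict mapping word -> list of floats (the pretrained embeddings)
    graph:   dict mapping word -> list of neighbouring words in the knowledge graph
    """
    # Copy the inputs so the original embeddings are never modified.
    original = {word: list(vec) for word, vec in vectors.items()}
    current = {word: list(vec) for word, vec in vectors.items()}

    for _ in range(iterations):
        for word, neighbours in graph.items():
            # Only neighbours that actually have a vector can contribute.
            known = [n for n in neighbours if n in current]
            if word not in current or not known:
                continue
            beta = 1.0 / len(known)  # assumed weighting: 1 / degree
            total_weight = alpha + beta * len(known)
            current[word] = [
                (alpha * original[word][d] + sum(beta * current[n][d] for n in known))
                / total_weight
                for d in range(len(original[word]))
            ]
    return current


def distance(u, v):
    """Euclidean distance between two equal-length vectors."""
    return sum((a - b) ** 2 for a, b in zip(u, v)) ** 0.5


# Toy data: "happy" and "glad" are linked in the graph; "table" has no edges.
toy_vectors = {"happy": [1.0, 0.0], "glad": [0.0, 1.0], "table": [5.0, 5.0]}
toy_graph = {"happy": ["glad"], "glad": ["happy"]}
retrofitted = retrofit(toy_vectors, toy_graph)
```

In this toy setting the two linked words move closer together while staying anchored to their original positions, and the word with no graph edges is left unchanged.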
Cases, Ignacio; Minh-Thang Luong; and Christopher Potts. 2017. On the effective use of pretraining for natural language inference. Ms., Stanford University. https://arxiv.org/abs/1710.02076
Faruqui, Manaal; Jesse Dodge; Sujay K. Jauhar; Chris Dyer; Eduard Hovy; and Noah A. Smith. 2015. Retrofitting word vectors to semantic lexicons. NAACL. http://www.aclweb.org/anthology/N15-1184
Lengerich, Benjamin J.; Andrew L. Maas; and Christopher Potts. 2017. Retrofitting distributional embeddings to knowledge graphs with functional relations. Ms., Carnegie Mellon University, Stanford University, and Roam Analytics. https://arxiv.org/abs/1708.00112