Month: February 2017

Please join us for the NLP Seminar on Monday 2/27 at 3:30pm in 202 South Hall.  All are welcome!

Speaker: Jayant Krishnamurthy (Allen Institute for AI)

Title: Semantic Parsing to Probabilistic Programs for Situated Question Answering

Abstract:

Situated question answering is the problem of answering questions about an environment such as an image or diagram. This problem is challenging because it requires jointly interpreting a question and an environment using background knowledge to select the correct answer. We present Parsing to Probabilistic Programs, a novel situated question answering model that can use background knowledge and global features of the question/environment interpretation while retaining efficient approximate inference. Our key insight is to treat semantic parses as probabilistic programs that execute nondeterministically and whose possible executions represent environmental uncertainty. We evaluate our approach on a new, publicly released data set of 5,000 science diagram questions, outperforming several competitive classical and neural baselines.

(Slides)

Please join us for the NLP Seminar on Monday 2/13 at 3:30pm in 202 South Hall.  All are welcome!

Speaker: Stephan Meylan (UC Berkeley)

Title: Word forms are optimized for efficient communication

Abstract:

The inverse relationship between word length and use frequency, first identified by G.K. Zipf in 1935, is a classic empirical law that holds across a wide range of human languages. We demonstrate that length is one aspect of a much more general property of words: how distinctive they are with respect to other words in a language. Distinctiveness plays a critical role in recognizing words in fluent speech, in that it reflects the strength of potential competitors when selecting the best candidate for an ambiguous signal. Phonological information content, a measure of a word’s probability under a statistical model of a language’s sound or character sequences, concisely captures distinctiveness. Examining large-scale corpora from 13 languages, we find that distinctiveness significantly outperforms word length as a predictor of frequency. This finding provides evidence that listeners’ processing constraints shape fine-grained aspects of word forms across languages.

(Slides)