Please join us for the next NLP Seminar on Thursday, February 25, at 4pm in 205 South Hall. All are welcome!

Speaker: David Mimno (Cornell)

Title: Topic models without the randomness: new perspectives on deterministic algorithms

Abstract:

Topic models provide a useful way to identify and measure constructs such as themes, genres, discourses, and topics in large text collections. But running popular algorithms multiple times on the same documents can produce different results, raising questions about the reliability of any resulting conclusions. I will summarize an exciting new line of research in deterministic algorithms for topic inference that accept stronger model assumptions in exchange for provably optimal performance. This new approach leads not only to better models but also to better computational scalability and a richer understanding of connections between topic models and related methods like LSI and word embeddings.