Please join us for the NLP Seminar on Monday, February 26 at 4:00pm in 202 South Hall. All are welcome!

Speaker: Jonathan Kummerfeld (University of Michigan)

Title: Representing Online Conversation Structure with Graphs: A New Corpus and Model


When a group of people communicate online, their conversation is rarely linear: messages often respond to something other than the one immediately before them. To build systems that understand a group conversation, we need a way to identify the discourse structure, that is, what each message is responding to. I’ll speak about a new corpus we constructed with reply structure annotations for 19,924 messages across 58 hours of IRC discussion. Using our annotations, we analyse the strengths and weaknesses of a recent heuristically extracted set of conversations that has formed the basis of extensive work on dialogue systems (Lowe et al., 2015). Finally, I’ll present statistical models for the task, which improve thread extraction performance from 25.7 F (heuristic) to 60.3 F (our approach). Using our model, we extract a new set of conversations that provide high-quality data for use in downstream dialogue system development.
