Please join us for the NLP Seminar on Monday 11/14 at 3:30pm in 202 South Hall. All are welcome!
Speaker: David Jurgens (Stanford)
Title: Citation Classification for Behavioral Analysis of a Scientific Field
Abstract:
Citations are an important indicator of the state of a scientific field, reflecting how authors frame their work, and influencing uptake by future scholars. However, our understanding of citation behavior has been limited to small-scale manual citation analysis. We perform the largest behavioral study of citations to date, analyzing how citations are both framed and taken up by scholars in one entire field: natural language processing. We introduce a new dataset of nearly 2,000 citations annotated for function and centrality, and use it to develop a state-of-the-art classifier and label the entire ACL Reference Corpus. We then study how citations are framed by authors and use both papers and online traces to track how citations are followed by readers. We demonstrate that authors are sensitive to discourse structure and publication venue when citing, that online readers follow temporal links to previous and future work rather than methodological links, and that how a paper cites related work is predictive of its citation count. Finally, we use changes in citation roles to show that the field of NLP is undergoing a significant increase in consensus.
Preparatory Readings:
- Collins, Randall. “Why the social sciences won’t become high-consensus, rapid-discovery science.” Sociological Forum. Vol. 9. No. 2. Kluwer Academic Publishers-Plenum Publishers, 1994.
- Teufel, Simone, Advaith Siddharthan, and Dan Tidhar. “Automatic classification of citation function.” EMNLP 2006.