MS Defense: Citation Handling: Processing Citation Texts in Scientific Documents
Michael Whidby - University of Maryland, College Park
Friday, June 15, 2012, 10:00-11:00 am
Abstract

THE THESIS DEFENSE FOR THE M.S. DEGREE IN COMPUTER SCIENCE FOR

                                    Michael Whidby

Citation sentences (sentences that cite other papers) play a key role in the summarization of scientific articles. However, a citation-based summarization system that depends on generic natural language processing components, such as parsers or sentence compressors, will perform poorly if those components cannot handle citations correctly.

In this thesis, I examine the effect of citation handling on parsing, sentence compression, and multi-document summarization. Two types of citations occur in citation sentences: constituent citations and parenthetical citations. I propose an automatic citation classifier trained on data created through Mechanical Turk tasks. I demonstrate that type-specific citation handling, applied as a pre-processing step, improves the performance of a state-of-the-art generic parser in both parse-tree quality and running time. Extrinsic evaluations demonstrate that improving parser performance on citation sentences in turn improves the performance of a sentence compressor, Trimmer (Zajic et al., 2007), and a multi-document summarization system, MASCS, according to several summarization measures.
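As a rough illustration of type-specific citation handling as pre-processing, the sketch below removes parenthetical citations and replaces constituent citations with a single placeholder token before a sentence would be passed to a parser. It is not the classifier or the handling rules from the thesis: the regular expressions stand in for the trained classifier, they cover only simple author-year citations, and the names AUTHOR, YEAR, preprocess, and the CITATION placeholder are all assumptions made for this example. It also assumes that a constituent citation is one that acts as a syntactic part of the sentence, while a parenthetical citation sits entirely inside parentheses.

```python
import re

# Hypothetical author-year patterns; real citation styles vary far more widely.
AUTHOR = r"[A-Z][A-Za-z\-]+(?: et al\.| and [A-Z][A-Za-z\-]+)?"
YEAR = r"\d{4}[a-z]?"

# Parenthetical citation: the whole citation is inside parentheses,
# e.g. "(Zajic et al., 2007)" or "(Smith, 2010; Jones and Lee, 2011)".
PARENTHETICAL = re.compile(
    r"\s*\(" + AUTHOR + r",? " + YEAR
    + r"(?:; " + AUTHOR + r",? " + YEAR + r")*\)"
)

# Constituent citation: the authors are part of the sentence and only the
# year is in parentheses, e.g. "Zajic et al. (2007)".
CONSTITUENT = re.compile(AUTHOR + r" \(" + YEAR + r"\)")


def preprocess(sentence, placeholder="CITATION"):
    """Apply type-specific handling (an assumed policy, not the thesis's).

    Constituent citations become one placeholder token so the sentence
    keeps its grammatical structure; parenthetical citations are dropped.
    """
    sentence = CONSTITUENT.sub(placeholder, sentence)
    sentence = PARENTHETICAL.sub("", sentence)
    return sentence


if __name__ == "__main__":
    print(preprocess("Zajic et al. (2007) describe a sentence compressor."))
    print(preprocess("We use Trimmer (Zajic et al., 2007) for compression."))
```

Handling the two types differently matters because deleting a constituent citation would leave the sentence without one of its grammatical parts, while leaving a parenthetical citation in place gives a generic parser extra tokens it was not designed for.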

 

Examining Committee:

CO-ADVISOR                                    Dr. David Zajic

Committee Member(s):                      Dr. Hal Daume III

 

EVERYONE IS INVITED TO ATTEND THE PRESENTATION PORTION OF THIS DEFENSE

This talk is organized by Jeff Foster