Using Tree Structures for Improved Dependency Parsing Algorithms
Wednesday, October 29, 2014, 11:00 am-12:00 pm
Abstract

Dependency parse trees have been found to be particularly useful for machine translation, question answering, and other practical applications. The two most common search spaces are the set of projective trees and the set of all directed spanning trees. The first sacrifices coverage of non-projective structures, which commonly occur in natural language; the second introduces significant computational challenges for many scoring models.  In this talk we show how milder assumptions about output tree structures can help us design parsers that can produce the vast majority of natural language structures while introducing only constant-time overhead over projective parsers.
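As a small illustration (not from the talk itself), projectivity can be tested by checking whether any two dependency arcs cross when drawn above the sentence; the function names and encoding below are my own:

```python
def crossing(e, f):
    """Two arcs (i, j), taken as unordered position pairs, cross
    iff their endpoints strictly interleave: a < c < b < d."""
    (a, b), (c, d) = sorted(e), sorted(f)
    return a < c < b < d or c < a < d < b

def is_projective(heads):
    """heads[i] is the head position of token i+1 (0 = artificial root).
    A dependency tree is projective iff no two of its arcs cross."""
    arcs = [(h, d) for d, h in enumerate(heads, start=1)]
    return not any(crossing(arcs[i], arcs[j])
                   for i in range(len(arcs))
                   for j in range(i + 1, len(arcs)))
```

For example, `is_projective([2, 0, 2])` holds (both dependents attach to an adjacent head), while `is_projective([0, 4, 1, 1])` fails because the arcs (2, 4) and (1, 3) interleave.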

This talk will mostly focus on graph-based non-projective parsing.  We introduce 1-Endpoint-Crossing trees, a property that covers 95.8% or more of dependency parses across a variety of languages.  We then introduce a “crossing-sensitive” generalization of a third-order factorization that trades off complexity in the model structure (i.e., scoring with features over multiple edges) against complexity in the output structure (i.e., producing crossing edges).  The third-order 1-Endpoint-Crossing parser has the same asymptotic run-time as the third-order projective parser and is significantly more accurate under many experimental settings and significantly less accurate on none.
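The 1-Endpoint-Crossing property says that for every arc, all arcs that cross it share a common endpoint. A minimal sketch of that check over a set of arcs (an illustration under my own encoding, not the parser's internal representation):

```python
def crossing(e, f):
    """Arcs cross iff their endpoint positions strictly interleave."""
    (a, b), (c, d) = sorted(e), sorted(f)
    return a < c < b < d or c < a < d < b

def is_1ec(arcs):
    """True iff, for every arc, all arcs crossing it share one vertex
    (the defining condition of a 1-Endpoint-Crossing tree)."""
    for e in arcs:
        crossers = [f for f in arcs if crossing(e, f)]
        if crossers:
            # Intersect the endpoint sets of all crossing arcs:
            # some single vertex must lie on every one of them.
            common = set(crossers[0])
            for f in crossers[1:]:
                common &= set(f)
            if not common:
                return False
    return True
```

For instance, the arc set {(2, 5), (1, 3), (4, 6)} is not 1-Endpoint-Crossing: the arc (2, 5) is crossed by both (1, 3) and (4, 6), which share no endpoint.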

The same narrative applies to transition-based parsing: by making similar assumptions on the structures of crossing dependencies, we can design a transition-based parser that is sound and complete for a broad class of trees while adding only constant overhead in run-time compared to popular projective systems.

Bio

Emily Pitler is a Research Scientist at Google, where she works on natural language parsing and its applications.  She received her Ph.D. in Computer and Information Science from the University of Pennsylvania in 2013.  She received her undergraduate degree in Computer Science from Yale University.


This talk is organized by Jimmy Lin.