PhD Defense: Structural Scaffolding for Sensemaking in Document Collections
Joe Barrow
Monday, September 12, 2022, 11:00 am-1:00 pm
Readers must often attempt to make sense of too much information at once. Consider a student trying to find a relevant piece of information in a long text, or to learn about a topic from diverse and conflicting viewpoints found on the web. Having a scaffold of this content can help the reader make sense of it: for example, section titles in a document can support skimming, and a summative guide to all the viewpoints can support learning about a topic. These scaffolds help a reader build mental maps of the information; without them, it is easy to "miss the forest for the trees".

The goal of this work is to induce this structure in cases where it is not already explicit, a process dubbed "structural scaffolding". The focus is on two types of scaffolds. The first, topical scaffolds of documents, are labeled sections induced over an unstructured document to support pre-reading. The second, syntopical scaffolds of document collections, are viewpoints extracted, grouped, and presented from many documents at once. Beyond task- and model-building, this work has a further facet: designing and building user-facing systems that use scaffolds to support sensemaking in collections.

The work on topical scaffolding aims to support readers engaged in "skimming" or "pre-reading" documents. To do this, documents are segmented, and the segments are given topical labels related to their content. This work shows, through model-building, that when topically scaffolding a document, it is useful to jointly model the segmentation problem and the segment-labeling problem.

The work on syntopical scaffolding seeks to operationalize the act of "syntopical reading," whereby a reader learns about a topic by comparing and contrasting different authors' viewpoints. By modeling the relationships between claims within and across documents, it is possible to reconstruct authors' viewpoints more accurately. These viewpoints can then be presented to a syntopical reader in a reader-facing system to help them analyze the discussion.

Examining Committee:
Dean's Representative:
Dr. Philip Resnik    
Dr. Douglas W. Oard    
Dr. John Dickerson    
Dr. Rachel Rudinger    
Dr. Jordan Boyd-Graber    
Dr. Rajiv Jain (Adobe Research)

Joe Barrow is a PhD student working with Professor Philip Resnik and Professor Doug Oard on natural language processing. He is interested in systems that help people make sense of document collections.

This talk is organized by Tom Hurst.