PhD Defense: Transfer Learning in Natural Language Processing through Interactive Feedback

Michelle Yuan

4107 Brendan Iribe Center for Computer Science and Engineering (IRB)

Wednesday, July 6, 2022, 1:00-3:00 pm

Abstract

Machine learning models cannot easily adapt to new domains and applications. This drawback is especially detrimental for natural language processing (NLP) because language is perpetually changing. Across disciplines and languages, there are noticeable differences in content, grammar, and vocabulary. To overcome these shifts, recent NLP breakthroughs focus on transfer learning. Through clever optimization and engineering, a model can successfully adapt to a new domain or task. However, these modifications are still computationally inefficient or resource-intensive. Compared to machines, humans are better at generalizing knowledge across different situations, especially low-resource ones. Therefore, research on transfer learning should carefully consider how the user interacts with the model. The goal of this dissertation is to investigate “human-in-the-loop” approaches for transfer learning in NLP.

We first explore interaction for problems in inductive transfer learning, which is the transfer of models across tasks. Language models, like BERT, are popular because they can be used for various applications. However, these models require a large amount of labeled data to learn a new task. To reduce labeling, we develop an active learning strategy that samples documents that surprise the language model. Users only need to annotate a small subset of these unexpected documents to adapt the language model for text classification.
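As a rough illustration of sampling documents that surprise a language model, the sketch below ranks unlabeled documents by their mean token surprisal (negative log-probability) and picks the top few for annotation. It is a simplified stand-in, not the strategy developed in the dissertation: the function names, the `budget` parameter, and the toy probabilities are all hypothetical, and a real system would obtain token probabilities from a language model such as BERT.

```python
import math
from typing import Dict, List


def surprisal(token_probs: List[float]) -> float:
    """Mean negative log-probability of the tokens in one document."""
    return -sum(math.log(p) for p in token_probs) / len(token_probs)


def select_for_annotation(
    doc_token_probs: Dict[str, List[float]], budget: int
) -> List[str]:
    """Return the `budget` document ids with the highest mean surprisal."""
    ranked = sorted(
        doc_token_probs,
        key=lambda doc_id: surprisal(doc_token_probs[doc_id]),
        reverse=True,
    )
    return ranked[:budget]


# Toy pool (hypothetical values): the probability the model assigned
# to each observed token of each unlabeled document.
pool = {
    "doc_a": [0.9, 0.8, 0.95],  # model finds this document predictable
    "doc_b": [0.2, 0.1, 0.3],   # model finds this document surprising
    "doc_c": [0.5, 0.6, 0.4],
}

picked = select_for_annotation(pool, budget=2)
print(picked)
```

Under these assumed probabilities, the two least predictable documents are the ones sent to the annotator, while the predictable one is skipped.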

Then, we transition to user interaction in transductive transfer learning, which is the transfer of models across domains. For cross-lingual text classification, we develop interactive systems for word embeddings and topic models. These approaches are useful for aligning English with a low-resource language. Beyond text classification, we look at domain shift for coreference resolution, a task that is fundamental for applications like question answering and dialogue. We use active learning to find spans of text in the new domain for users to label. Finally, we conclude with future directions for research in interactive transfer learning.

Examining Committee:

Chair: Dr. Jordan Boyd-Graber
Dean's Representative: Dr. Philip Resnik
Members: Dr. Benjamin Van Durme (Johns Hopkins), Dr. Rachel Rudinger, Dr. John Dickerson

Bio

Michelle Yuan is a PhD candidate in the Computer Science Department at the University of Maryland. She is advised by Jordan Boyd-Graber and is a member of the Computational Linguistics and Information Processing (CLIP) Lab. She has also collaborated closely with Benjamin Van Durme and others at the Human Language Technology Center of Excellence (HLTCOE). Her research focuses on designing user annotation and feedback frameworks to improve the transfer of NLP models.

This talk is organized by Tom Hurst