Talks

Guarding Against Spurious Correlations in Natural Language Understanding

Virtual - https://umd.zoom.us/j/93207947099?pwd=c096Z3JrZ1FGSXVEVjFWL29PQUV1dz09

Wednesday, March 31, 2021, 11:00 am-12:00 pm

Abstract

While we have made great progress in natural language understanding, transferring that success from benchmark datasets to real applications has not always been smooth. Notably, models sometimes make mistakes that are confusing and unexpected to humans. In this talk, I will discuss shortcuts in NLP tasks and present our recent work on guarding against spurious correlations in natural language understanding tasks (e.g., textual entailment and paraphrase identification) from the perspectives of both robust learning algorithms and better data coverage. Motivated by the observation that our data often contains a small number of "unbiased" examples that do not exhibit spurious correlations, we present new learning algorithms that better exploit these minority examples. Alternatively, we may want to augment the data directly with such "unbiased" examples. While recent work along this line is promising, we show several pitfalls in the data augmentation approach.

Bio

He He is an assistant professor in the Center for Data Science and the Courant Institute at New York University. Before joining NYU, she spent a year at Amazon Web Services and was a postdoc at Stanford University. She received her PhD from the University of Maryland, College Park. She is broadly interested in machine learning and natural language processing. Her current research interests include text generation, dialogue systems, and robust language understanding.

This talk is organized by Wei Ai.