Memorization in Large Language Models
Tuesday, February 14, 2023, 11:00 am-12:00 pm
Abstract

Modern NLP is dominated by scale: today's language models (LMs) use massive parameter counts, dataset sizes, and compute budgets. In this talk, I will show that large LMs "memorize" their training data in various settings. This can sometimes be beneficial, e.g., memorization allows models to learn and recall knowledge from their pre-training data when solving downstream tasks. On the other hand, memorization can lead to legal concerns (e.g., generating copyrighted data, outputting medical documents), and an over-reliance on memorization can lead to failures in reasoning on novel tasks and inputs. Throughout the talk, I will place a particular focus on actionable insights that we can derive from these analyses, especially with respect to training strategies, model architectures, and dataset design.

Papers discussed: https://arxiv.org/abs/2012.07805, https://arxiv.org/abs/2202.06539, https://arxiv.org/abs/2207.00099, https://arxiv.org/abs/2211.08411, https://arxiv.org/abs/2301.13188, and a final work-in-progress paper.

Bio

Eric Wallace is a 4th-year PhD student at UC Berkeley advised by Dan Klein and Dawn Song. His research interests are in making large language models more robust, trustworthy, secure, and private. Eric's work is supported by the Apple Fellowship in AI/ML, and he has previously worked at FAIR, AI2, and the University of Maryland.

This talk is organized by Rachel Rudinger.