Zoom: https://umd.zoom.us/j/
In recent years, deep learning has made significant strides across domains including natural language processing, computer vision, and speech recognition. Innovations in scaling pre-training data, developing new model architectures, integrating distinct modalities (e.g., vision and language, audio and language), and adopting modern engineering practices have driven these advances. However, despite the rapid progress in building better models, progress in understanding these models so as to improve their reliability and capabilities has been comparatively slow. In this talk, I will lay the groundwork for interpreting modern deep learning models (language, vision, text-to-image, and multimodal language models) by examining them through the perspectives of data and internal model components.

In the first part of the talk, I will discuss interpreting test-time predictions through group influence functions and highlight the practical limitations of influence functions for deep networks. We will then discuss FastDiffSel, an optimization-based algorithm that automatically curates difficult few-shot tasks to stress-test deep models.

In the second part of the talk, I will present methods for mechanistically understanding how visual concepts are stored in various deep models, both discriminative and generative. Specifically, I will present our insights into knowledge storage across text-to-image models, multimodal language models, and vision transformers. I will then demonstrate how leveraging these insights enables model editing (steering) in different deep models (including Stable Diffusion, LLaVA, and modern ViTs) to prevent the generation of copyrighted content, equip these models with new factual knowledge, and mitigate spurious correlations. Finally, I will delve into extracting circuits in language models for extractive question answering and show that insights from these mechanistic circuits enable several applications in language models, such as data attribution and model steering.
Samyadeep Basu is a fourth-year CS PhD student advised by Dr. Soheil Feizi. His research focuses on building methods for understanding deep learning models and using those insights to steer and control them for applications such as model editing. He has previously interned twice, at Adobe Research and Microsoft Research, working on model interpretability projects.