PhD Proposal: On Utilizing Learning Shortcuts for Robustness and Dataset Security
Pedro Sandoval Segura
Tuesday, March 26, 2024, 9:00-11:00 am
Abstract

https://umd.zoom.us/j/7917365697

Web scraping is increasingly used to construct the large datasets necessary for training deep learning models. To prevent the exploitation of online data for unauthorized purposes, imperceptible adversarial modifications can be crafted so that neural networks trained on the modified data produce erroneous output. These modified training sets are known as "unlearnable datasets", and we show that they introduce learning shortcuts that can prevent neural networks from learning representations that generalize. In this proposal, we first examine various designs and approaches for creating effective perturbations, finding that they make datasets easier to learn. Second, we propose autoregressive poisoning, a new method that can generate perturbations for unlearnable datasets without optimization, a surrogate network, or access to the broader dataset. Third, we investigate circumstances in which unlearnable datasets fail to protect data, and find evidence that neural networks can learn generalizable features from ostensibly unlearnable datasets. Finally, we propose an approach for removing learning shortcuts in natural data, with the goal of improving robustness to distribution shifts.
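To give a flavor of the kind of perturbation the abstract describes, the sketch below generates additive noise by filtering white noise through an autoregressive (AR) recurrence and scales it to a small L-infinity budget before adding it to an image. This is a minimal illustration only: the AR coefficients, the epsilon budget, and the row-wise filtering scheme here are invented for the sketch, not the parameters of the proposed method.

```python
import numpy as np

def ar_perturbation(shape, coeffs=(0.5, 0.3), eps=8 / 255, seed=0):
    """Generate an additive perturbation by filtering white noise
    through a causal autoregressive (AR) recurrence along image rows.
    All parameter values are illustrative, not the method's."""
    rng = np.random.default_rng(seed)
    h, w, c = shape
    noise = rng.standard_normal(shape)
    delta = np.zeros(shape)
    for x in range(w):
        # AR recurrence: delta[x] = sum_k coeffs[k] * delta[x-1-k] + noise[x]
        delta[:, x, :] = noise[:, x, :]
        for k, a in enumerate(coeffs):
            if x - 1 - k >= 0:
                delta[:, x, :] += a * delta[:, x - 1 - k, :]
    # Rescale to an L-infinity budget so the change stays imperceptible
    delta = eps * delta / np.max(np.abs(delta))
    return delta

# Poison one image: add the perturbation, then clip to the valid pixel range
image = np.random.rand(32, 32, 3)  # stand-in for one training image
poisoned = np.clip(image + ar_perturbation(image.shape), 0.0, 1.0)
```

Note that, as in the abstract's claim, generating such noise requires no optimization loop and no surrogate network; the perturbation depends only on a random seed and the filter coefficients.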

Bio

Pedro Sandoval-Segura is a PhD student at the University of Maryland, advised by Professor Tom Goldstein and Professor David Jacobs. He received his bachelor's degree from Harvey Mudd College. His research has explored vulnerabilities of neural networks, including adversarial examples and dataset poisoning. He is interested in developing perception systems that are robust to corruptions and distribution shifts.

This talk is organized by Migo Gui