PhD Defense: Adversarial Vulnerabilities of Deep Networks and Affordable Robustness
Ali Shafahi
Virtual
Abstract
Deep learning has improved performance on many computer vision tasks. However, the features learned without extra regularization are not necessarily interpretable. Although conventionally trained models generalize well, they are susceptible to certain failure modes. Even though these catastrophic failures rarely occur naturally, an adversary with some knowledge of the design process can engineer them.
Based on when the adversary manipulates the system, threats can be classified as evasion attacks (at test time) or data poisoning attacks (at training time). First, we will cover a recently proposed data poisoning threat model that does not assume the adversary controls the labeling process. We call this the “targeted clean-label” poisoning attack. The attack causes misclassification of a target instance under both end-to-end training and transfer learning, without degrading the classifier's overall performance on non-target examples.
We will then shift our focus to evasion attacks, which can take the form of either universal perturbations or per-instance adversarial examples. In the last part of this dissertation, we will present methods for training per-instance robust models (i.e., models that can resist adversarial examples) in settings with limited resources. One such setting is scarce computing power. For this case, we will present our algorithm, “Adversarial Training for Free!”, which trains robust models at the same computational cost as conventional (natural) training. We achieve this efficiency by simultaneously updating the network parameters and the adversarial perturbation. For cases with limited training data per class, we introduce adversarially robust transfer learning.
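The simultaneous update mentioned above can be illustrated with a minimal sketch. This is not the dissertation's implementation: it assumes a logistic-regression model in NumPy in place of a deep network, and the function name and all hyperparameters (eps, m, epochs, lr, batch_size) are hypothetical choices made for illustration. The idea shown is that each minibatch is replayed m times, and the gradient from a single pass is used both to update the model and to update a perturbation that is carried over between steps.

```python
import numpy as np


def free_adversarial_training(X, y, eps=0.1, m=4, epochs=8, lr=0.1,
                              batch_size=32, seed=0):
    """Illustrative sketch only: logistic regression stands in for a deep net."""
    rng = np.random.default_rng(seed)
    n, d = X.shape
    w = np.zeros(d)
    b = 0.0
    # The perturbation persists across minibatches instead of being reset.
    delta = np.zeros((batch_size, d))

    # Run epochs // m passes, replaying each minibatch m times, so the total
    # number of gradient computations matches `epochs` passes of natural training.
    for _ in range(epochs // m):
        order = rng.permutation(n)
        for start in range(0, n, batch_size):
            idx = order[start:start + batch_size]
            xb, yb = X[idx], y[idx]
            k = len(idx)
            for _ in range(m):
                x_adv = xb + delta[:k]
                p = 1.0 / (1.0 + np.exp(-(x_adv @ w + b)))
                err = p - yb  # derivative of the cross-entropy loss w.r.t. the logit

                # One pass yields gradients for the parameters and the input.
                grad_w = x_adv.T @ err / k
                grad_b = err.mean()
                grad_x = np.outer(err, w)  # per-example input gradient

                # Descend on the parameters ...
                w -= lr * grad_w
                b -= lr * grad_b
                # ... and ascend on the perturbation, kept inside the eps-ball.
                delta[:k] = np.clip(delta[:k] + eps * np.sign(grad_x), -eps, eps)
    return w, b, delta
```

Calling free_adversarial_training(X, y) on a feature matrix X of shape (n, d) with 0/1 labels y returns the learned weights, the bias, and the final perturbation. In a deep-network setting the same loop structure would presumably rely on a framework's automatic differentiation rather than the hand-written gradients used here.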
Examining Committee:
Chair: Dr. Tom Goldstein
Dean's rep: Dr. Tudor Dumitras
Members: Dr. John Dickerson
Dr. Furong Huang
Dr. Gavin Taylor
Bio
Ali Shafahi is a Ph.D. candidate in computer science advised by Tom Goldstein. His research interests are machine learning and operations research. His recent work focuses on adversarial machine learning.
This talk is organized by Tom Hurst