PhD Defense: Towards Reliable and Efficient Representation Learning
Chen Zhu
Friday, May 13, 2022, 3:30-5:30 pm
Abstract
Large-scale representation learning has achieved enormous success during the past decade, surpassing human-level accuracy on a range of benchmarks including image recognition and language understanding. This success is supported by advances in both algorithms and computing capabilities, which enable training large models on enormous amounts of data. While performance continues to improve on existing benchmarks as models and training datasets grow, the reliability and efficiency of large models are often questioned when they are deployed in practice. Unvetted datasets may have been poisoned to manipulate model behavior, while practical deployment requires models to be trained or updated quickly on the latest data and to have low inference latency.

In this talk, I will introduce our work on improving the reliability and efficiency of representation learning. On reliability, we study the threats of data poisoning and evasion attacks and how to defend against them. We propose a more vicious targeted clean-label poisoning attack that is highly effective even when the target architecture is unknown. To defend against such threats, we develop a k-NN-based method that filters poison examples out of the training set in feature space, which effectively reduces the success rate of poisoning attacks at a negligible cost in accuracy.
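
The k-NN filtering idea can be illustrated with a small sketch: an example is flagged as a potential poison when its label disagrees with the majority label of its k nearest neighbors in feature space. The feature vectors, toy dataset, and parameter values below are illustrative placeholders, not the exact method from the talk.

import numpy as np

def knn_poison_filter(features, labels, k=5):
    # Return a boolean mask over the training set (True = keep the example).
    # Pairwise squared Euclidean distances between feature vectors.
    dists = ((features[:, None, :] - features[None, :, :]) ** 2).sum(-1)
    np.fill_diagonal(dists, np.inf)            # exclude each point from its own neighborhood
    keep = np.ones(len(labels), dtype=bool)
    for i in range(len(labels)):
        neighbors = np.argsort(dists[i])[:k]   # indices of the k nearest neighbors
        majority = np.bincount(labels[neighbors]).argmax()
        keep[i] = (labels[i] == majority)      # keep only label-consistent examples
    return keep

# Toy usage: two well-separated clusters, with one label flipped to act as a poison.
rng = np.random.default_rng(0)
feats = np.vstack([rng.normal(0.0, 0.1, (20, 2)), rng.normal(3.0, 0.1, (20, 2))])
labs = np.concatenate([np.zeros(20, dtype=int), np.ones(20, dtype=int)])
labs[5] = 1                                    # flip one label to simulate a poisoned example
mask = knn_poison_filter(feats, labs, k=5)
print(np.where(~mask)[0])                      # index 5 should be flagged for removal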

On efficiency, our study focuses on three dimensions: data efficiency, convergence speed, and computational complexity. For data efficiency, we propose enhanced adversarial training algorithms as a general data augmentation technique that improves generalization given the same amount of labeled data, and we demonstrate its efficacy for Transformer models on language understanding and vision-and-language tasks, as well as for Graph Neural Networks. For convergence speed, we propose an automated initialization scheme that accelerates the convergence of convolutional networks for image recognition and of Transformers for machine translation. For computational complexity, to scale Transformers to long sequences, we propose a linear-complexity attention mechanism that improves efficiency while preserving the performance of full attention on a range of language and vision tasks.
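
As one illustration of the data-efficiency direction, the sketch below shows the generic recipe behind adversarial training as data augmentation: perturb the input embeddings in the direction that increases the loss, then train on the perturbed inputs. The tiny model, random data, and single-step perturbation are placeholder assumptions for illustration; the algorithms presented in the talk are more elaborate.

import torch
import torch.nn as nn

torch.manual_seed(0)
model = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
loss_fn = nn.CrossEntropyLoss()
eps = 0.1                                  # illustrative perturbation budget

x = torch.randn(8, 16)                     # stand-in for input embeddings
y = torch.randint(0, 2, (8,))              # stand-in labels

# 1) Gradient of the loss with respect to the embeddings.
x_adv = x.clone().requires_grad_(True)
grad, = torch.autograd.grad(loss_fn(model(x_adv), y), x_adv)

# 2) Single ascent step on the embeddings (an FGSM-style perturbation).
delta = (eps * grad.sign()).detach()

# 3) Update the model on the perturbed embeddings as augmented training data.
opt.zero_grad()
loss_fn(model(x + delta), y).backward()
opt.step()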

Examining Committee:
Chair: Dr. Tom Goldstein
Dean's Representative: Dr. Behtash Babadi
Members: Dr. David Jacobs
         Dr. Furong Huang
         Dr. Rachel Rudinger
         Dr. John P. Dickerson
Bio

Chen Zhu joined the Ph.D. program in Computer Science at UMD in 2018. His adviser is Prof. Tom Goldstein. His research focuses on developing better algorithms to enhance the efficiency and robustness of neural networks for applications in Computer Vision and Natural Language Processing. Before coming to UMD, he obtained a master's degree from ShanghaiTech University, where he worked with Prof. Kewei Tu and Prof. Yi Ma, and a bachelor's degree from Beihang University. He has also worked as a research intern at Google, Nvidia, Microsoft, Baidu, and Intel.

This talk is organized by Tom Hurst