PhD Defense: Effective Training and Efficient Inference of Deep Neural Networks for Visual Understanding

Hengduo Li

5105 Brendan Iribe Center for Computer Science and Engineering (IRB)

Monday, June 6, 2022, 11:00 am-1:00 pm

Abstract

Since the phenomenal success of deep neural networks (DNNs) on image classification, the research community has been developing wider and deeper networks with complex components for a variety of visual understanding tasks. While such "heavy" models achieve excellent performance, they pose two main challenges: (1) training requires significant computational resources as well as large-scale labeled datasets acquired through a time-consuming and labor-intensive human annotation process; and (2) inference can be slow even with expensive graphics cards due to the high model complexity. To address these challenges, we explore improving the effectiveness of training DNNs, so that better performance is achieved under the same computation and/or annotation cost, and improving the efficiency of inference, so that the computational cost of DNNs is reduced while high accuracy is maintained.

In this dissertation, we first propose several approaches for training object recognition and detection models more effectively, including devising noise-aware supervisory signals, developing better semi-supervised learning methods, and analyzing different pre-training techniques. In the second part, we present two adaptive computation frameworks that improve the inference efficiency of 3D convolutional networks and attention-based Vision Transformers for the tasks of image and video classification.

Examining Committee:

Chair: Dr. Larry S. Davis
Co-Chair: Dr. Abhinav Shrivastava
Dean's Representative: Dr. Joseph F. JaJa
Members: Dr. Matthias Zwicker, Dr. David Jacobs

Bio

Hengduo Li is a Ph.D. candidate in Computer Science advised by Professor Larry S. Davis. His research interests lie in improving the effectiveness and efficiency of training and inference of deep networks for computer vision, with a primary focus on the topics of recognition and detection in images and videos.

This talk is organized by Tom Hurst