PhD Proposal: Effective Training and Efficient Inference of Deep Neural Networks for Visual Understanding
Hengduo Li
Remote
Abstract
Since deep neural networks delivered phenomenal improvements on large-scale image classification, considerable research attention has focused on improving the effectiveness (in terms of performance on target tasks) and efficiency (in terms of inference speed) of networks for visual understanding. Apart from designing effective and efficient network architectures, it is worth exploring how to train them effectively and how to perform inference with them efficiently, particularly since these directions could be complementary to most architectural changes to deep networks.
To this end, we present methods following these two directions for effective and efficient visual understanding, primarily for the tasks of image/video classification and detection. In particular, in the first part we explore designing better supervisory signals for training object detectors, and in the second part we analyze different pre-training / transfer learning strategies for various downstream vision tasks. In the third part, we propose a dynamic computation framework that adaptively allocates computational resources on a per-input basis, selectively keeping input frames and activating 3D convolutional layers, for efficient inference in video classification.
Examining Committee:
Chair: Dr. Larry S. Davis
Dept rep: Dr. Matthias Zwicker
Members: Dr. Abhinav Shrivastava
Bio
Hengduo Li is a Ph.D. student in the Department of Computer Science, working on computer vision with Prof. Larry S. Davis. His research interests lie in object/video detection and classification.
This talk is organized by Tom Hurst