PhD Proposal: Pruning for Efficient Deep Learning: From CNNs to Generative Models
Alireza Ganjdanesh
Friday, January 26, 2024, 10:00-11:30 am
Abstract

My research focuses on developing novel model pruning techniques and efficient architecture search methods for deep computer vision models spanning visual recognition and generative modeling. Despite the impressive achievements of deep learning models in computer vision, their tremendous memory and computational requirements make cloud deployment costly for private companies and hinder their usage on resource-constrained edge devices. Thus, model pruning is a crucial step before deploying deep models in real-world applications.

On the recognition side, we first introduce a pruning method based on the interpretations of a model's decisions. Previous pruning ideas mostly focus on a model's 'weights' (estimating the importance of sub-structures) or its 'outputs' (knowledge distillation) to reduce its size. Instead, we focus on another dimension, the model's 'inputs', and argue that the pruned model should have decision criteria similar to the original one's for each input sample.

Second, we develop an efficient differentiable architecture search method to find optimal kernel sizes for the convolution layers of CNN classifiers. Previous architecture search methods are usually resource-intensive, and they mainly target top-1 accuracy on ImageNet, which may even correlate negatively with performance on other tasks. We show that the few existing works for learning CNN kernel sizes either degrade the model's performance or cannot scale to larger input image sizes. We develop a framework that jointly trains two models, called the size predictor and the kernel predictor, to determine the kernel sizes in a fraction of the original model's training time.

Lastly, on the generative modeling side, we propose a pruning method for image-to-image translation Generative Adversarial Networks (GANs) that encourages the pruned model to preserve the local density structure of neighborhoods on the original model's data manifold. We show that previous approaches, which mainly adapt pruning techniques designed for classifiers, fail to prune GANs and lead to mode collapse. We design an adversarial pruning scheme in which two pruning agents collaboratively prune both the generator and the discriminator. By doing so, our method preserves the balance between the generator and discriminator better than the baselines and achieves higher performance.
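To make the interpretation-based idea concrete, here is a minimal, purely illustrative sketch (not the talk's actual method) for a toy linear scorer f(x) = w @ x. The "interpretation" is taken to be the simple gradient-times-input attribution, and a penalty measures how far the pruned model's attributions drift from the original model's on each input sample; all function names here are hypothetical.

```python
import numpy as np

def attribution(w, x):
    # For a linear scorer f(x) = w @ x, the input gradient is w,
    # so the gradient-times-input attribution is w * x (elementwise).
    return w * x

def interpretation_loss(w_orig, w_pruned, inputs):
    # Mean squared difference between the two models' per-sample
    # attributions: small when the pruned model "looks at" the same
    # input features as the original model.
    return float(np.mean([np.sum((attribution(w_orig, x) -
                                  attribution(w_pruned, x)) ** 2)
                          for x in inputs]))

def magnitude_prune(w, keep):
    # Zero out all but the `keep` largest-magnitude weights.
    mask = np.zeros_like(w)
    mask[np.argsort(np.abs(w))[-keep:]] = 1.0
    return w * mask

w = np.array([3.0, -0.1, 2.0, 0.05])
xs = [np.array([1.0, 1.0, 1.0, 1.0]), np.array([0.5, -1.0, 2.0, 0.0])]

w_good = magnitude_prune(w, keep=2)          # keeps the decision-driving weights
w_bad = w * np.array([0.0, 1.0, 0.0, 1.0])   # keeps only unimportant weights

# The pruning that preserves the original attributions scores lower.
assert interpretation_loss(w, w_good, xs) < interpretation_loss(w, w_bad, xs)
```

In a real CNN the attribution would come from a saliency method applied to intermediate feature maps rather than a closed-form gradient, and the penalty would be added to the pruning objective, but the comparison logic is the same.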

Finally, we preview two of our recent projects that aim to develop pruning methods for diffusion probabilistic models. The first prunes a pretrained diffusion model into a mixture of efficient experts; the second performs prompt-based pruning for text-to-image diffusion models.

Bio

Alireza Ganjdanesh is a Ph.D. student in the Computer Science department at the University of Maryland, College Park, advised by Dr. Heng Huang. His research focuses on the efficiency of deep learning models and on reducing their deployment cost in real-world applications, for both discriminative and generative models.

This talk is organized by Migo Gui