PhD Proposal: Object-Attribute Compositionality for Visual Understanding

Nirat Saini

4105 Brendan Iribe Center for Computer Science and Engineering (IRB), or https://umd.zoom.us/my/niratsaini?pwd=eFh1aWNMV1FYSFBEdWtIM0FDTWVadz09

Wednesday, April 26, 2023, 1:00-3:00 pm

Abstract

Object appearances evolve over time, producing visually discernible changes in their colors, shapes, sizes, and materials. Humans are innately good at recognizing and understanding the evolution of object states, an ability that is also crucial for visual understanding across images and videos. However, current vision models still struggle to capture and account for these subtle changes when recognizing objects and the underlying actions that cause the changes.

This thesis focuses on recognizing objects along with their states (also referred to as attributes) using compositional learning. First, we propose to disentangle visual features for objects and attributes, in order to generalize recognition to novel object-attribute pairs. Next, we extend this approach to learn entirely unseen object-attribute pairs by using label smoothing and propagation techniques. Further, we use object states for action recognition in videos, where subtle changes in object attributes and affordances help in identifying state-modifying and context-transforming actions. All of these methods for decomposing and composing objects and states generalize to unseen pairs and out-of-domain datasets for various compositional zero-shot learning and action recognition tasks. Finally, we introduce the task of Compositional Image Generation and discuss the implications of these approaches for other compositional tasks in images, videos, and beyond.

Examining Committee

Chair:	Dr. Abhinav Shrivastava
Department Representative:	Dr. Ramani Duraiswami
Members:	Dr. Pratap Tokekar
	Dr. Ishan Misra (Meta AI)

Bio

Nirat Saini is a PhD student at the University of Maryland, College Park, advised by Prof. Abhinav Shrivastava. Her research focuses on using compositional learning techniques to recognize the different shapes and forms of objects, and on leveraging those object states to build visual understanding models. Her areas of interest are zero-shot and few-shot learning, compositional learning, action recognition, and video understanding.

This talk is organized by Tom Hurst