Talks

Leveraging Multimodal Foundation Models for Exploring the 3D World

Hadar Averbuch-Elor

IRB 4105 or https://umd.zoom.us/j/95853135696?pwd=VVEwMVpxeElXeEw0ckVlSWNOMVhXdz09

Thursday, March 14, 2024, 1:00-2:00 pm

Abstract

Multimodal foundation models have recently shown great promise for a wide array of tasks involving images, text and 3D geometry, pushing the boundaries of what was considered impossible just a few years ago. However, these models are not easily interpretable, hindering their potential usage and adaptation for various tasks.

In this talk, I will present an ongoing line of research that leverages multimodal foundation models for semantic editing of 3D objects, and demonstrate the paradigm shift between how we addressed this task several years ago and how we address it today in the presence of these powerful models. By considering both their internal mechanisms and their functional subparts in a standalone manner, our work also provides new angles for understanding what these multimodal foundation models learn. Finally, I will discuss several future directions.

Bio

Hadar Averbuch-Elor is an Assistant Professor at the School of Electrical Engineering at Tel Aviv University. Before that, Hadar was a postdoctoral researcher at Cornell Tech, working with Noah Snavely. She completed her PhD in Electrical Engineering at Tel Aviv University, where she was advised by Daniel Cohen-Or. Hadar is a recipient of multiple awards, including the Zuckerman Postdoctoral Scholar Fellowship and the Schmidt Postdoctoral Award for Women in Mathematical and Computing Sciences. She was also awarded the Alon Scholarship for the Integration of Outstanding Faculty by Israel's Council for Higher Education, and was selected as a Rising Star in EECS by UC Berkeley. Hadar's research interests lie at the intersection of computer graphics and computer vision, particularly in combining pixels with more structured modalities, such as natural language and 3D geometry.

This talk is organized by Samuel Malede Zewdu.