Multimodal Large Language Models (MLLMs) have achieved impressive performance across a wide array of tasks. However, these models are prone to hallucinations, that is, outputs that are inconsistent with the visual input or with established facts, which compromise their reliability. This thesis explores the phenomenon of hallucinations in MLLMs, focusing on their identification, underlying causes, and mitigation strategies.
We first propose a systematic evaluation framework to quantify and analyze hallucinations across multiple modalities, leveraging diverse metrics tailored to real-world scenarios. Building on this foundation, we introduce novel mitigation strategies that combine architectural improvements, fine-tuning techniques, and data augmentation to reduce hallucination rates without sacrificing model versatility. Finally, we identify open challenges and outline future research directions. This work provides a comprehensive roadmap for understanding and addressing hallucinations in MLLMs, contributing to the broader goal of enhancing the robustness and reliability of AI systems.
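As a minimal illustrative sketch (not the specific framework described above), the core quantity in such an evaluation is a hallucination rate: the fraction of model outputs whose claims are not supported by the visual evidence. The snippet below uses hypothetical data structures and helper names to show how claim-level and response-level rates might be computed once each response has been split into atomic claims and judged for grounding.

```python
# Minimal sketch: computing hallucination rates from binary judgments of
# whether each claim in a model response is supported by the image.
# The data structures and example data here are hypothetical.
from dataclasses import dataclass
from typing import List


@dataclass
class JudgedResponse:
    """One model response, split into atomic claims with support labels."""
    response_id: str
    claim_supported: List[bool]  # True if the claim is grounded in the image


def claim_level_rate(responses: List[JudgedResponse]) -> float:
    """Fraction of claims (across all responses) not supported by the image."""
    total_claims = sum(len(r.claim_supported) for r in responses)
    if total_claims == 0:
        return 0.0
    unsupported = sum(
        sum(1 for ok in r.claim_supported if not ok) for r in responses
    )
    return unsupported / total_claims


def response_level_rate(responses: List[JudgedResponse]) -> float:
    """Fraction of responses containing at least one unsupported claim."""
    if not responses:
        return 0.0
    hallucinated = sum(1 for r in responses if not all(r.claim_supported))
    return hallucinated / len(responses)


if __name__ == "__main__":
    # Toy example: two judged responses.
    judged = [
        JudgedResponse("r1", [True, True, False]),  # one unsupported claim
        JudgedResponse("r2", [True, True]),          # fully grounded
    ]
    print(f"claim-level hallucination rate:    {claim_level_rate(judged):.2f}")
    print(f"response-level hallucination rate: {response_level_rate(judged):.2f}")
```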
Fuxiao Liu is a Ph.D. candidate in the Computer Science Department at the University of Maryland, College Park, advised by Abhinav Shrivastava and Yaser Yacoob. His research focuses on developing customizable multimodal large language models that align with human intent. He has published several representative works, including HallusionBench, NVEagle, LRV-Instruction, MMC, and Visual News. After graduation, he will join NVIDIA ADLR as a Research Scientist.