The rapid deployment of machine learning systems in safety-critical and high-impact applications has amplified the need for models that are both trustworthy and efficient. In this talk, I will present two complementary lines of research toward this goal. First, I will discuss advances in adversarial robustness for both classification and retrieval models, highlighting new algorithmic and theoretical insights that improve resilience to adversarial perturbations while preserving accuracy and scalability. Second, I will introduce recent progress on watermarking techniques for large-scale foundation models, which aim to enable reliable attribution and responsible use of model outputs without degrading their quality. Together, these directions underscore a broader vision: developing principled methods that enhance the safety, reliability, and accountability of modern machine learning systems.
Yihan Wu is a Ph.D. candidate in the Department of Computer Science at the University of Maryland, College Park, advised by Prof. Heng Huang. His research centers on developing safe and reliable machine learning models and algorithms with strong theoretical foundations, spanning adversarial robustness, safety in foundation models, and efficiency in large-scale systems. His work has appeared in top-tier machine learning and AI venues, including NeurIPS, ICML, ICLR, KDD, AAAI, and ACL.