PhD Proposal: Towards Safety and Trust in Large Language Models in Medicine
Yifan Yang
Abstract
Large Language Models (LLMs) have recently gained significant attention in the medical field for their human-level capabilities, sparking considerable interest in their potential applications across healthcare. In addition to proposing guidelines and conducting reviews of these applications, we have devoted effort to applying LLMs in medicine, including matching patients to clinical trials, augmenting LLMs with domain-specific tools for improved access to biomedical information, and empowering language agents for risk prediction through large-scale clinical tool learning.
Despite their promise, real-world adoption faces critical challenges, with risks in practical settings that have not been systematically characterized. In this proposal, we identify and quantify biases in LLM-generated medical reports, specifically uncovering disparities affecting patients of different racial backgrounds. Using real-world patient data, we further show that both open-source and proprietary LLMs can be manipulated across multiple tasks, underscoring the need for rigorous evaluation. To address these challenges, we propose five core principles for safe and trustworthy medical AI—Truthfulness, Resilience, Fairness, Robustness, and Privacy—along with ten specific evaluative criteria. Under this framework, we introduce a comprehensive benchmark, featuring 1,000 expert-verified questions to rigorously assess LLM performance in sensitive clinical contexts.
Through these efforts, we present existing results and propose future research directions aimed at ensuring that LLMs in healthcare are both safe and trustworthy.
Bio
Yifan Yang is a PhD student in the Computer Science Department at the University of Maryland (UMD) and a visiting fellow at the National Center for Biotechnology Information (NCBI), National Library of Medicine (NLM), National Institutes of Health (NIH). His research centers on trustworthy biomedical AI, with a particular focus on the application of large language models (LLMs) in the biomedical domain.
This talk is organized by Migo Gui