Large Language Models (LLMs) have emerged as powerful tools in the medical domain, demonstrating human-level capabilities that enable applications ranging from clinical trial matching to risk prediction and biomedical knowledge retrieval. These advances position LLMs as promising agents in healthcare, yet their safe and trustworthy deployment remains hindered by critical challenges, including bias, limited robustness, and risks to patient privacy.
We investigate the limitations of LLMs in clinical contexts through both empirical analysis and system design. Using real-world patient data, we uncover racial disparities in LLM-generated medical reports and demonstrate the susceptibility of both proprietary and open-source models to manipulation across multiple clinical tasks. To address these concerns, we propose a principled framework for trustworthy medical AI grounded in five core principles: Truthfulness, Resilience, Fairness, Robustness, and Privacy. Within this framework, we introduce a comprehensive benchmark of 1,000 expert-verified clinical questions designed to assess model behavior under sensitive scenarios. We further extend the benchmark with open-ended question formats, finding that LLM performance remains largely consistent with that in multiple-choice settings, indicating persistent risks across interaction modalities.
Recognizing privacy as a cornerstone of safety, we examine the memorization behavior of LLMs in clinical settings. Through controlled experiments, we identify key hyperparameters associated with memorization and propose a novel inference-time method that significantly reduces memorization risk while preserving performance on medical tasks, without requiring model retraining.
Together, these efforts lay the foundation for future research aimed at ensuring LLMs in healthcare are not only high-performing, but also equitable, privacy-preserving, and aligned with the ethical standards required for safe and responsible clinical deployment.
Yifan Yang is a Ph.D. candidate at the University of Maryland, College Park, and a visiting fellow at the National Library of Medicine, National Institutes of Health (NIH). His research lies at the intersection of biomedicine and computer science, with a focus on building trustworthy and safe artificial intelligence systems, particularly large language models (LLMs), for biomedical applications.