https://umd.zoom.us/j/
The rise of large foundational models has underscored the critical challenge of ensuring their alignment with human values and intentions. Two fundamental elements, data and algorithms, play pivotal roles in advancing this alignment.

First, I will introduce a data-centric approach that improves large language models (LLMs) by using a strong LLM to filter out low-quality training data. The method significantly reduces training time, offers cost savings, and proves effective across different datasets, base models, and LLM filters, emphasizing the importance of data quality over quantity in instruction tuning.

Second, I will introduce a reward-model training algorithm that tackles reward hacking in Reinforcement Learning from Human Feedback (RLHF), where models generate overly verbose but less meaningful responses to exploit the reward system for higher scores. We also propose a new evaluation protocol that accurately measures the trade-off between response length and evaluation score. Experiments demonstrate that the method effectively reduces the correlation between reward and verbosity, leading to genuinely more helpful model outputs without excessive length.
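As a rough illustration of the data-filtering idea described above (not the speaker's exact method), the first sketch below scores each instruction-response pair with a strong LLM acting as a judge and keeps only examples above a quality threshold; `judge_quality` and the 1-5 scoring scale are assumptions standing in for whatever prompting and rating scheme the actual approach uses.

```python
from typing import Callable


def filter_instruction_data(
    examples: list[dict],
    judge_quality: Callable[[str, str], float],  # hypothetical LLM-judge scorer (assumed interface)
    threshold: float = 4.5,                      # assumed cutoff on a 1-5 quality scale
) -> list[dict]:
    """Keep only instruction-tuning examples rated highly by a strong LLM judge."""
    kept = []
    for ex in examples:
        # Score the (instruction, response) pair; higher means better quality.
        score = judge_quality(ex["instruction"], ex["response"])
        if score >= threshold:
            kept.append(ex)
    return kept
```

Similarly, one minimal way to quantify the length-reward trade-off discussed in the second part is to measure the correlation between response length and reward-model score; this is only a simple proxy for the proposed evaluation protocol, and it assumes you already have paired lists of responses and their rewards.

```python
import numpy as np


def length_reward_correlation(responses: list[str], rewards: list[float]) -> float:
    """Pearson correlation between response length (approximated by whitespace
    token count) and reward score; a value near 0 suggests the reward model is
    not simply favoring verbosity."""
    lengths = np.array([len(r.split()) for r in responses], dtype=float)
    scores = np.array(rewards, dtype=float)
    return float(np.corrcoef(lengths, scores)[0, 1])
```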
Bio:
Lichang Chen is a fourth-year CS PhD student at the University of Maryland, College Park. His research interests lie in better aligning large foundational models (e.g., omni-modal language models) with humans. His previous research covers algorithms, data, and evaluation for large foundational models.