Talks

CleanRL / Cleanba: The most readable and hackable RL codebase

IRB-5105 Brendan Iribe Center for Computer Science and Engineering (IRB)

Tuesday, April 23, 2024, 4:00-5:00 pm

Abstract

The talk will cover CleanRL and Cleanba. CleanRL is a highly transparent and hackable deep reinforcement learning (DRL) library. By using single-file implementations, CleanRL needs significantly fewer lines of code for its DRL implementations than many other libraries. CleanRL also takes an interesting RLops approach to ensure that no regression is introduced during refactoring. The CleanRL-style approach can scale, too: Cleanba is a sister codebase that focuses on distributed DRL. Cleanba outperforms torchbeast and moolib while addressing reproducibility issues that arise in distributed DRL.

Bio

Costa Huang is a Machine Learning Engineer at Hugging Face working on RLHF and RL. He holds a Ph.D. from Drexel University, with a focus on reproducible and efficient deep reinforcement learning. Notably, he is the creator of CleanRL, a researcher-friendly RL library. He also specializes in demystifying the implementation details of modern DRL algorithms; for example, he is the lead author of the blog post "The 37 Implementation Details of Proximal Policy Optimization".

Note: Please register using the Google Form on our website, https://go.umd.edu/marl, for access to the Google Meet, the Open-source Multi-Agent AI Research Community, and talk resources.

This talk is organized by Saptarashmi Bandyopadhyay