The Offline Multi-agent Reinforcement Learning (MARL) Coordination Problem
Tuesday, October 24, 2023, 4:00-5:00 pm
Registration requested: The organizer of this talk requests that you register if you are planning to attend.

Abstract

Training multiple agents to coordinate is an important problem with applications in robotics, game theory, economics, and social sciences. However, most existing Multi-Agent Reinforcement Learning (MARL) methods are online and thus impractical for real-world applications in which collecting new interactions is costly or dangerous. While these algorithms should leverage offline data when available, doing so gives rise to the offline coordination problem. Specifically, we identify and formalize the strategy agreement (SA) and the strategy fine-tuning (SFT) challenges, two coordination issues at which current offline MARL algorithms fail. To address these challenges, we propose a simple model-based approach that generates synthetic interaction data and enables agents to converge on a strategy while fine-tuning their policies accordingly. Our resulting method, Model-based Offline Multi-Agent Proximal Policy Optimization (MOMA-PPO), outperforms prevalent learning-based methods on challenging offline multi-agent MuJoCo tasks, even under severe partial observability and with learned world models.
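To make the core idea concrete, here is a minimal toy sketch (not the speaker's code) of the model-based mechanism the abstract describes: a world model fit on offline data is rolled forward to generate synthetic multi-agent interactions, which the agents can then use to agree on and fine-tune a joint strategy. All function names, the toy dynamics, and the coordination reward below are illustrative assumptions.

```python
# Toy sketch of model-based synthetic data generation for offline MARL.
# Everything here (dynamics, reward, policies) is hypothetical, chosen only
# to illustrate the rollout loop, not to reproduce MOMA-PPO.

def learned_world_model(state, joint_action):
    """Stand-in for a world model fit on the offline dataset.
    Toy deterministic dynamics with a shared coordination reward."""
    next_state = state + sum(joint_action)
    # Agents are rewarded only when their actions match (coordination).
    reward = 1.0 if joint_action[0] == joint_action[1] else 0.0
    return next_state, reward

def generate_synthetic_rollout(policies, start_state, horizon=5):
    """Roll the learned model forward to produce synthetic interaction
    data shared by all agents, enabling offline fine-tuning."""
    state, rollout = start_state, []
    for _ in range(horizon):
        joint_action = [pi(state) for pi in policies]
        next_state, reward = learned_world_model(state, joint_action)
        rollout.append((state, tuple(joint_action), reward, next_state))
        state = next_state
    return rollout

# Two toy policies that have converged on the same action.
policies = [lambda s: 1, lambda s: 1]
data = generate_synthetic_rollout(policies, start_state=0)
```

In the actual method, the synthetic transitions would feed a PPO-style update for each agent; this sketch only shows where that data comes from.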

Bio

Paul Barde is a Ph.D. candidate at McGill University and Mila (Quebec AI Institute) co-supervised by Prof. Derek Nowrouzezahrai and Prof. Christopher Pal.

He works on sequential decision-making, focusing on multi-agent problems such as the emergence of communication, brain-computer interfaces, and coordination and cooperation challenges. He is keen to leverage data and simulators, when available, through model-based planning and learning, or through imitation, inverse, and offline reinforcement learning approaches.

Paul's research goal is to leverage data-driven multi-agent sequential decision-making to model intricate problems and assist us with complex decision processes. He is particularly interested in applying agent-based modeling and mechanism design approaches to biodiversity conservation challenges.

During his Ph.D., he has worked at Ubisoft La Forge with Dr. Olivier Delalleau, at INRIA (National Institute for Research in Digital Science and Technology) with Prof. Pierre-Yves Oudeyer, and most recently at FAIR (Meta Fundamental AI Research) with Prof. Amy Zhang.

Note: Please register using the Google Form on our website https://go.umd.edu/marl for access to the Google Meet and talk resources.

This talk is organized by Saptarashmi Bandyopadhyay