Talks

PhD Defense: Multi-Agent Reinforcement Learning: Systems for Evaluation and Applications To Complex Systems

Jordan Terry

Online: https://umd.zoom.us/j/2920984437

Wednesday, January 18, 2023, 12:30-2:30 pm

Abstract

Reinforcement learning is a field of artificial intelligence that studies methods for agents to learn, by trial and error, which actions to take in a given system. Famous examples include learning to control real robots and achieving superhuman performance in the majority of the most popular and challenging games played by humans.

To conduct research in this space, researchers use standardized "environments", such as robotics simulations or video games, to evaluate the performance of learning methods. This thesis covers three such tools. The first is PettingZoo, a now widely used library that offers a standardized API and a set of reference environments for multi-agent reinforcement learning. The second is SuperSuit, a library that offers easy-to-use standardized preprocessing wrappers for interfacing with learning libraries. The third is a set of extensions to the Arcade Learning Environment (a popular tool that reinforcement learning researchers use to interact with Atari 2600 games) that add support for multiplayer game modes.

Building on these tools, this thesis also applies multi-agent reinforcement learning to develop a new tool for natural science research. Emergent behaviors are the coordinated behaviors of groups of agents, such as pedestrians in a crosswalk, birds in flocking formations, cars in traffic, or traders in the stock market, and they represent some of the most important phenomena that we generally don't understand across many fields of science. In this work, we introduce the first mathematical formalism for the systematic search of all possible good ("mature") emergent behaviors within a multi-agent system through multi-agent reinforcement learning (MARL), and create a naive implementation of this search via deep reinforcement learning that can be applied in arbitrary environments. We show that in 12 multi-agent systems, this naive method is able to find over a hundred total emergent behaviors, the majority of which were previously unknown to the environment authors. Such methods could help answer various types of open scientific questions, such as "What behaviors are possible in this system?", "What specific conditions in this system allow for this kind of emergent behavior?", or "How can I change this system to prevent this emergent behavior?"

Examining Committee

Chair:	Dr. John Dickerson
Dean's Representative:	Dr. Dan Lathrop
Members:	Dr. Dinesh Manocha
	Dr. Furong Huang
	Dr. Irina Rish

Bio

Jordan Terry is a PhD student specializing in reinforcement learning. During their PhD, they created PettingZoo, took over maintenance of Gym from OpenAI, founded Swarm Labs, and founded the Farama Foundation, which has gone on to take over and manage the majority of the major open source reinforcement learning libraries in the world (details available here: https://farama.org/Announcing-The-Farama-Foundation).

This talk is organized by Tom Hurst