PhD Defense: Applying Policy Gradient Methods to Open-Ended Domains

Ryan Sullivan

IRB-5105 or https://umd.zoom.us/j/93063337480

Monday, April 7, 2025, 2:00-4:00 pm

Abstract

Deep reinforcement learning (RL) has been successfully used to train agents in complex video game environments including Starcraft 2, Dota 2, Minecraft, and Gran Turismo. Each of these projects utilized curriculum learning to train agents more efficiently. However, systematic investigations of curriculum learning are limited and it is rarely studied outside of toy research environments. Modern RL methods still struggle in stochastic, sparse-reward environments with long planning horizons. This thesis studies these challenges from multiple perspectives to develop a stronger empirical understanding of curriculum learning in complex environments. By introducing novel visualization techniques for reward surfaces and empirically investigating key implementation details, it explores why policy gradient methods alone are insufficient for sparse-reward tasks. These findings motivate the use of curriculum learning to decompose problems into learnable subtasks and to prioritize learnable objectives. Building on these insights, this dissertation presents a general-purpose library for curriculum learning and uses it to evaluate popular automatic curriculum learning algorithms on challenging RL environments. Curricula have historically been effective for training reinforcement learning agents, and a fundamental understanding of automatic curriculum learning is an essential step toward developing generally capable agents in open-ended environments.

Bio

Ryan Sullivan is a CS PhD student advised by John Dickerson. His research centers on using reinforcement learning to play complex video games, focusing mainly on curriculum learning and open-ended algorithms. He has interned at Amazon, Google, and Sony AI and contributed to multiple open-source reinforcement learning libraries including Gymnasium, PettingZoo, and the Open RL benchmark.

This talk is organized by Migo Gui