Reinforcement Learning in Two-Player Zero-Sum Games
Tuesday, April 4, 2023, 4:00-5:00 pm
Registration requested: The organizer of this talk requests that you register if you are planning to attend.

Abstract

This work studies an algorithm, which we call magnetic mirror descent, that is inspired by mirror descent and the non-Euclidean proximal gradient algorithm. Our contribution is demonstrating the virtues of magnetic mirror descent both as an equilibrium solver and as an approach to reinforcement learning in two-player zero-sum games. These virtues include: 1) being the first quantal response equilibrium solver to achieve linear convergence for extensive-form games with first-order feedback; 2) being the first standard reinforcement learning algorithm to achieve results empirically competitive with CFR in tabular settings; and 3) achieving favorable performance in 3x3 Dark Hex and Phantom Tic-Tac-Toe as a self-play deep reinforcement learning algorithm.
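
For readers who want a concrete picture before the talk, the following is a minimal sketch, not the speaker's implementation. It assumes a negative-entropy mirror map, under which a magnet-regularized mirror descent step has a closed-form multiplicative update, and applies it to a small matrix game. The game (rock-paper-scissors), the uniform magnet, the step size `eta`, the regularization strength `alpha`, and all function names are illustrative assumptions that do not come from the abstract.

```python
import math

def mmd_step(policy, payoff_grad, magnet, eta, alpha):
    """One magnet-regularized multiplicative update on the simplex (assumed form):
    new ∝ (policy * magnet**(alpha*eta) * exp(eta*payoff_grad)) ** (1 / (1 + alpha*eta))
    """
    power = 1.0 / (1.0 + alpha * eta)
    logits = [
        power * (math.log(p) + alpha * eta * math.log(m) + eta * g)
        for p, m, g in zip(policy, magnet, payoff_grad)
    ]
    top = max(logits)  # subtract the max for numerical stability
    weights = [math.exp(l - top) for l in logits]
    total = sum(weights)
    return [w / total for w in weights]

def solve_matrix_game(A, x, y, magnet_x, magnet_y, eta=0.1, alpha=0.5, steps=1000):
    """Simultaneous updates for a zero-sum matrix game in which the row player
    maximizes x^T A y and the column player minimizes it."""
    rows, cols = len(A), len(A[0])
    for _ in range(steps):
        # Expected payoff of each action against the opponent's current policy.
        gx = [sum(A[i][j] * y[j] for j in range(cols)) for i in range(rows)]
        gy = [-sum(A[i][j] * x[i] for i in range(rows)) for j in range(cols)]
        x, y = (mmd_step(x, gx, magnet_x, eta, alpha),
                mmd_step(y, gy, magnet_y, eta, alpha))
    return x, y

# Illustrative game: rock-paper-scissors payoffs for the row player.
RPS = [[0, -1, 1],
       [1, 0, -1],
       [-1, 1, 0]]
uniform = [1 / 3, 1 / 3, 1 / 3]

x, y = solve_matrix_game(RPS, [0.6, 0.3, 0.1], [0.2, 0.2, 0.6], uniform, uniform)
print(x, y)
```

A fixed point of this update satisfies x ∝ magnet · exp(payoffs / alpha), which is a logit-style quantal response when the magnet is uniform. In this symmetric example both policies should approach the uniform distribution.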

 

 

Bio

Samuel Sokota is a PhD student in the Machine Learning Department at Carnegie Mellon University, supervised by Dr. Zico Kolter. Previously, he completed his master’s at the University of Alberta, where he worked with Dr. Marc Lanctot and Dr. Martha White. Before that, he completed his undergraduate degree at Swarthmore College, where he played lacrosse, studied mathematics and physics, and worked with Bryce Wiedenbeck. His website is https://ssokota.github.io/

 

This talk is organized by Saptarashmi Bandyopadhyay