Talks

Towards Principled Sequential Decision-Making

Qinghua Liu

IRB 4105 or https://umd.zoom.us/j/95853135696?pwd=VVEwMVpxeElXeEw0ckVlSWNOMVhXdz09

Thursday, March 28, 2024, 11:00 am-12:00 pm

Abstract

Sequential decision-making studies how intelligent agents ought to make decisions in a dynamic environment to achieve their objectives. Its diverse applications range from robotics and nuclear plasma control to discovering faster matrix multiplication algorithms and fine-tuning large language models (LLMs). In this talk, I will delve into my research on the theoretical foundations of sequential decision-making.

First, I will talk about reinforcement learning with generic nonlinear function approximation, a widely used approach for solving real-world decision-making problems characterized by enormous state spaces. I will demonstrate that the classical Fitted Q-iteration algorithm (the prototype of DQN), combined with the idea of global optimism, is provably sample-efficient in solving a diverse range of problems. In the second part, I will focus on partially observable decision-making in the framework of partially observable Markov decision processes (POMDPs), a problem that has long been considered intractable within the theory community due to numerous hardness results. Contrary to this belief, I will reveal a rich class of POMDPs that are of practical interest and can be solved with polynomially many samples using a variant of the classical maximum likelihood estimation algorithm. Finally, I will turn to multi-agent decision-making in the framework of Markov games, where agents must learn to strategically cooperate or compete. I will introduce a fully decentralized algorithm capable of learning equilibrium strategies with nearly minimax-optimal sample efficiency.

Bio

Qinghua Liu is a sixth-year Ph.D. student in the Department of Electrical and Computer Engineering at Princeton University, advised by Chi Jin. He works on the theoretical foundations of sequential decision-making. His research has developed simple and generic algorithms that effectively tackle three fundamental challenges in decision-making: large state spaces, partial observability and multi-agency, all while providing reliability guarantees. His research has been recognized with the Princeton SEAS award and a best-paper award at the ICLR 2022 MARL workshop.

This talk is organized by Samuel Malede Zewdu.