PhD Proposal: Towards Principled AI-Agents with Decentralized and Asymmetric Information
Xiangyu Liu
[Remote] https://umd.zoom.us/j/7269246363
Monday, June 2, 2025, 12:00-2:00 pm
Abstract

AI models are increasingly deployed as "autonomous agents" for decision-making, with prominent applications including playing Go and video games, robotics, autonomous driving, healthcare, and human assistance. Most such success stories naturally involve multiple AI agents interacting dynamically with each other and with humans. More importantly, these agents often operate with asymmetric information in practice, both across different agents and across the training and testing phases. In this thesis, we aim to lay theoretical foundations for principled AI agents operating under asymmetric and decentralized information.

First, we focus on reinforcement learning (RL) agents in multi-agent environments with partially observable and decentralized information. To circumvent known hardness results and the use of computationally intractable oracles, we advocate leveraging potential information sharing among agents. We first establish several computational complexity results that justify the necessity of both information sharing and an observability assumption. Since planning in the ground-truth model can still be inefficient, we then propose to further approximate the shared common information, constructing an approximate model of the partially observable stochastic game (POSG) in which planning for an approximate equilibrium is quasi-efficient under the aforementioned assumptions. Building on this, we develop a partially observable multi-agent RL algorithm that is both statistically and computationally quasi-efficient.
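
As a rough illustration (ours, not the talk's algorithm), the common-information approach can be pictured as a fictitious coordinator that maps the agents' shared information to per-agent prescriptions; truncating that shared history to a finite window is one hypothetical way such an approximate model could be kept tractable. All names and the window size below are illustrative assumptions.

```python
from collections import deque

WINDOW = 3  # hypothetical finite-memory compression of the common information

class Coordinator:
    """Fictitious coordinator: maps compressed common information to
    per-agent prescriptions (a schematic sketch, not the talk's method)."""

    def __init__(self, n_agents):
        self.n_agents = n_agents
        self.common = deque(maxlen=WINDOW)  # approximate common information

    def observe_shared(self, joint_obs):
        # Agents share their observations (possibly with delay); only the
        # most recent WINDOW steps are kept, approximating the full history.
        self.common.append(tuple(joint_obs))

    def prescriptions(self):
        # Placeholder: a real planner would solve the (approximate)
        # common-information game to produce these prescriptions.
        key = tuple(self.common)
        return [hash((key, i)) % 2 for i in range(self.n_agents)]
```

Because the coordinator's input lives in a finite set (all length-WINDOW shared histories) rather than an ever-growing one, planning over it can avoid the blow-up that comes with full observation histories.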

Second, we focus on RL agents in partially observable Markov decision processes (POMDPs) when privileged information is available during training, a common practice in robot learning and deep RL. We first revisit two major empirical paradigms, expert distillation (a.k.a. teacher-student learning) and the asymmetric actor-critic, and demonstrate their pitfalls in finding near-optimal policies. We then develop a new principled algorithm with both polynomial sample complexity and (quasi-)polynomial computational complexity, revealing the provable benefits of such privileged information.
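
To make the second paradigm concrete, below is a minimal PyTorch sketch of the standard asymmetric actor-critic pattern the talk revisits: the critic is conditioned on the privileged full state (available only at training time), while the actor sees only the agent's observation. The architecture and dimensions are illustrative assumptions, not the proposed algorithm.

```python
import torch.nn as nn

class AsymmetricActorCritic(nn.Module):
    """Schematic asymmetric actor-critic: privileged critic, deployable actor."""

    def __init__(self, obs_dim, state_dim, n_actions, hidden=64):
        super().__init__()
        # Actor: the deployable policy, conditioned on observations only.
        self.actor = nn.Sequential(
            nn.Linear(obs_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, n_actions),
        )
        # Critic: uses the privileged state, so it exists only in training.
        self.critic = nn.Sequential(
            nn.Linear(state_dim, hidden), nn.Tanh(),
            nn.Linear(hidden, 1),
        )

    def forward(self, obs, state):
        logits = self.actor(obs)    # pi(a | observation)
        value = self.critic(state)  # V(state), privileged baseline
        return logits, value
```

The appeal of the pattern is that the privileged critic is discarded at deployment; the pitfall the talk examines is that such a critic need not provide a useful learning signal for the observation-based actor in general POMDPs.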

Finally, we examine large language model (LLM)-powered agents, which use an LLM as the main controller for decision-making, by understanding and enhancing their decision-making capabilities in canonical decentralized and multi-agent scenarios. In particular, we use the metric of regret, commonly studied in online learning and RL, to probe the in-context decision-making limits of LLM agents through controlled experiments. Motivated by the observed pitfalls of existing LLM agents, we also propose a new fine-tuning loss that promotes no-regret behavior, both provably and experimentally.
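
For concreteness, here is a minimal sketch (ours, not the talk's code) of the external-regret metric used in online learning: the agent's cumulative loss is compared against the best fixed action in hindsight, and sublinear growth in the horizon is "no-regret" behavior. Function and variable names are hypothetical.

```python
import numpy as np

def external_regret(losses, actions):
    """Empirical (external) regret of a played action sequence.

    losses  -- T x K array; losses[t, k] is the loss of action k at round t
    actions -- length-T sequence of chosen action indices
    Returns the cumulative loss incurred minus that of the best fixed
    action in hindsight.
    """
    losses = np.asarray(losses)
    incurred = losses[np.arange(len(actions)), list(actions)].sum()
    best_fixed = losses.sum(axis=0).min()
    return incurred - best_fixed

# Illustrative use: score an agent's choices in a repeated decision problem.
rng = np.random.default_rng(0)
loss_table = rng.random((100, 4))        # hypothetical losses, 4 actions
agent_actions = rng.integers(0, 4, 100)  # stand-in for an LLM agent's picks
print(external_regret(loss_table, agent_actions))
```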

Bio

Xiangyu Liu is a fourth-year PhD student in the Department of Computer Science at the University of Maryland, College Park. His research focuses on reinforcement learning and game theory, with particular interests in their applications to large language models (LLMs).

This talk is organized by Migo Gui