Diversifying AI: Towards Creative Chess with AlphaZero
Tuesday, April 9, 2024, 12:00-1:00 pm

Abstract

In recent years, Artificial Intelligence (AI) systems have surpassed human intelligence in a variety of computational tasks. However, AI systems, like humans, make mistakes, have blind spots, hallucinate, and struggle to generalize to new situations. This work explores whether AI can benefit from creative decision-making mechanisms when pushed to the limits of its computational rationality. In particular, we investigate whether a team of diverse AI systems can outperform a single AI in challenging tasks by generating more ideas as a group and then selecting the best ones. We study this question in the game of chess, the so-called Drosophila of AI. We build on AlphaZero (AZ) and extend it to represent a league of agents via a latent-conditioned architecture, which we call AZ_db. We train AZ_db to generate a wider range of ideas using behavioral diversity techniques and select the most promising ones with sub-additive planning. Our experiments suggest that AZ_db plays chess in diverse ways, solves more puzzles as a group, and outperforms a more homogeneous team. Notably, AZ_db solves twice as many challenging puzzles as AZ, including the Penrose positions. When playing chess from different openings, we find that players in AZ_db specialize in different openings, and that selecting a player for each opening using sub-additive planning results in a 50 Elo improvement over AZ. Our findings suggest that diversity bonuses emerge in teams of AI agents, just as they do in teams of humans, and that diversity is a valuable asset in solving computationally hard problems.
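
As a rough illustration of the sub-additive planning idea mentioned in the abstract, the Python sketch below picks, for each position or opening, the latent-conditioned player whose own value estimate is highest, so the team's value of a position is the maximum over its members. All names and the toy value functions are hypothetical and are not taken from the talk or the AZ_db implementation.

# Minimal sketch of the sub-additive planning selection rule over a league
# of latent-conditioned players. Names and toy values are hypothetical,
# for illustration only; they are not from AZ_db.

from dataclasses import dataclass
from typing import Callable, List

@dataclass
class LatentPlayer:
    """One league member: the shared network conditioned on a fixed latent code."""
    name: str
    value_fn: Callable[[str], float]  # position/opening label -> value estimate

def team_value(players: List[LatentPlayer], position: str) -> float:
    """Sub-additive team value: the best value any single member assigns."""
    return max(p.value_fn(position) for p in players)

def pick_player(players: List[LatentPlayer], position: str) -> LatentPlayer:
    """Choose the member that should play from this position (e.g. one per opening)."""
    return max(players, key=lambda p: p.value_fn(position))

if __name__ == "__main__":
    # Toy value functions standing in for members with different playing styles.
    league = [
        LatentPlayer("z0", lambda pos: 0.10),
        LatentPlayer("z1", lambda pos: 0.35 if "Sicilian" in pos else 0.05),
        LatentPlayer("z2", lambda pos: 0.20),
    ]
    opening = "Sicilian Defence"
    best = pick_player(league, opening)
    print(f"chosen player: {best.name}, team value: {team_value(league, opening):.2f}")

This sketch only shows the selection rule; in AZ_db itself the members are realized by conditioning a single network on different latent codes, and their evaluations come from search rather than fixed toy functions.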

Bio

Tom is a senior research scientist at Google DeepMind, in the discovery team led by Satinder Singh. Before that, he was a Ph.D. student at the Technion. Tom is interested in building artificial intelligence systems that make decisions and learn from them. His focus is on the reinforcement learning paradigm, where he has studied scalability, structure discovery, hierarchy, abstraction, and exploration. He was one of the first researchers to work on RL for Minecraft and language games. More recently, his research has focused on learning to reinforcement learn (meta-RL), unsupervised RL, convex MDPs, diversity, and bounded rationality. More details on Tom's work can be found on his web page: https://tomzahavy.wixsite.com/zahavy.


Note: Please register using the Google Form on our website https://go.umd.edu/marl for access to the Google Meet link, the Open-source Multi-Agent AI Research Community, and talk resources.


This talk is organized by Saptarashmi Bandyopadhyay