Large Population Teams: Control, Equilibria & Learning


2024 CDC Full-Day Workshop

December 15, 2024

About this workshop

This workshop brings together researchers from various disciplines (engineering, mathematics, and data science) working on the theory and applications of decentralized systems with a large number of units or agents, under a variety of system dynamics, information structures, performance criteria, and application areas. A common thread is understanding optimality and equilibrium behaviour, scaling behaviour with the number of agents, and learning dynamics, both in the associated mathematical theory and in the context of emerging engineering and applied-science applications. Because such problems are interdisciplinary, spanning optimal control, stochastic control, game theory, multi-agent systems, and robotics, we intend to establish connections between the formulations adopted in the community and to bring together researchers who have been using alternative setups, solution approaches, and applications, allowing for an exchange of ideas and the formulation of new research directions and collaborations. Another main goal of this workshop is to inspire a future generation of researchers in this vibrant field.

Invited Speakers


Tamer Başar

UIUC


Karthik Elamvazhuthi

Los Alamos National Lab


Mathieu Laurière

NYU Shanghai


Aditya Mahajan

McGill University


Nuno Martins

University of Maryland


Lacra Pavel

University of Toronto


Vijay Subramanian

University of Michigan


Panagiotis Tsiotras

Georgia Tech


Serdar Yüksel

Queen's University

Workshop schedule



All times in CET (UTC+01)
8:30a Opening Remarks
8:40a Invited Session 1
8:40-9:20a
Tamer Başar (UIUC) - Large Population Games with Hybrid Modes of Behavior

Abstract: Decision making in dynamic uncertain environments with multiple agents arises in many disciplines and application domains, including control, communications, distributed optimization, social networks, and economics. A natural and comprehensive framework for modeling, optimization, and analysis is that of stochastic dynamic games (SDGs), which accommodates different solution concepts depending on how the interactions among the agents are modeled: whether they are in a cooperative mode (with the same objective functions, as in teams), in a noncooperative mode (with different objective functions), or in a mix of the two, such as teams of agents interacting noncooperatively across teams and cooperatively within each team. Strategic interactions are also shaped by the asymmetric nature of the information different agents acquire (and share only partially or selectively with each other, even within teams), all of which leads to notions of equilibria tailored to the game at hand. This talk will first discuss the challenges that arise in addressing SDGs in such generality, and then identify a special class of SDGs that are manageable in the sense of admitting derivation of exact or approximate equilibria. One such subclass is characterized by a finite (relatively small) number of competing teams, with each team having a large population of (possibly indistinguishable) agents with possibly imperfect state measurements. The main focus of this part of the talk will be on linear-quadratic structures.
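
As a rough illustration of the linear-quadratic team-against-team setting (in our notation; the specific model of the talk may differ), agent i in team k might evolve as

x_i^k(t+1) = A_k x_i^k(t) + B_k u_i^k(t) + w_i^k(t), \qquad \bar{x}^k(t) = \frac{1}{N_k} \sum_{i=1}^{N_k} x_i^k(t),

with each team minimizing a quadratic cost coupled to the other teams only through the mean fields, e.g.

J^k = \mathbb{E}\left[ \sum_{t=0}^{T} \sum_{i=1}^{N_k} \left( \| x_i^k(t) - \Gamma_k \bar{x}^{-k}(t) \|_{Q_k}^2 + \| u_i^k(t) \|_{R_k}^2 \right) \right],

where \bar{x}^{-k} collects the mean fields of the opposing teams and A_k, B_k, Q_k, R_k, \Gamma_k are placeholder matrices.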

9:20-10:00a
Karthik Elamvazhuthi (LANL) - Mean-Field Stabilization of Control-Affine Systems to Probability Densities

Abstract: In this talk, we will explore the set of configurations that can be stabilized in the control of large robot swarms with limited sensing. We analyze the stationary distributions of degenerate diffusion processes associated with control-affine systems, which leads to studying the stability of the governing partial differential equations. First, we consider the case without mean-field terms. We then show how including mean-field interactions can expand the class of stabilizable distributions. Importantly, the mean-field limit allows us to achieve global stability of target configurations, unlike in finite-particle treatments. These results provide insights into the fundamental capabilities and limits of decentralized control for emergent collective behaviors in large-scale robotic systems.
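
For orientation (a generic sketch in our notation, not necessarily the model of the talk), a control-affine diffusion and the PDE governing its density take the form

dX_t = \Big( f(X_t) + \sum_{j=1}^{m} g_j(X_t)\, u_j(X_t, \rho_t) \Big) dt + \sqrt{2\sigma}\, G(X_t)\, dW_t,

\partial_t \rho = -\nabla \cdot \Big( \big( f + \textstyle\sum_j g_j u_j \big) \rho \Big) + \sigma \sum_{j,l} \partial^2_{x_j x_l} \big( (G G^\top)_{jl}\, \rho \big),

where the diffusion is degenerate when G G^\top is rank-deficient. Stabilizing a swarm to a target density \rho^* then amounts to choosing the feedback u (which may depend on \rho_t itself, the mean-field term) so that \rho^* is a globally attractive stationary solution of this PDE.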

10:00-10:40a
Mathieu Laurière (NYU Shanghai) - Deep Reinforcement Learning for Mean-Field Type Games

Abstract: Mean field games have been introduced to study Nash equilibria in games with a very large number of players. The standard theory focuses on a homogeneous population of non-cooperative players. Several extensions have been proposed, such as mean-field type games, which study Nash equilibria between competitive coalitions of mean-field type: the players are cooperative within each group but compete with other groups. The limit when the number of coalitions tends to infinity has been studied under the name of mean field control games. In this talk, we present some (deep) reinforcement learning methods for such games with a finite or infinite number of coalitions.
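
To fix ideas, here is a minimal tabular sketch (entirely our construction, with placeholder dynamics and rewards) of the fixed-point iteration that reinforcement learning methods for mean-field problems typically approximate: alternately best-respond to a frozen mean field and update the mean field under the resulting policy. Deep variants replace the tables below with neural networks.

import numpy as np

# Minimal sketch (placeholder dynamics/rewards, not the talk's method):
# alternate (1) a representative agent's best response to a frozen mean
# field via Q-value iteration, and (2) a fictitious-play style update of
# the mean field under the resulting policy.

S, A, gamma = 5, 2, 0.9
rng = np.random.default_rng(0)
P = rng.dirichlet(np.ones(S), size=(S, A))   # transition kernel P[s, a, :]

def reward(s, a, mu):
    return -mu[s] - 0.1 * a                  # congestion-averse placeholder

mu = np.full(S, 1.0 / S)                     # initial mean field
for k in range(1, 51):
    Q = np.zeros((S, A))                     # (1) best response to mu
    for _ in range(200):
        V = Q.max(axis=1)
        Q = np.array([[reward(s, a, mu) + gamma * P[s, a] @ V
                       for a in range(A)] for s in range(S)])
    pi = Q.argmax(axis=1)
    P_pi = np.array([P[s, pi[s]] for s in range(S)])
    mu_pi = np.full(S, 1.0 / S)              # (2) distribution under pi
    for _ in range(200):
        mu_pi = mu_pi @ P_pi
    mu = ((k - 1) * mu + mu_pi) / k          # fictitious-play averaging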

10:40-11:00a Break
11:00a Invited Session 2
11:00-11:40a
Aditya Mahajan (McGill) - Mean-Field Games Among Teams

Abstract: In this talk, we present a model of a game among teams. Each team consists of a homogeneous population of agents. Agents within a team are cooperative, while teams compete with one another. The dynamics and the costs are coupled through the empirical distribution (or the mean field) of the state of agents in each team, which is assumed to be observed by all agents. Agents have asymmetric information (also called a non-classical information structure). We propose a mean-field-based refinement of the Team-Nash equilibrium of the game, which we call mean-field Markov perfect equilibrium (MF-MPE). We identify a dynamic programming decomposition that characterizes MF-MPE. We then consider the case where each team has a large number of players and present a mean-field approximation that approximates the game among large-population teams as a game among infinite-population teams. We show that the MF-MPE of the game among infinite-population teams is easier to compute and is an ε-approximate MF-MPE of the game among finite-population teams.
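
Schematically (in our notation, which may differ from the talk's), the dynamic-programming decomposition operates on the joint mean field z_t = (z_t^1, \dots, z_t^K): at each stage, team k chooses a prescription \gamma^k_t mapping an agent's local state to (randomized) actions, and the value functions satisfy the coupled recursion

V^k_t(z) = \min_{\gamma^k} \mathbb{E}\Big[ c^k\big(z, \gamma^1, \dots, \gamma^K\big) + V^k_{t+1}(z') \,\Big|\, z_t = z \Big],

where z' is the mean field induced by all teams' prescriptions; an MF-MPE is a profile of prescriptions at which each team's choice is a best response in this measure-valued game.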

11:40-12:20p
Nuno Martins (UMD) - Incentive Design and Learning in Large Populations: A System-Theoretic Approach with Applications to Epidemic Mitigation

Abstract: This talk will introduce a systematic method for designing incentive mechanisms aimed at influencing large groups of strategic agents (both synthetic and human) towards desirable behaviors. Utilizing concepts from population games, the agents’ reactions to incentives and environmental factors are modeled using simple learning rules. Their behaviors, which evolve over time, are represented through a deterministic evolutionary model, creating a nonlinear feedback system with the incentive mechanisms. The analysis employs control-systems tools and accommodates interconnection with additional subsystems, such as an epidemic influenced by the agents’ strategies. A new two-step design approach based on system-theoretic passivity is proposed for creating incentive mechanisms: the first step selects an optimal equilibrium subject to cost considerations, and the second designs a control mechanism that stabilizes this equilibrium. The approach is notable for its generality and robustness, functioning effectively even when agents are unaware of the incentive mechanisms and the precise learning rules are unknown.
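
In broad strokes (a schematic in our symbols), the closed loop couples an evolutionary model of the population state x, a payoff/incentive mechanism, and an auxiliary dynamic such as an epidemic state q:

\dot{x} = \mathcal{V}(x, p), \qquad p = F(x, q) + r(x, q), \qquad \dot{q} = h(q, x),

where \mathcal{V} is the evolutionary dynamic induced by the agents' learning rules (e.g., Smith or replicator dynamics), F is the intrinsic payoff, and r is the designed incentive; passivity properties of the individual blocks then certify stability of the equilibrium selected in the first design step.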

12:20-1:50p Lunch
1:50p Invited Session 3
1:50-2:30p
Lacra Pavel (U Toronto) - Higher-order Learning in Multi-Agent Games

Abstract: We consider the effect that higher-order learning can have in the context of mirror descent (MD), a progenitor of many decision processes. While MD converges to the Nash equilibrium (NE) of many games, possibly under limited or uncertain feedback, many applications are characterized by conditions under which MD fails to equilibrate, namely, games with non-strict monotonicity or non-strict variationally stable states. We discuss the design of a set of theoretically guided methodologies that overcome these convergence barriers, in particular through higher-order variants designed through a passivity lens. We then consider discrete-time algorithms extracted from these, which provide convergence guarantees in two challenging setups: the semi-bandit case, where an agent’s feedback is corrupted by martingale-difference noise, and the full-bandit case, where each agent only receives a payoff signal that indicates its current performance. Illustrative examples are drawn from several recent applications arising in game-theoretic machine learning, such as the generation of adversarial attacks and generative adversarial networks.
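
As a concrete first-order baseline (our own minimal example, not code from the talk), consider entropic mirror descent, i.e., multiplicative weights, in a two-player zero-sum matrix game. In this non-strictly-monotone game the iterates cycle around the Nash equilibrium rather than converging, which is precisely the failure mode that higher-order, passivity-based designs target.

import numpy as np

# Minimal sketch: entropic mirror descent (multiplicative weights) in a
# two-player zero-sum matrix game. The payoff matrix is a placeholder
# (rock-paper-scissors); its interior NE is the uniform strategy.

A = np.array([[ 0.,  1., -1.],
              [-1.,  0.,  1.],
              [ 1., -1.,  0.]])
eta = 0.05                                   # step size

x = np.ones(3) / 3                           # player 1 mixed strategy
y = np.ones(3) / 3                           # player 2 mixed strategy
for t in range(2000):
    gx, gy = A @ y, -A.T @ x                 # payoff gradients
    x *= np.exp(eta * gx); x /= x.sum()      # mirror step (entropy regularizer)
    y *= np.exp(eta * gy); y /= y.sum()
# The last iterate (x, y) orbits the uniform NE instead of settling on it.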

2:30-3:10p
Vijay Subramanian (UMich) - Dynamic Games Among Teams with Delayed Intra-Team Information Sharing

Abstract: We analyze a class of stochastic dynamic games among teams with asymmetric information, where members of a team share their observations internally with a delay of d. Each team is associated with a controlled Markov chain, whose dynamics are coupled through the players’ actions. These games exhibit challenges in both theory and practice due to the presence of signaling and the increasing domain of information over time. We develop a general approach to characterize a subset of Nash equilibria where the agents can use a compressed version of their information, instead of the full information, to choose their actions. We identify two subclasses of strategies: Sufficient Private Information Based (SPIB) strategies, which only compress private information, and Compressed Information Based (CIB) strategies, which compress both common and private information. We show that while SPIB-strategy-based equilibria always exist, the same is not true for CIB-strategy-based equilibria. We develop a backward inductive sequential procedure whose solution (if it exists) provides a CIB-strategy-based equilibrium. We identify some instances where we can guarantee the existence of a solution to the above procedure. Our results highlight the tension among compression of information, the existence of (compression-based) equilibria, and backward inductive sequential computation of such equilibria in stochastic dynamic games with asymmetric information.
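
Schematically (in our notation), a CIB strategy replaces the growing information sets with compressed statistics: a common belief \pi_t maintained by all players through a fixed update rule, and a private statistic s^i_t for player i, with actions generated as

u^i_t = g^i_t(\pi_t, s^i_t), \qquad \pi_{t+1} = \psi_t(\pi_t, \gamma_t, z_{t+1}),

where z_{t+1} is the new common observation and \gamma_t the profile of prescriptions; the backward-inductive procedure then searches for prescriptions that are mutual best responses consistent with the update \psi_t.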

3:10-3:30p Break
3:30p Invited Session 4
3:30-4:10p
Panagiotis Tsiotras (Georgia Tech) - Zero-Sum Games Between Large-Population Teams under Mean-Field Sharing

Abstract: In this talk, we will investigate the behaviors of two large-population teams competing in a discrete environment. The team-level interactions are modeled as a zero-sum game, while the dynamics within each team are formulated as a collaborative mean-field team problem. Following the mean-field literature, we first approximate the large-population team game with its infinite-population limit. By introducing two fictitious coordinators, we transform the infinite-population game into an equivalent zero-sum coordinator game. We study the optimal strategies for each team in the infinite-population limit via a novel reachability analysis, and we show that the obtained team strategies are decentralized and ε-optimal for the original finite-population game. We will provide extensions to heterogeneous team scenarios. Finally, we discuss the ramifications of this result for agent training in large-population multi-agent reinforcement learning (MARL) problems.
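
At the coordinator level (a schematic in our notation), the competition becomes a zero-sum game over the two teams' mean fields \mu_t and \nu_t, with value

J^*(\mu_0, \nu_0) = \min_{\alpha} \max_{\beta}\; \mathbb{E}\Big[ \sum_{t} c(\mu_t, \nu_t, \alpha_t, \beta_t) \Big],

where \alpha_t and \beta_t are the coordinators' prescriptions (maps from an agent's local state to randomized actions) and the mean fields evolve under these prescriptions; decentralized ε-optimal strategies for the finite-population game are then obtained by having each agent execute its team's prescription.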

4:10-4:50p
Serdar Yüksel (Queen's University) - Equivalence between Controlling a Large Population and a Representative Agent: Optimality, Learning, and Games (among Teams)

Abstract: We study stochastic exchangeable teams comprising a finite number of decision-makers (DMs), as well as their mean-field limits, under centralized, mean-field-sharing, or fully decentralized information structures. (i) For finite-population exchangeable teams, we establish the existence of a randomized optimal policy that is exchangeable (permutation invariant). This optimal policy is obtained via value iteration for an equivalent measure-valued controlled Markov decision problem (MDP) under mean-field sharing. (ii) We show that a sequence of exchangeable optimal policies for the finite-population setting converges to an optimal policy for the infinite-population problem that is conditionally symmetric (identical), independent, and decentralized. This result also establishes the optimality of the limiting measure-valued MDP for the representative DM, and solves, to our knowledge, an open question on the optimality of symmetric policies, which is often a priori assumed in the mean-field control literature.
Three implications will be discussed: (i) near optimality via quantized approximations, which facilitates computation, as well as near optimality for large finite numbers of agents; (ii) implications for the convergence of decentralized Q-learning with local information, where it will be shown that decentralized learners with local information are guaranteed to converge to some equilibrium, which can be an objective one, but also a subjective one in the absence of uniqueness of equilibria; and (iii) an application to games among a finite number of large teams. For the latter, we again show that a Nash equilibrium exhibits exchangeability in the finite decision-maker regime and symmetry in the infinite one. We endow the set of randomized policies with a suitable topology under various decentralized information structures; this yields the convexity and compactness of the sets of randomized policies needed for the existence of a randomized Nash equilibrium that is exchangeable among decision makers within each team. As the number of decision makers within each team goes to infinity (that is, for the mean-field game among teams), we use a de Finetti representation theorem to show the existence of a randomized Nash equilibrium that is symmetric (i.e., identical) among decision makers within each team and also independently randomized. We thus show that common randomness is not necessary for large team-against-team games, unlike the case with small-sized teams. The zero-sum game will also be discussed under more relaxed conditions.
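
To make the measure-valued MDP concrete, here is a toy sketch (entirely our construction, with placeholder dynamics and costs): with a binary individual state, the mean field reduces to the fraction z in [0, 1] of agents in state 1; quantizing [0, 1] and running value iteration on the grid illustrates both the measure-valued reformulation and the quantized approximation mentioned in (i).

import numpy as np

# Toy sketch (our construction, placeholder dynamics/costs): value
# iteration for a measure-valued MDP. With a binary individual state the
# mean field is the fraction z of agents in state 1; we quantize [0, 1]
# and iterate the Bellman operator on the grid.

N = 50                                # grid resolution
grid = np.linspace(0.0, 1.0, N + 1)
actions = np.linspace(0.0, 1.0, 11)   # symmetric randomized policies
beta = 0.95                           # discount factor

def step(z, a):
    return 0.8 * z + 0.2 * a          # population drifts toward a

def cost(z, a):
    return (z - 0.3) ** 2 + 0.1 * a ** 2   # track target, penalize effort

V = np.zeros(N + 1)
for _ in range(1000):                 # value iteration
    V_new = np.array([
        min(cost(z, a) + beta * V[int(round(step(z, a) * N))]
            for a in actions)
        for z in grid
    ])
    if np.max(np.abs(V_new - V)) < 1e-9:
        V = V_new
        break
    V = V_new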

4:50-5:15p Rapid-Interactive Session

Decentralized Supervisory Control for the Cooperation of Heterogeneous Field Robots (Authors: Dayeon Yang, Jaewoong Kim, Teahoon Seo and Chanyoung Ju)

Positive Correlation, Counterclockwise Dissipativity, and Convergence (Authors: Matthew S. Hankins, Jair Certório, Nuno C. Martins)

Learning and Approximations for Multi-Agent Team Problems with Borel Spaces under Decentralized Information (Authors: Omar Mrani-Zentar and Serdar Yüksel)

5:15-5:45p Open Discussion and Closing Remarks

Contributed Short Papers

Short paper submissions for presentations at the workshop are invited, in particular from junior participants such as postdoctoral fellows, graduate students, and assistant professors. Each submission should consist of a summary paper (of 2 to 4 pages, including references) and a list of authors (all copied in the email submission), so that the technical content can be reviewed by independent referees.

The submission deadline is October 18th.

Submissions should be emailed to yuksel@queensu.ca with subject line: CDC’24 Workshop Poster Submission.

Decisions will be sent out by October 23rd.

Organizers


Aditya Mahajan

Professor

Department of Electrical and Computer Engineering
McGill University


Panagiotis Tsiotras

Professor

School of Aerospace Engineering
Georgia Institute of Technology


Serdar Yüksel

Professor

Department of Mathematics and Statistics
Queen's University

Contact

If you have any questions regarding the workshop, please contact the Workshop Organizers.