Can grouped training improve large-scale MARL?
GTDE: Grouped Training with Decentralized Execution for Multi-agent Actor-Critic
This paper introduces GTDE (Grouped Training with Decentralized Execution), a training paradigm for multi-agent reinforcement learning (MARL) designed to address scalability issues in large-scale systems. Instead of conditioning on information from all agents (as in centralized training) or only each agent's own information (as in decentralized training), GTDE dynamically groups agents based on their observation histories, so each agent learns from local information shared within its group. This adaptive grouping reduces the computational burden that comes with large agent populations. For LLM-based multi-agent systems, GTDE suggests a more scalable training approach for scenarios with many interacting LLMs: it focuses on relevant local interactions rather than requiring global information exchange. Because the grouping is driven by observation history, LLM agents could form temporary coalitions or specialize in sub-tasks as the environment's context evolves.
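To make the grouping idea concrete, here is a minimal sketch of observation-history-based grouping. This is an illustration, not the paper's actual method: the mean-pooled embedding, the greedy cosine-similarity clustering, the `threshold` parameter, and all function names are assumptions standing in for GTDE's learned grouping mechanism.

```python
import numpy as np

def embed_history(history):
    # Toy embedding: mean of the observation vectors.
    # (A stand-in for a learned recurrent encoder of the observation history.)
    return np.mean(history, axis=0)

def group_agents(histories, threshold=0.9):
    """Greedily group agents whose history embeddings are similar.

    Hypothetical stand-in for GTDE's dynamic grouping: agents whose recent
    observations look alike are trained together; dissimilar agents end up
    in separate groups.
    """
    embeddings = [embed_history(h) for h in histories]
    groups = []  # each group is a list of agent indices
    for i, emb in enumerate(embeddings):
        placed = False
        for g in groups:
            rep = embeddings[g[0]]  # first member represents the group
            cos = np.dot(emb, rep) / (
                np.linalg.norm(emb) * np.linalg.norm(rep) + 1e-8
            )
            if cos >= threshold:
                g.append(i)
                placed = True
                break
        if not placed:
            groups.append([i])
    return groups

# Four agents with 5-step histories of 3-dim observations:
# agents 0/1 see one situation, agents 2/3 see the opposite one.
rng = np.random.default_rng(0)
base_a = rng.normal(size=3)
base_b = -base_a
histories = [
    base_a + 0.01 * rng.normal(size=(5, 3)),
    base_a + 0.01 * rng.normal(size=(5, 3)),
    base_b + 0.01 * rng.normal(size=(5, 3)),
    base_b + 0.01 * rng.normal(size=(5, 3)),
]
print(group_agents(histories))  # agents with similar histories share a group
```

Under this sketch, the grouping would be recomputed as histories evolve, so group membership stays adaptive; each group's critic would then condition only on its members' observations rather than on the global state.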