How to optimize communication for multi-agent RL?
DCMAC: Demand-aware Customized Multi-Agent Communication via Upper Bound Training
September 12, 2024
https://arxiv.org/pdf/2409.07127
- This paper introduces DCMAC, a method for improving communication efficiency in multi-agent reinforcement learning, especially in environments with limited communication resources.
- Instead of sharing all observations or predicting teammate actions directly (which can be inaccurate or inefficient), DCMAC focuses on:
  - Parsing teammate "demands": Agents broadcast small messages encoding their needs, allowing others to understand their goals and provide more relevant information.
  - Customized message generation: Agents create tailored messages based on both their own observations and the parsed demands of their teammates.
  - Guidance from an "ideal policy": DCMAC trains a separate policy with full observability and uses it as an upper bound to guide the learning of individual agents, speeding up training.
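The demand-parsing and customized-message steps above can be sketched with toy linear maps. Everything here is an illustrative assumption (dimensions, random weights standing in for learned networks, and the attention-style relevance gating), not the paper's actual architecture:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes (assumptions, not from the paper).
OBS_DIM, DEMAND_DIM, MSG_DIM, N_AGENTS = 8, 4, 6, 3

# Random matrices stand in for learned parameters.
W_demand = rng.normal(size=(OBS_DIM, DEMAND_DIM))            # obs -> demand broadcast
W_key = rng.normal(size=(DEMAND_DIM, DEMAND_DIM))            # demand-relevance scoring
W_msg = rng.normal(size=(OBS_DIM + DEMAND_DIM, MSG_DIM))     # (obs, demand) -> message

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def communicate(observations):
    """One hedged round of demand-aware communication.

    1. Each agent broadcasts a small demand vector parsed from its observation.
    2. Each sender scores how relevant each teammate's demand is to it, then
       generates a customized message per receiver from (own obs, receiver demand),
       scaled by that relevance weight (a stand-in for gating/pruning).
    Returns a dict mapping (sender, receiver) -> MSG_DIM message vector.
    """
    demands = observations @ W_demand  # step 1: small demand broadcasts
    messages = {}
    for i in range(N_AGENTS):
        scores = np.array([demands[i] @ W_key @ demands[j] for j in range(N_AGENTS)])
        weights = softmax(scores)
        for j in range(N_AGENTS):
            if j == i:
                continue
            # Customized message: own observation conditioned on j's demand.
            inp = np.concatenate([observations[i], demands[j]])
            messages[(i, j)] = weights[j] * (inp @ W_msg)
    return messages

obs = rng.normal(size=(N_AGENTS, OBS_DIM))
msgs = communicate(obs)
```

The key idea this sketch captures is that messages are receiver-specific: the same sender produces different content for different teammates, driven by their broadcast demands rather than by flooding everyone with full observations.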
This focus on efficient, demand-driven communication and learning from an ideal policy is highly relevant to LLM-based multi-agent systems, where LLMs can be used to generate nuanced, context-aware messages and learn sophisticated communication protocols.