How can I scale communication in multi-agent RL systems with many agents?
Exponential Topology-Enabled Scalable Communication in Multi-Agent Reinforcement Learning
This paper introduces ExpoComm, a communication method for multi-agent reinforcement learning (MARL) designed to scale to large numbers of agents. It uses a fixed "exponential" communication topology, borrowed from graph theory, in which any message can propagate to all agents in a logarithmic number of hops while each agent sends only a small number of messages per step (sketched below). It pairs this topology with memory-based message processing (RNNs or attention), so agents accumulate information across hops, and with auxiliary tasks (global-state prediction or contrastive learning) that ground messages in globally useful information.
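To make the topology concrete, here is a minimal NumPy sketch of a one-peer exponential graph of the kind the paper builds on: at step t, agent i sends to agent (i + 2^(t mod ceil(log2 N))) mod N, so information from any agent reaches all N agents within ceil(log2 N) steps even though each agent sends just one message per step. The function names and the demo are illustrative, not the authors' code.

```python
import math
import numpy as np

def one_peer_targets(n_agents: int, step: int) -> np.ndarray:
    """For each agent i, the single peer it messages at this step:
    (i + 2^(step mod ceil(log2 n))) mod n."""
    rounds = max(1, math.ceil(math.log2(n_agents)))
    offset = 2 ** (step % rounds)
    return (np.arange(n_agents) + offset) % n_agents

# Demo: track which agents' information each agent holds. With one
# outgoing message per agent per step and a memory-based merge of what
# is received, full coverage is reached in ceil(log2 n) steps.
n = 16
knows = np.eye(n, dtype=bool)      # knows[i, j]: agent i holds agent j's info
for t in range(math.ceil(math.log2(n))):
    targets = one_peer_targets(n, t)
    inbox = np.zeros_like(knows)
    inbox[targets] = knows         # route each agent's knowledge to its peer
    knows |= inbox                 # merge received info into memory
print(knows.all())                 # True: all 16 agents informed in 4 steps
```

The offsets 1, 2, 4, 8, ... double each step, which is what gives the logarithmic spreading time; the per-step cost stays at one message per agent regardless of N.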
For LLM-based multi-agent systems, ExpoComm suggests a scalable communication pattern. Its focus on fast information spread and grounded messages matches the core challenges of coordinating many LLM agents. The memory-based message processors could be adapted to the sequential nature of LLM communication, and the auxiliary tasks offer a blueprint for training agents to communicate usefully even without direct supervision on message content. Finally, the fixed topology simplifies deployment compared with learned communication structures, which can become computationally expensive as the number of LLM agents grows.
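As a reference point for the memory-based processing and auxiliary grounding discussed above, here is a minimal PyTorch-style sketch: a GRU memory folds each incoming message into a persistent state that doubles as the outgoing message, and an auxiliary head decodes that memory into a global-state estimate. The module and method names are hypothetical, and the MSE form of the auxiliary loss is an assumption (the paper also considers a contrastive variant).

```python
import torch
import torch.nn as nn

class MessageMemory(nn.Module):
    """Sketch of memory-based message processing with an auxiliary
    global-state-prediction head (names are illustrative)."""

    def __init__(self, obs_dim: int, msg_dim: int, state_dim: int):
        super().__init__()
        self.cell = nn.GRUCell(obs_dim + msg_dim, msg_dim)
        # Auxiliary head: decode the memory into a global-state estimate,
        # training messages to carry globally useful information.
        self.state_head = nn.Linear(msg_dim, state_dim)

    def forward(self, obs, incoming_msg, memory):
        # Fold the local observation and the received message into memory;
        # the updated memory is also the message sent along the topology.
        memory = self.cell(torch.cat([obs, incoming_msg], dim=-1), memory)
        return memory

    def aux_loss(self, memory, global_state):
        # Global-state prediction as an auxiliary objective (MSE assumed
        # here), giving the messages supervision without labeling content.
        return nn.functional.mse_loss(self.state_head(memory), global_state)
```

Because the memory persists across steps, information gathered over several exponential-topology hops accumulates, which is what lets a single small message per step stand in for broadcast-style communication.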