How can MARL be scaled efficiently to many agents?
PERFORMANT, MEMORY EFFICIENT AND SCALABLE MULTI-AGENT REINFORCEMENT LEARNING
This research introduces Sable, a new algorithm for training multi-agent AI systems that excels in performance, memory efficiency, and scalability. It achieves this with a "retention" mechanism, similar to attention in transformers but more memory-efficient, which lets it process the long sequences of actions and observations that are crucial in partially observable environments.
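To illustrate why retention is more memory-efficient than attention, here is a minimal sketch in NumPy. It is not Sable's actual implementation (which adds details such as multiple heads and normalization); it assumes a single head and a scalar decay `gamma`. The key point is that the recurrent form carries only a fixed-size d-by-d state across timesteps instead of a seq_len-by-seq_len attention matrix, while computing the same outputs as the parallel form.

```python
import numpy as np

def retention_recurrent(q, k, v, gamma):
    """Constant-memory form: carry a (d, d) state S across timesteps.

    q, k, v: (seq_len, d) arrays; gamma: decay factor in (0, 1).
    Update: S_t = gamma * S_{t-1} + outer(k_t, v_t); output o_t = q_t @ S_t.
    Memory is O(d^2) regardless of sequence length.
    """
    seq_len, d = q.shape
    S = np.zeros((d, d))
    out = np.zeros((seq_len, d))
    for t in range(seq_len):
        S = gamma * S + np.outer(k[t], v[t])
        out[t] = q[t] @ S
    return out

def retention_parallel(q, k, v, gamma):
    """Attention-like parallel form, equivalent to the recurrence:

    o_t = sum_{s <= t} gamma^(t - s) * (q_t . k_s) * v_s
    This builds a (seq_len, seq_len) decay-weighted score matrix,
    so memory grows quadratically with sequence length.
    """
    seq_len = q.shape[0]
    idx = np.arange(seq_len)
    # Lower-triangular decay mask: gamma^(t - s) for s <= t, else 0.
    decay = np.tril(gamma ** (idx[:, None] - idx[None, :]))
    return (decay * (q @ k.T)) @ v
```

Both forms produce identical outputs; the recurrent one is what makes long-horizon, many-agent training memory-efficient.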
Sable is relevant to LLM-based multi-agent systems because it efficiently handles thousands of agents, learns from entire episodes of interaction, and outperforms existing state-of-the-art methods, including transformer-based ones, while keeping memory usage comparable to simpler approaches. This opens up complex LLM-based multi-agent applications that were previously hindered by computational constraints.