How to train MARL for a dynamic number of agents?
FLICKERFUSION: INTRA-TRAJECTORY DOMAIN GENERALIZING MULTI-AGENT RL
This research addresses a failure mode of multi-agent reinforcement learning (MARL): trained policies often fail to generalize when the number of agents or obstacles changes during operation (inference). The proposed solution, "FlickerFusion," stochastically drops entities from the agents' observations during training, which prepares the model for entity compositions it has not seen before when they arise at inference.
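The idea of stochastically dropping entities from an observation can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `drop_entities`, the fixed per-entity `drop_prob`, and the convention that index 0 holds the observing agent's own entry (and is never dropped) are all assumptions made for illustration.

```python
import random
from typing import List, Sequence, Tuple

# Hypothetical representation: each entity (agent or obstacle) is a feature tuple.
Entity = Tuple[float, ...]


def drop_entities(
    entities: Sequence[Entity],
    drop_prob: float,
    rng: random.Random,
    protected: Sequence[int] = (0,),
) -> List[Entity]:
    """Return a copy of the observation with entities randomly removed.

    Each entity is dropped independently with probability `drop_prob`.
    Indices listed in `protected` are always kept (assumed here to be
    the observing agent's own entry).
    """
    if not 0.0 <= drop_prob <= 1.0:
        raise ValueError("drop_prob must be in [0, 1]")
    kept = []
    for idx, entity in enumerate(entities):
        # rng.random() is in [0, 1), so drop_prob=0 keeps everything
        # and drop_prob=1 keeps only the protected entries.
        if idx in protected or rng.random() >= drop_prob:
            kept.append(entity)
    return kept


# Usage: resample the dropped set at every training step, so the policy
# sees a different entity composition each time.
rng = random.Random(0)
observation = [(float(i), float(i)) for i in range(6)]
for step in range(3):
    flickered = drop_entities(observation, drop_prob=0.3, rng=rng)
    print(step, len(flickered))
```

Because the dropped subset is resampled at each step, the policy is trained on observations of varying size, which is the property the summary attributes to the method. A policy consuming these observations would need to accept a variable number of entities, for example through padding and masking.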
FlickerFusion is relevant to LLM-based multi-agent systems because it offers a new, orthogonal way to inject inductive bias into MARL systems: it manipulates the input rather than relying on complex attention mechanisms or modified network architectures. It demonstrates that, for domain generalization in multi-agent scenarios, input manipulation can be more effective than adding model parameters. The same approach could improve the robustness and adaptability of LLM-based agents in dynamic, real-world environments.