Can cross-environment training enable zero-shot multi-agent cooperation?
Cross-environment Cooperation Enables Zero-shot Multi-agent Coordination
This paper explores how to train AI agents that cooperate effectively with new partners in unfamiliar environments, a problem known as Zero-Shot Coordination (ZSC). It introduces Cross-Environment Cooperation (CEC), in which agents train via self-play across many diverse, procedurally generated environments, rather than with a diverse population of partners in a single environment (as in Population-Based Training, PBT).
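The CEC training loop can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `TinyGrid`, `self_play_episode`, and `cec_train` are hypothetical stand-ins for a toy cooperative task, and no actual policy learning is shown.

```python
import random

class TinyGrid:
    """Toy cooperative task: both agents must pick the same goal cell."""
    def __init__(self, seed):
        rng = random.Random(seed)
        self.size = rng.randint(2, 5)        # procedural variation: layout size
        self.goal = rng.randrange(self.size) # procedural variation: goal location

    def step(self, a1, a2):
        # Shared reward: 1.0 only if both agents coordinate on the goal cell.
        return 1.0 if a1 == a2 == self.goal else 0.0

def self_play_episode(policy, env):
    # Self-play: the SAME policy controls both agents.
    a1 = policy(env, agent_id=0)
    a2 = policy(env, agent_id=1)
    return env.step(a1, a2)

def cec_train(policy, num_envs=1000):
    # CEC: each episode samples a freshly generated environment (new seed),
    # rather than a new partner in one fixed environment (as PBT would).
    total = 0.0
    for seed in range(num_envs):
        total += self_play_episode(policy, TinyGrid(seed))
    return total / num_envs  # average cooperative return across environments

# A "general" policy that reads the task structure instead of memorizing
# one layout: it always heads for the current environment's goal.
goal_policy = lambda env, agent_id: env.goal

print(cec_train(goal_policy))  # → 1.0 for the structure-aware policy
```

The key design point is in `cec_train`: diversity comes from resampling the environment every episode, which is what pushes the policy toward task structure rather than layout memorization.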
Key points for LLM-based multi-agent systems:
- CEC promotes learning general cooperative "norms" rather than overfitting to specific partners or environments; exposing the agent to many variations of a task encourages a broader understanding of its structure.
- CEC agents generalize better to new tasks and partners (including humans), even outperforming specialized models in human-AI collaboration evaluations.
- Procedurally generated environments make training scalable and efficient, which matters for applying the approach to large language models.
- The same procedural-generation approach can be applied to LLM-based scenarios to evaluate and strengthen cooperative skills in simulation before real-world deployment.
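The evaluation protocol implied by the last two points, zero-shot cross-play with an unseen partner on held-out environments, can be sketched as follows. The task generator and both policies are hypothetical stand-ins; the point is the protocol, not the policies.

```python
import random

def make_task(seed):
    """Procedurally generated task: agree on one of `size` options."""
    rng = random.Random(seed)
    size = rng.randint(2, 5)
    return size, rng.randrange(size)

def cross_play_score(agent, partner, held_out_seeds):
    # Zero-shot cross-play: pair the trained agent with a partner it never
    # trained with, on environment seeds it never saw during training.
    hits = 0
    for seed in held_out_seeds:
        size, goal = make_task(seed)
        # Shared reward: success only when both players pick the goal.
        if agent(size, goal) == partner(size, goal) == goal:
            hits += 1
    return hits / len(held_out_seeds)

# Illustrative policies: a structure-aware agent and a novel partner that
# happens to follow the same convention (so cross-play succeeds).
general_agent = lambda size, goal: goal
novel_partner = lambda size, goal: goal

held_out = range(10_000, 10_100)  # seeds disjoint from training
print(cross_play_score(general_agent, novel_partner, held_out))  # → 1.0
```

The same harness could wrap LLM-based agents: replace the lambdas with calls to two different models and score how often they coordinate on held-out generated tasks before any real-world deployment.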