Can expected return symmetries improve multi-agent coordination?
EXPECTED RETURN SYMMETRIES
This paper introduces "expected return symmetries," a broader class of symmetries in multi-agent systems than those considered in prior work. These symmetries map policies to other policies that achieve the same expected return in self-play. By training agents to be compatible under these symmetries, the authors achieve better zero-shot coordination, particularly in settings where methods based on hand-specified symmetries struggle. They also propose algorithms that learn these symmetries directly from agent-environment interaction, without prior knowledge of the environment's structure.
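To make the idea concrete, below is a minimal, hypothetical sketch (not the authors' implementation). It uses a toy cooperative "lever" game: a brute-force search over action relabelings that preserve self-play expected return stands in for the paper's learned symmetry-discovery procedure, and policies are then scored by how well they coordinate with symmetry-transformed copies of themselves, a rough proxy for zero-shot coordination. The payoff matrix, function names, and scoring are all illustrative assumptions.

```python
"""Hypothetical sketch: discover expected return symmetries (as action
relabelings) in a toy cooperative game, then score policies by how well
they coordinate with symmetry-transformed copies of themselves."""
import itertools
import numpy as np

# Toy 2-player common-payoff game: reward only when both pick the same lever.
# Levers 0 and 1 pay 1.0 and are interchangeable; lever 2 pays 0.9 and is unique.
PAYOFF = np.diag([1.0, 1.0, 0.9])
N_ACTIONS = PAYOFF.shape[0]


def expected_return(policy_a, policy_b):
    """Expected common payoff for two independent mixed policies."""
    return float(policy_a @ PAYOFF @ policy_b)


def apply_symmetry(policy, perm):
    """Relabel actions: probability mass on action i moves to perm[i]."""
    out = np.empty_like(policy)
    out[list(perm)] = policy
    return out


def find_expected_return_symmetries(sample_policies, tol=1e-8):
    """Brute-force stand-in for learned symmetry discovery: keep every action
    permutation that preserves self-play expected return on sampled policies."""
    symmetries = []
    for perm in itertools.permutations(range(N_ACTIONS)):
        preserved = all(
            abs(expected_return(apply_symmetry(p, perm), apply_symmetry(p, perm))
                - expected_return(p, p)) < tol
            for p in sample_policies
        )
        if preserved:
            symmetries.append(perm)
    return symmetries


def symmetry_cross_play(policy, symmetries):
    """Average return when paired with symmetry-transformed copies of itself:
    a proxy for zero-shot coordination with independently trained partners."""
    return float(np.mean([
        expected_return(policy, apply_symmetry(policy, perm))
        for perm in symmetries
    ]))


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    samples = rng.dirichlet(np.ones(N_ACTIONS), size=64)  # random mixed policies
    syms = find_expected_return_symmetries(samples)
    print("discovered symmetries:", syms)  # expect identity and the (0, 1) swap

    # Committing to lever 0 is optimal in self-play but brittle across symmetries;
    # committing to the unique lever 2 trades a little return for compatibility.
    for name, pi in [("lever 0", np.array([1.0, 0.0, 0.0])),
                     ("lever 2", np.array([0.0, 0.0, 1.0]))]:
        print(f"{name}: self-play={expected_return(pi, pi):.2f}, "
              f"cross-play under symmetries={symmetry_cross_play(pi, syms):.2f}")
```

In this toy setting, committing to one of the two interchangeable levers maximizes self-play return but scores poorly against a partner that committed to the other, while committing to the unique lever sacrifices a little return for compatibility. Training for compatibility under discovered symmetries is meant to push agents toward conventions of the latter kind.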
For LLM-based multi-agent systems, this research suggests a new way to improve coordination by focusing on expected return symmetries. It is particularly relevant when deploying LLMs in complex environments whose symmetries are unknown or hard to specify, since the symmetries can be learned rather than pre-defined. Because compatibility is optimized in terms of expected returns rather than hand-specified environment structure, the approach could yield multi-agent LLM systems that coordinate more robustly with partners they were not trained alongside.