Can conventions improve Hanabi MARL performance?
Augmenting the action space with conventions to improve multi-agent cooperation in Hanabi
December 10, 2024
https://arxiv.org/pdf/2412.06333

This research explores improving multi-agent cooperation in the card game Hanabi by incorporating "conventions" – pre-defined, mutually agreed-upon rules – into the agents' action space. These conventions, inspired by human Hanabi strategies, enable implicit communication between agents without direct message passing. The results demonstrate faster learning and improved performance, especially in scenarios with three or more players, as well as better robustness in cross-play (agents cooperating with previously unseen partners).
Key points for LLM-based multi-agent systems:
- Implicit Communication: Conventions offer a way for agents to convey intentions without explicit language, potentially reducing the complexity of communication protocols in LLM-based systems.
- Action Space Augmentation: Adding conventions to the action space offers a higher-level abstraction for decision-making, similar to "options" in reinforcement learning, potentially simplifying the learning process for LLMs.
- Cross-play/Zero-Shot Coordination: The success of convention-based agents in cross-play suggests that convention-equipped LLM agents could cooperate effectively without prior joint training, improving adaptability in dynamic multi-agent environments.
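The action-space augmentation point above can be illustrated with a small sketch: the policy chooses from primitive actions plus convention-level actions, and each convention action expands into a primitive action via a fixed rule, much like an "option" in hierarchical RL. All names here (`resolve`, `conv_signal_playable`, the state keys) are illustrative assumptions, not the paper's API.

```python
# Primitive Hanabi-style actions the environment accepts (simplified).
PRIMITIVE_ACTIONS = ["play_0", "play_1", "discard_0", "discard_1",
                     "hint_colour", "hint_rank"]

def conv_signal_playable(state: dict) -> str:
    """Convention action: hint a colour if the partner holds a playable card,
    otherwise fall back to a safe discard."""
    return "hint_colour" if state.get("partner_has_playable") else "discard_0"

CONVENTION_ACTIONS = {"conv_signal_playable": conv_signal_playable}

# The augmented action space the learning agent selects from.
AUGMENTED_ACTIONS = PRIMITIVE_ACTIONS + list(CONVENTION_ACTIONS)

def resolve(action: str, state: dict) -> str:
    """Map a selected (possibly convention-level) action to a primitive one."""
    if action in CONVENTION_ACTIONS:
        return CONVENTION_ACTIONS[action](state)
    return action

print(resolve("conv_signal_playable", {"partner_has_playable": True}))  # hint_colour
print(resolve("play_0", {}))                                            # play_0
```

The agent still emits only legal primitive actions, but learning over the augmented space lets it pick a whole strategy fragment in one decision, which is the higher-level abstraction the paper credits for faster learning.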