How can agents in a multi-agent system share and use information efficiently?
M2I2: Learning Efficient Multi-Agent Communication via Masked State Modeling and Intention Inference
This paper introduces M2I2, a framework for improving communication efficiency in multi-agent reinforcement learning (MARL) settings where agents act on partial observations under limited communication bandwidth. M2I2 trains agents with two self-supervised auxiliary tasks: masked state modeling, in which agents reconstruct the global state from partial, masked messages, and joint action prediction, in which agents predict the combined actions of their teammates. Together, these tasks help agents learn compact representations of the environment state and infer teammates' intentions. A key component, the Dimensional Rational Network (DRN), estimates which dimensions of the shared information are most important, enabling selective masking and sharing of data and further reducing communication overhead.
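The core loop described above can be illustrated with a minimal sketch. The importance scorer, the top-k masking rule, and the identity "decoder" below are all stand-in assumptions for illustration, not the paper's actual architecture; in M2I2 the DRN and the reconstruction network are learned jointly with the policy.

```python
import numpy as np

rng = np.random.default_rng(0)

def drn_importance(obs, w):
    # Hypothetical DRN stand-in: per-dimension importance scores
    # from a sigmoid of an elementwise linear map.
    return 1.0 / (1.0 + np.exp(-(obs * w)))

def topk_mask(scores, k):
    # Keep only the k highest-scoring dimensions; zero out the rest.
    mask = np.zeros_like(scores)
    mask[np.argsort(scores)[-k:]] = 1.0
    return mask

# Each agent observes a noisy partial view of a shared global state.
global_state = rng.normal(size=8)
obs = global_state + 0.1 * rng.normal(size=8)
w = rng.normal(size=8)  # hypothetical DRN weights

scores = drn_importance(obs, w)
mask = topk_mask(scores, k=4)      # transmit only 4 of 8 dimensions
message = obs * mask               # masked message: half the bandwidth

# Masked state modeling: a decoder reconstructs the full global state
# from the masked message; training minimizes this reconstruction loss.
reconstruction = message           # identity decoder as a placeholder
loss = float(np.mean((reconstruction - global_state) ** 2))
print(f"reconstruction loss: {loss:.4f}")
```

In a full implementation, the reconstruction loss would be backpropagated both into the decoder and into the DRN scores, so the mask learns to keep the dimensions most useful for recovering the global state.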
For LLM-based multi-agent systems, M2I2 offers a mechanism for more efficient communication and coordination between agents. The masked state modeling and joint action prediction tasks can leverage the strengths of LLMs in understanding and generating complex sequences of information. The DRN could be adapted to assess the importance of different parts of LLM-generated messages or internal representations, optimizing bandwidth usage and focusing agents on the most relevant information for collaborative decision-making. This aligns with the increasing interest in using LLMs as agents within multi-agent systems, where communication efficiency is a significant challenge.
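One way to picture a DRN-style importance filter on LLM-generated messages is a segment-level budgeted compressor. The keyword-count scorer below is a deliberately crude placeholder for a learned scorer, and all names here are illustrative assumptions rather than anything from the paper.

```python
def score_segments(segments, keywords):
    # Hypothetical importance scorer: count task-relevant keywords
    # per segment. A learned DRN-style scorer would replace this.
    return [sum(seg.lower().count(k) for k in keywords) for seg in segments]

def compress_message(message, keywords, budget):
    # Keep only the `budget` highest-scoring sentences, preserving order,
    # so the receiving agent gets the most relevant content first.
    segments = [s.strip() for s in message.split(".") if s.strip()]
    scores = score_segments(segments, keywords)
    ranked = sorted(range(len(segments)), key=lambda i: -scores[i])[:budget]
    return ". ".join(segments[i] for i in sorted(ranked)) + "."

msg = ("I am at the north door. The weather is nice. "
       "Enemy spotted near the objective. We should flank left.")
compressed = compress_message(
    msg, keywords=["enemy", "objective", "flank", "door"], budget=2)
print(compressed)
# Low-relevance chatter ("The weather is nice.") is dropped under the budget.
```

The budget parameter plays the same role as the masking ratio in the state-vector case: it caps bandwidth while the scorer decides what survives.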