How can LLMs learn to solve multi-agent problems?
Grounded Answers for Multi-agent Decision-making Problem through Generative World Model
October 4, 2024
https://arxiv.org/pdf/2410.02664
This research paper introduces Learning before Interaction (LBI), a system that combines a language-guided simulator with multi-agent reinforcement learning (MARL). The simulator learns both the dynamics of the environment and the reward function based on images and text descriptions, allowing it to generate grounded and explainable solutions for complex multi-agent decision-making problems.
This approach is relevant to LLM-based multi-agent systems because it demonstrates:
- The ability to ground LLM outputs in simulated environments, enhancing their reasoning capabilities and generating realistic interaction sequences.
- The use of a world model comprising separate dynamics and reward models, facilitating adaptability to new tasks by changing reward functions without retraining the dynamics model.
- The potential to overcome a limitation of current LLMs in multi-agent decision-making: without trial-and-error experience, they often produce sketchy or misleading answers.
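
To make the second point concrete, here is a minimal sketch of a world model whose dynamics and reward components are kept separate, so the reward can be swapped while the dynamics are reused unchanged. This is not the paper's implementation: in LBI both components are learned, whereas here they are hand-written toy functions. All names (`WorldModel`, `toy_dynamics`, `spread_reward`, `gather_reward`) are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, replace
from typing import Callable, List, Tuple

State = Tuple[int, ...]        # toy joint state: one position per agent
JointAction = Tuple[int, ...]  # toy joint action: one step (-1, 0, +1) per agent


@dataclass(frozen=True)
class WorldModel:
    """Toy world model with independent dynamics and reward components."""
    dynamics: Callable[[State, JointAction], State]
    reward: Callable[[State, JointAction, State], float]

    def step(self, state: State, action: JointAction) -> Tuple[State, float]:
        next_state = self.dynamics(state, action)
        return next_state, self.reward(state, action, next_state)

    def rollout(self, state: State, actions: List[JointAction]) -> Tuple[State, float]:
        """Simulate a sequence of joint actions and sum the rewards."""
        total = 0.0
        for action in actions:
            state, r = self.step(state, action)
            total += r
        return state, total


def toy_dynamics(state: State, action: JointAction) -> State:
    # Stand-in for a learned dynamics model: each agent moves by its own action.
    return tuple(s + a for s, a in zip(state, action))


def spread_reward(state: State, action: JointAction, next_state: State) -> float:
    # Task A (assumed): agents are rewarded for spreading apart.
    return float(max(next_state) - min(next_state))


def gather_reward(state: State, action: JointAction, next_state: State) -> float:
    # Task B (assumed): agents are rewarded for staying close together.
    return -float(max(next_state) - min(next_state))


spread_model = WorldModel(dynamics=toy_dynamics, reward=spread_reward)
# New task: replace only the reward; the dynamics component is reused as-is.
gather_model = replace(spread_model, reward=gather_reward)

plan = [(-1, 1), (-1, 1)]
final_spread, spread_return = spread_model.rollout((0, 0), plan)
final_gather, gather_return = gather_model.rollout((0, 0), plan)
```

Both models reach the same final state under the same plan because they share the dynamics; only the return differs, which is what lets a MARL policy be retrained for a new task against the same simulator.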