Can LLMs improve MARL without constant calls?
YOLO-MARL: YOU ONLY LLM ONCE FOR MULTI-AGENT REINFORCEMENT LEARNING
This paper proposes YOLO-MARL (You Only LLM Once for Multi-Agent Reinforcement Learning), a framework that uses the planning capabilities of large language models (LLMs) to improve multi-agent reinforcement learning (MARL) without querying the LLM during training. The key idea is to call the LLM a single time, before training begins, to generate a planning function that maps the environment state to a high-level task for each agent. This function is then incorporated into the MARL algorithm, which grants agents additional reward for adhering to their assigned tasks. Because the generated function replaces per-step LLM queries, the approach avoids the computational overhead and communication instability of frequent LLM calls while improving the coordination and overall performance of the multi-agent system.
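To make the mechanism concrete, here is a minimal Python sketch of the core idea. The state fields, task names, action-to-task mapping, and bonus weight are illustrative assumptions, not details taken from the paper; the point is only to show an LLM-generated planning function being reused locally for reward shaping.

```python
# Hypothetical example of an LLM-generated planning function. In YOLO-MARL the
# LLM is queried once, before training, to produce code like this; afterward it
# runs locally with no further LLM calls. The task names and state fields here
# are invented for illustration.
def planning_function(state):
    """Map the global state to a high-level task for each agent."""
    tasks = {}
    for agent_id, agent_state in state["agents"].items():
        # Assumed state layout: each agent has a boolean "near_target" flag.
        tasks[agent_id] = "capture" if agent_state["near_target"] else "explore"
    return tasks


def shaped_reward(env_reward, state, actions, action_to_task, bonus=0.1):
    """Add a small bonus when an agent's action matches its assigned task.

    `action_to_task` (mapping actions to task labels) and the `bonus` weight
    are assumptions of this sketch, not values from the paper.
    """
    tasks = planning_function(state)  # cheap local call, no LLM in the loop
    rewards = {}
    for agent_id, action in actions.items():
        match = action_to_task.get(action) == tasks[agent_id]
        rewards[agent_id] = env_reward[agent_id] + (bonus if match else 0.0)
    return rewards
```

In this sketch, the shaped reward would be computed at every environment step inside the usual MARL training loop, so the one-time LLM call amortizes to zero marginal cost during training.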