Can planning boost MARL sample efficiency?
Combining Planning and Reinforcement Learning for Solving Relational Multiagent Domains
This paper introduces MaRePReL, a framework that combines relational and hierarchical planning with reinforcement learning so that multiple agents can solve complex tasks in environments with varying numbers of objects and relations. It addresses core challenges in multi-agent reinforcement learning (MARL), such as the exponential growth of the joint state/action space and non-stationarity, by using a relational planner as a centralized controller that decomposes the overall task and assigns sub-tasks to individual agents. An abstraction mechanism then projects the state onto the information relevant to each sub-task, so that each deep RL agent can learn efficiently. Experiments show that MaRePReL improves sample efficiency over standard MARL baselines, facilitates transfer learning, and generalizes to varying numbers of objects.
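The control flow described above (plan, assign, abstract, learn) can be sketched in a few lines. This is a minimal illustration, not the paper's actual implementation: the class names (`RelationalPlanner`, `SubTask`), the environment interface (`env.step(agent_id, action)`), and the agent interface (`act`/`update`) are all assumptions made for the example.

```python
# Illustrative sketch of the planner-as-centralized-controller loop.
# All names and interfaces here are placeholders, not the paper's API.
from dataclasses import dataclass


@dataclass
class SubTask:
    agent_id: str
    operator: str          # e.g. "pickup"
    args: tuple = ()       # e.g. ("passenger1",)


class RelationalPlanner:
    """Centralized controller: decomposes a relational goal into sub-tasks."""

    def decompose(self, goal, state):
        # A real planner would run relational/hierarchical planning over the
        # symbolic state; here we just return a fixed illustrative plan.
        return [
            SubTask("agent_0", "pickup", ("p1",)),
            SubTask("agent_0", "dropoff", ("p1",)),
            SubTask("agent_1", "pickup", ("p2",)),
        ]


def abstract_state(global_state: dict, subtask: SubTask) -> dict:
    """Project the global state onto the objects relevant to this sub-task."""
    keep = set(subtask.args) | {subtask.agent_id}
    return {k: v for k, v in global_state.items() if k in keep}


def run_episode(env, planner, agents, goal):
    """One episode: plan once, then let each assigned agent solve its sub-task."""
    state = env.reset()
    for subtask in planner.decompose(goal, state):
        agent = agents[subtask.agent_id]
        done = False
        while not done:
            obs = agent_obs = abstract_state(state, subtask)
            action = agent.act(obs)                      # per-agent deep RL policy
            state, reward, done, _ = env.step(subtask.agent_id, action)
            agent.update(agent_obs, action, reward,
                         abstract_state(state, subtask), done)
```

The key design point is that each agent only ever sees the abstracted observation for its current sub-task, which keeps the learning problem small and independent of the total number of objects in the environment.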
Key points for LLM-based multi-agent systems:
- Centralized planning and task decomposition: A hierarchical planner can leverage the reasoning capabilities of LLMs to decompose complex goals into manageable sub-tasks (see the sketch after this list).
- Abstraction for efficient RL: LLMs could be used to automate the state abstraction process, dynamically identifying the relevant information for each agent's sub-task.
- Relational representation: LLMs can readily handle relational data and reason about relationships between objects, making them well-suited for representing complex environments for multi-agent systems.
- Potential for improved generalization and transfer: Combining LLM-based planning with RL could lead to more robust and adaptable multi-agent systems.
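As a concrete illustration of the first point, an LLM could stand in for the relational planner by decomposing a goal into per-agent sub-tasks. The sketch below is speculative and not from the paper; `call_llm` is a placeholder for whatever chat-completion client is used, and the prompt and JSON schema are assumptions.

```python
# Hedged sketch: using an LLM as the centralized task decomposer.
import json
from typing import Callable

DECOMPOSE_PROMPT = """You are a centralized task planner for {n_agents} agents.
Goal: {goal}
Known objects and relations: {facts}
Return a JSON list of sub-tasks, each of the form
{{"agent": "<agent id>", "operator": "<action>", "args": ["<object>", ...]}}."""


def llm_decompose(goal: str, facts: list[str], n_agents: int,
                  call_llm: Callable[[str], str]) -> list[dict]:
    """Ask the LLM to play the role of the relational planner."""
    prompt = DECOMPOSE_PROMPT.format(
        goal=goal, facts="; ".join(facts), n_agents=n_agents)
    reply = call_llm(prompt)
    subtasks = json.loads(reply)             # expect a JSON list of sub-task dicts
    # Minimal validation before handing sub-tasks to the RL agents.
    return [t for t in subtasks if {"agent", "operator", "args"} <= t.keys()]
```

The resulting sub-task dictionaries could then be fed into a loop like the one sketched earlier, with the RL agents handling low-level control while the LLM handles symbolic decomposition.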