Can MARL optimize parallel machine scheduling?
Exploring Multi-Agent Reinforcement Learning for Unrelated Parallel Machine Scheduling
This paper explores using Multi-Agent Reinforcement Learning (MARL) to solve complex scheduling problems, specifically the Unrelated Parallel Machine Scheduling problem with setup times and resource constraints. It compares single-agent and multi-agent RL algorithms, finding that while single-agent methods (particularly Maskable PPO) perform well in simpler scenarios, multi-agent approaches (such as MAPPO) show greater promise for scaling to larger, more complex instances. The key takeaway for LLM-based multi-agent systems is the potential of MARL for coordinating multiple agents in environments with dynamic conditions, though effective cooperative learning remains an open challenge requiring further research. In the multi-agent setting, using a centralized critic during training while keeping execution decentralized (the centralized-training, decentralized-execution pattern that MAPPO follows) proved beneficial.
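The two mechanisms highlighted above can be sketched in a few lines: an action mask that prevents the policy from selecting infeasible actions (the idea behind Maskable PPO), and a critic that conditions on all agents' observations during training while each actor acts on local observations alone. This is a minimal illustrative sketch, not the paper's implementation; the network sizes, the linear "networks", and the greedy action selection are all simplifying assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sizes: 3 machine-agents, 4-dim local observations, 5 candidate jobs.
N_AGENTS, OBS_DIM, N_ACTIONS = 3, 4, 5

# Decentralized actors: each agent maps only its LOCAL observation to action logits.
# (Linear layers stand in for the actual policy networks.)
actor_weights = [rng.normal(size=(OBS_DIM, N_ACTIONS)) for _ in range(N_AGENTS)]

# Centralized critic: used only during TRAINING, sees the JOINT observation.
critic_weights = rng.normal(size=(N_AGENTS * OBS_DIM,))

def act(agent_id, local_obs, action_mask):
    """Decentralized execution with action masking.

    `action_mask[j]` is True iff job j is currently feasible for this machine
    (e.g. resources available); infeasible actions get -inf logits and can
    never be chosen, which is the core idea behind Maskable PPO.
    """
    logits = local_obs @ actor_weights[agent_id]
    logits = np.where(action_mask, logits, -np.inf)
    return int(np.argmax(logits))  # greedy for the sketch; PPO would sample

def centralized_value(joint_obs):
    """Training-time value estimate conditioned on ALL agents' observations.

    Only the critic needs this global view; at execution time each actor
    still runs on its own local observation.
    """
    return float(np.concatenate(joint_obs) @ critic_weights)

obs = [rng.normal(size=OBS_DIM) for _ in range(N_AGENTS)]
mask = np.ones(N_ACTIONS, dtype=bool)
mask[0] = False  # pretend job 0 is infeasible for every machine
actions = [act(i, obs[i], mask) for i in range(N_AGENTS)]
value = centralized_value(obs)
```

The design point the paper leans on is that the extra information flows only through the critic, so the learned policies stay deployable with purely local observations.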