Can LLMs learn better dispatching rules from big data?
MULTI-AGENT DECISION TRANSFORMERS FOR DYNAMIC DISPATCHING IN MATERIAL HANDLING SYSTEMS LEVERAGING ENTERPRISE BIG DATA
November 6, 2024
https://arxiv.org/pdf/2411.02584

This paper explores using multiple independent Decision Transformers (a type of offline reinforcement learning model) as decentralized dispatching agents in a simulated material handling system. The goal is to improve system throughput by learning from existing data generated by simpler heuristics.
Key LLM/Multi-Agent points:
- Decentralized Approach: Each Decision Transformer acts as an independent agent controlling a specific part of the system, forming a decentralized multi-agent setup.
- Offline Learning: The agents learn from pre-collected data, avoiding the complexities and risks of online training.
- Data Dependency: The quality and characteristics of the training data (e.g., randomness, performance level of original heuristics) significantly impact the agents' effectiveness. Deterministic, high-performing data yields the best results.
- Sequence Modeling: Decision Transformers treat the dispatching problem as a sequence modeling task, leveraging the transformer architecture's ability to handle sequential data.
- Challenges: The research highlights the difficulty of consistently achieving desired target throughputs and the need for high-quality data, even when that data comes from deterministic heuristics. State stochasticity appears less problematic than action stochasticity for these vanilla Decision Transformers.
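To make the sequence-modeling framing concrete, the sketch below shows the core Decision Transformer inference loop: the agent is conditioned on a target return, and the return-to-go is decremented by each observed reward as the episode unfolds. Everything here is illustrative and assumed, not from the paper: `toy_policy` is a deterministic stand-in for the transformer, and the queue-clearing environment is a made-up dispatching toy.

```python
# Hedged sketch of Decision Transformer-style inference for a dispatching agent.
# All names (toy_policy, env_step, the queue environment) are illustrative
# assumptions; the paper's actual model and simulator differ.

from dataclasses import dataclass, field


@dataclass
class Trajectory:
    # Interleaved (return-to-go, state, action) context fed to the sequence model.
    returns_to_go: list = field(default_factory=list)
    states: list = field(default_factory=list)
    actions: list = field(default_factory=list)


def toy_policy(traj):
    """Stand-in for the transformer: pick a dispatch action from the context.

    A real Decision Transformer attends over the whole (R, s, a) sequence;
    here we simply dispatch to the station with the longest queue.
    """
    queue_lengths = traj.states[-1]  # e.g. jobs waiting at each station
    return max(range(len(queue_lengths)), key=lambda i: queue_lengths[i])


def run_episode(env_step, initial_state, target_return, horizon):
    """Condition on a target return; shrink the return-to-go as reward accrues."""
    traj = Trajectory()
    state, rtg = initial_state, target_return
    total_reward = 0.0
    for _ in range(horizon):
        traj.returns_to_go.append(rtg)
        traj.states.append(state)
        action = toy_policy(traj)
        traj.actions.append(action)
        state, reward = env_step(state, action)
        rtg -= reward  # key DT mechanic: return-to-go decreases by observed reward
        total_reward += reward
    return total_reward


def env_step(state, action):
    """Toy environment: dispatching to a station clears its queue, earning its length."""
    reward = state[action]
    next_state = list(state)
    next_state[action] = 0
    # One new job arrives at every station after each dispatch.
    next_state = [q + 1 for q in next_state]
    return next_state, reward
```

In the offline setting the paper describes, a model like `toy_policy` would instead be trained on logged heuristic trajectories; at deployment, asking for a higher target return than the heuristics achieved is what makes consistent throughput targets hard to hit.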