How can VLMs improve AMoD dispatching and motion planning?
CoDriveVLM: VLM-Enhanced Urban Cooperative Dispatching and Motion Planning for Future Autonomous Mobility on Demand Systems
This paper introduces CoDriveVLM, a framework for managing fleets of connected autonomous vehicles (CAVs) in an Autonomous Mobility-on-Demand (AMoD) system. It addresses the challenges of dynamic passenger requests, route planning, and collision avoidance in complex urban environments.
CoDriveVLM uses Vision-Language Models (VLMs) to enhance decision-making. The VLM processes bird's-eye-view (BEV) images annotated with vehicle and passenger locations, together with textual descriptions of the scenario. This multi-modal input enables the VLM to assign CAVs to passengers (dispatching) and to assess collision risks. A hybrid system combining VLM-based dispatching with an optimization stage using the Alternating Direction Method of Multipliers (ADMM) allows for efficient, decentralized control of the CAVs, enabling them to navigate complex scenarios while avoiding collisions and minimizing travel times. A memory module storing past VLM interactions supports few-shot prompting and improved performance over time.
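The pipeline described above can be sketched in simplified form. The class and function names below are illustrative, not the paper's actual API: a memory module retrieves past interactions as few-shot examples for the VLM prompt, and a toy nearest-passenger assignment stands in for the VLM dispatching plus ADMM refinement stage.

```python
import math

class DispatchMemory:
    """Stores past (scenario features, dispatch decision) pairs and retrieves
    the k most similar as few-shot examples. Hypothetical sketch, not the
    paper's implementation."""

    def __init__(self, k=2):
        self.k = k
        self.records = []  # list of (feature_vector, decision_text)

    def add(self, features, decision):
        self.records.append((features, decision))

    def retrieve(self, features):
        # Euclidean distance in a toy feature space; the paper's memory
        # module would use richer scenario representations.
        ordered = sorted(self.records, key=lambda r: math.dist(r[0], features))
        return [decision for _, decision in ordered[: self.k]]

def build_prompt(scene_text, few_shot_examples):
    """Assemble the textual half of the multi-modal input; the annotated
    BEV image would accompany this prompt in the real system."""
    examples = "\n".join(f"Example: {e}" for e in few_shot_examples)
    return f"{examples}\nScenario: {scene_text}\nAssign each CAV to a passenger."

def greedy_assign(cavs, passengers):
    """Placeholder for the VLM dispatch + ADMM refinement: greedily assign
    each CAV to its nearest unclaimed passenger."""
    assignment, free = {}, dict(passengers)
    for cav_id, pos in cavs.items():
        if not free:
            break
        pid = min(free, key=lambda p: math.dist(pos, free[p]))
        assignment[cav_id] = pid
        del free[pid]
    return assignment
```

In the full framework, the VLM's assignment would then be passed to the decentralized ADMM solver, which computes collision-free trajectories for each CAV; this sketch only illustrates the dispatching and memory-retrieval flow.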