Can my multi-agent system handle adversarial delays?
Provably Stable Multi-Agent Routing with Bounded-Delay Adversaries in the Decision Loop
April 2, 2025
https://arxiv.org/pdf/2504.00863
This paper studies the robustness of multi-agent routing systems, like ride-sharing apps or warehouse robots, when some agents are adversarial (malfunctioning or malicious) and cause delays. It identifies how many adversarial agents a system can tolerate before becoming unstable (requests piling up indefinitely) and derives a method to calculate how many additional cooperative agents are needed to restore stability when facing a certain percentage of adversarial agents.
Key points for LLM-based multi-agent systems:
- Real-world applicability: The bounded-delay adversary model is a plausible fit for failure modes of LLM agents. Hallucinations or excessively long responses can be interpreted as delays, since in both cases a request takes longer than expected to be served correctly.
- Stability as a key metric: The focus on system stability matters for LLM-based applications, because an unstable system, in which pending requests pile up indefinitely, becomes unusable.
- Fleet sizing for robustness: The paper provides a way to estimate the necessary "fleet size" (number of LLM agents) to keep an application functional even when some of those agents malfunction, which is directly relevant to deploying robust real-world LLM applications.
- Centralized vs. decentralized control: While the paper uses a centralized control system, its insights on adversary impact and fleet sizing can inform the design of more decentralized, LLM-driven multi-agent systems.
- Potential for LLM-driven mitigation: Future research directions suggested in the paper, such as adversary detection and adaptive mitigation strategies, could leverage LLMs themselves to improve system robustness. LLMs could be used to monitor other LLMs, identify unusual behavior indicative of adversarial actions, or dynamically adjust resource allocation to compensate for delays.
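To make the stability and fleet-sizing points concrete, here is a minimal sketch of a throughput check. It is not the paper's derivation: it assumes a simple capacity model in which each cooperative agent serves one request every `service_time` seconds, each adversarial agent takes `service_time + max_delay` seconds, and the system counts as stable only when total capacity strictly exceeds the arrival rate. All function and parameter names are hypothetical.

```python
def capacity(n_coop: int, n_adv: int, service_time: float, max_delay: float) -> float:
    """Requests per second the fleet can serve under the assumed model.

    Assumption: adversarial agents still complete requests, but each one
    takes the full bounded delay on top of the normal service time.
    """
    return n_coop / service_time + n_adv / (service_time + max_delay)


def is_stable(arrival_rate: float, n_coop: int, n_adv: int,
              service_time: float, max_delay: float) -> bool:
    """Stable (queue does not grow without bound) if capacity exceeds demand."""
    return capacity(n_coop, n_adv, service_time, max_delay) > arrival_rate


def extra_cooperative_agents(arrival_rate: float, n_coop: int, n_adv: int,
                             service_time: float, max_delay: float,
                             limit: int = 1_000_000) -> int:
    """Smallest number of additional cooperative agents that restores stability."""
    for extra in range(limit + 1):
        if is_stable(arrival_rate, n_coop + extra, n_adv, service_time, max_delay):
            return extra
    raise ValueError("no stable fleet size found within the search limit")


if __name__ == "__main__":
    # 12 agents, 10 requests/s, 1 s per request: stable when all cooperate.
    print(is_stable(10.0, 12, 0, 1.0, 3.0))
    # 4 of the 12 become adversarial and add up to 3 s of delay each.
    print(is_stable(10.0, 8, 4, 1.0, 3.0))
    print(extra_cooperative_agents(10.0, 8, 4, 1.0, 3.0))
```

In this toy setting, a fleet that is comfortably stable with 12 cooperative agents loses stability once a third of them become slow, and the last call reports how many cooperative agents would need to be added to recover. A real deployment would need measured service times and the paper's actual stability conditions rather than this simplified capacity argument.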