How robust are LLM-based recommender agents to memory attacks?
Get the Agents Drunk: Memory Perturbations in Autonomous Agent-based Recommender Systems
This paper probes the robustness of LLM-based multi-agent recommender systems (Agent4RS) by perturbing agent memory. It proposes DrunkAgent, a novel black-box attack framework composed of generation, strategy, and surrogate modules. DrunkAgent crafts adversarial text triggers appended to target item descriptions; these triggers corrupt agent memories, blocking legitimate memory updates and steering agents toward promoting the attacker's items. Key points for LLM-based multi-agent systems: the memory mechanism itself is exploited as the attack surface; the attack assumes a black-box setting (the attacker has limited knowledge of the victim system); and the triggers are designed for transferability (effective across different Agent4RS) and imperceptibility (hard to detect). A minimal sketch of this attack loop follows.
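To make the three-module structure concrete, here is a minimal, illustrative sketch of the attack loop, assuming a local surrogate scorer stands in for the victim agents. Everything below is hypothetical scaffolding: `surrogate_score`, the candidate trigger pool, and the greedy search are stand-ins for the paper's generation, strategy, and surrogate modules, not the authors' actual implementation.

```python
import random

def surrogate_score(item_description: str) -> float:
    """Surrogate module (sketch): stand-in for a local model that estimates
    how strongly an Agent4RS memory would favor an item with this description.
    Here: a toy heuristic so the sketch runs end-to-end."""
    persuasive = ["must-have", "top rated", "always recommend"]
    hits = sum(item_description.lower().count(w) for w in persuasive)
    return hits + random.random() * 0.1  # small noise to break ties

def generate_candidates(n: int = 8) -> list[str]:
    """Generation module (sketch): propose candidate text triggers.
    A real attack would optimize these rather than sample fixed fragments."""
    fragments = ["must-have", "top rated by everyone", "always recommend this first"]
    return [" ".join(random.sample(fragments, k=2)) for _ in range(n)]

def craft_trigger(item_description: str, rounds: int = 5) -> str:
    """Strategy module (sketch): greedily keep the candidate trigger that
    maximizes the surrogate's promotion score for the target item."""
    best_trigger, best = "", surrogate_score(item_description)
    for _ in range(rounds):
        for cand in generate_candidates():
            score = surrogate_score(f"{item_description} {cand}")
            if score > best:
                best, best_trigger = score, cand
    return best_trigger

if __name__ == "__main__":
    target = "A basic stainless-steel water bottle."
    trigger = craft_trigger(target)
    # The perturbed description is what the attacker injects; the trigger is
    # meant to skew agent memories toward recommending this item.
    print(f"poisoned description: {target} {trigger}")
```

In the actual black-box setting described by the paper, the surrogate would be trained or chosen to transfer to unseen Agent4RS, and the trigger search would also optimize for imperceptibility rather than the raw score used in this toy loop.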