Can LLMs break social rules in a hierarchy?
I Want to Break Free! Anti-Social Behavior and Persuasion Ability of LLMs in Multi-Agent Settings with Social Hierarchy
This paper investigates whether toxic and abusive behavior can emerge in multi-agent LLM systems, particularly in settings with a power imbalance, such as a simulated prison with guard and prisoner agents. It also examines how well LLM agents persuade one another and how factors such as the assigned persona and goal shape these interactions.
The key findings are: a) explicit personalities are not required for toxicity to arise, since role assignment alone can trigger it; b) certain LLMs struggle to maintain consistent personas in multi-turn interactions; c) persuasion success depends on the difficulty of the goal and on the assigned personas; and d) the guard's persona strongly influences the system's overall toxicity, whereas the prisoner's persona has minimal impact.
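To make the experimental factors concrete, here is a minimal sketch of how such a condition grid might be enumerated. It is not the authors' code. The persona labels, goal labels, prompt wording, and the choice to attach the goal to the prisoner are all hypothetical placeholders, since the summary above does not specify them. A persona of `None` stands for the role-only case from finding (a).

```python
from itertools import product
from typing import Optional

# Hypothetical placeholder labels: the summary does not name the actual
# personas or goals used in the paper.
GUARD_PERSONAS = [None, "persona_A", "persona_B"]
PRISONER_PERSONAS = [None, "persona_A", "persona_B"]
GOALS = ["easier_goal", "harder_goal"]


def system_prompt(role: str, persona: Optional[str], goal: Optional[str] = None) -> str:
    """Build an illustrative system prompt; the wording is an assumption."""
    parts = [f"You are a {role} in a simulated prison."]
    if persona is not None:
        # Persona is optional: None means the agent only receives its role.
        parts.append(f"Your personality: {persona}.")
    if goal is not None:
        parts.append(f"Your goal: {goal}.")
    return " ".join(parts)


def build_conditions() -> list:
    """Enumerate every guard persona x prisoner persona x goal combination."""
    conditions = []
    for guard_p, prisoner_p, goal in product(GUARD_PERSONAS, PRISONER_PERSONAS, GOALS):
        conditions.append(
            {
                "guard_persona": guard_p,
                "prisoner_persona": prisoner_p,
                "goal": goal,
                "guard_prompt": system_prompt("guard", guard_p),
                # Assumption: the prisoner is the agent pursuing the goal.
                "prisoner_prompt": system_prompt("prisoner", prisoner_p, goal),
            }
        )
    return conditions


conditions = build_conditions()

# Role-only conditions: neither agent is given an explicit personality.
role_only = [
    c for c in conditions
    if c["guard_persona"] is None and c["prisoner_persona"] is None
]
```

With these placeholder lists the grid has 3 x 3 x 2 = 18 conditions, two of which are role-only. Comparing toxicity across rows that differ only in `guard_persona` or only in `prisoner_persona` is the kind of contrast that finding (d) describes.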