Can LLMs collude in perishable goods markets?
Learning Collusion in Episodic, Inventory-Constrained Markets
October 25, 2024
https://arxiv.org/pdf/2410.18871
This paper investigates how AI pricing algorithms can learn to collude in markets with perishable goods, such as airline tickets. The authors define a new way to measure collusion in these episodic markets and show that two popular deep reinforcement learning algorithms (DQN and PPO) learn collusive strategies despite the lack of the long-term punishment mechanisms typical of infinite-horizon settings.
Key takeaways for LLM-based multi-agent systems:
- LLM agents may similarly learn to collude in episodic environments even without explicit long-term punishment, potentially through signaling or memory of past interactions.
- Collusion is harder for agents to learn in more complex scenarios, such as those with inventory constraints.
- Choosing the right metric to measure and detect collusion is crucial, especially when evaluating LLM agents.
- Understanding how hyperparameter choices impact LLM agents' tendency to converge to competitive or collusive strategies is important.
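To make the point about metrics concrete, here is a minimal sketch of one commonly used style of collusion measure: a normalized profit gain that places observed profit between a competitive benchmark and a fully collusive benchmark. This is an illustrative assumption, not necessarily the exact definition used in the paper. The function names, the benchmark values, and the choice to average profit over episodes are all hypothetical.

```python
from statistics import mean


def collusion_index(observed_profit, competitive_profit, collusive_profit):
    """Normalized profit gain (assumed metric, not taken from the paper).

    Returns 0.0 when profit equals the competitive benchmark and 1.0 when it
    equals the fully collusive benchmark. Values outside [0, 1] are possible.
    """
    gap = collusive_profit - competitive_profit
    if gap <= 0:
        raise ValueError("collusive benchmark must exceed competitive benchmark")
    return (observed_profit - competitive_profit) / gap


def episodic_collusion_index(episode_profits, competitive_profit, collusive_profit):
    """Apply the index to the mean per-episode profit (hypothetical aggregation)."""
    return collusion_index(mean(episode_profits), competitive_profit, collusive_profit)


if __name__ == "__main__":
    # Made-up numbers purely for illustration.
    profits = [110.0, 130.0, 120.0]
    print(episodic_collusion_index(profits, 100.0, 140.0))  # 0.5
```

How the benchmarks are computed matters a great deal in an episodic, inventory-constrained market, since both the competitive and the collusive profit depend on the episode length and the starting inventory. Under this sketch, the same observed profit can look more or less collusive depending on those choices.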