Can evolving Q-learning agents cooperate?
Evolutionary Multi-agent Reinforcement Learning in Group Social Dilemmas
November 19, 2024
https://arxiv.org/pdf/2411.10459

This paper explores how artificial intelligence agents learn to cooperate in group scenarios, specifically using a game called the "public goods game." It combines reinforcement learning (where agents learn through trial and error) with evolutionary principles (where successful strategies are more likely to be replicated). The research investigates how different learning and evolutionary parameters affect the agents' ability to contribute to the common good, addressing the classic "tragedy of the commons" problem.
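To make the setup concrete, here is a minimal sketch of the standard n-player public goods game payoff. The parameter names and values (`endowment`, `multiplier`) are illustrative defaults, not taken from the paper:

```python
def public_goods_payoffs(contributions, endowment=1.0, multiplier=1.6):
    """Standard public goods game: each agent contributes part of its
    endowment to a common pool; the pool is multiplied and split equally.

    Returns one payoff per agent: what it kept, plus its equal share.
    """
    n = len(contributions)
    share = multiplier * sum(contributions) / n
    return [endowment - c + share for c in contributions]
```

The "tragedy of the commons" falls directly out of this payoff: with `1 < multiplier < n`, a free rider always out-earns the contributors in its own group, even though everyone earns more under universal cooperation than under universal defection.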
Key points for LLM-based multi-agent systems:
- Exploration vs. Exploitation: The "temperature" parameter, controlling how much agents explore new strategies versus sticking to known ones, is crucial for cooperation and can itself evolve over time.
- Reward Functions Matter: How the game rewards contributions significantly impacts the agents' learning process and the overall outcome. Fine-tuning reward functions is key for desired behavior.
- Evolutionary Dynamics Enhance Learning: Combining evolution with reinforcement learning allows for more efficient exploration of the strategy space and can lead to more robust cooperative strategies.
- Analytic Tractability: The research seeks analytically tractable models, aiming to simplify the complex interactions of multi-agent systems and provide more predictable design principles for cooperative AI. This relates to the desire for greater control and predictability in LLM-based agent interactions.
- Beyond Simple Games: While the public goods game is a simplification, the research aims to generalize these findings to more complex real-world scenarios involving multiple interacting AI agents.
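The first and third points above can be sketched together: Boltzmann (softmax) action selection governed by a temperature parameter, plus a simple selection step in which the temperature itself is inherited and mutated. This is a hedged illustration of the general mechanism, not the paper's exact algorithm; `mutation_scale` and the copy-the-best selection rule are assumptions:

```python
import math
import random

def softmax_action(q_values, temperature):
    """Boltzmann exploration: high temperature -> near-uniform (exploratory)
    choice; low temperature -> near-greedy exploitation of the best Q-value."""
    m = max(q_values)  # subtract the max for numerical stability
    prefs = [math.exp((q - m) / temperature) for q in q_values]
    total = sum(prefs)
    r = random.random() * total
    cum = 0.0
    for action, p in enumerate(prefs):
        cum += p
        if r < cum:
            return action
    return len(prefs) - 1

def evolve(agents, payoffs, mutation_scale=0.1):
    """Illustrative selection step: the lowest-payoff agent copies the
    highest-payoff agent's Q-values and temperature, with a small mutation
    so that the temperature itself can evolve across generations."""
    best = max(range(len(agents)), key=lambda i: payoffs[i])
    worst = min(range(len(agents)), key=lambda i: payoffs[i])
    q_best, temp_best = agents[best]
    new_temp = max(0.01, temp_best + random.gauss(0.0, mutation_scale))
    agents[worst] = (list(q_best), new_temp)
```

Each agent here is a `(q_values, temperature)` pair; an outer loop would alternate rounds of the public goods game (updating Q-values from payoffs) with calls to `evolve`, so that both the learned strategies and the exploration rate are shaped by selection.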