How can LLMs generate diverse human-like agents for better cooperation?
Learning to Cooperate with Humans using Generative Agents
This paper introduces GAMMA (Generative Agent Modeling for Multi-agent Adaptation), a new method for training AI agents that can cooperate effectively with humans in tasks requiring coordination. Instead of training agents solely on human data (which is limited) or solely against other AI agents (which can lead to non-human-like strategies), GAMMA trains a generative model to learn a diverse range of partner strategies from both human and simulated data. This generative model then creates a variety of synthetic partners to train a more robust and adaptable "Cooperator" agent. Experiments in a cooperative cooking game show that GAMMA-trained agents perform significantly better with real human partners than agents trained with existing methods.
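To make the pipeline concrete, here is a minimal sketch (in PyTorch) of the kind of latent-variable partner model and synthetic-partner sampling described above. The class and function names, network sizes, and loss weighting are illustrative assumptions, not the paper's implementation, and the Cooperator's RL training loop against the sampled partners is omitted.

```python
# Minimal sketch of a GAMMA-style pipeline, assuming a VAE-like latent-variable
# model over partner trajectories. Architecture, dimensions, and loss weighting
# are illustrative; the Cooperator's RL update is omitted.
import torch
import torch.nn as nn
import torch.nn.functional as F

OBS_DIM, ACT_DIM, LATENT_DIM = 16, 6, 8  # toy sizes, not from the paper

class PartnerVAE(nn.Module):
    """Encodes a partner trajectory into a latent strategy z; decodes
    (observation, z) into action logits, i.e. a z-conditioned partner policy."""
    def __init__(self):
        super().__init__()
        self.encoder = nn.GRU(OBS_DIM + ACT_DIM, 32, batch_first=True)
        self.to_mu = nn.Linear(32, LATENT_DIM)
        self.to_logvar = nn.Linear(32, LATENT_DIM)
        self.decoder = nn.Sequential(
            nn.Linear(OBS_DIM + LATENT_DIM, 64), nn.ReLU(),
            nn.Linear(64, ACT_DIM),
        )

    def encode(self, traj):            # traj: (B, T, OBS_DIM + ACT_DIM)
        _, h = self.encoder(traj)
        return self.to_mu(h[-1]), self.to_logvar(h[-1])

    def decode(self, obs, z):          # obs: (B, OBS_DIM), z: (B, LATENT_DIM)
        return self.decoder(torch.cat([obs, z], dim=-1))

def vae_loss(model, traj, obs, actions, kl_weight=1e-3):
    """Reconstruct the partner's action at a step from (obs, z), plus a KL term
    keeping the latent space near the prior. Trained on pooled human and
    simulated trajectories."""
    mu, logvar = model.encode(traj)
    z = mu + torch.randn_like(mu) * (0.5 * logvar).exp()   # reparameterization
    recon = F.cross_entropy(model.decode(obs, z), actions)
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp())
    return recon + kl_weight * kl

def sample_synthetic_partner(model):
    """Draw a latent strategy from the prior and return a greedy partner policy;
    the Cooperator would then be trained with RL against such partners."""
    z = torch.randn(1, LATENT_DIM)
    return lambda obs: model.decode(obs, z).argmax(dim=-1)
```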
Key points relevant to LLM-based multi-agent systems:
- Generative Models for Partner Diversity: The core idea of using a generative model to create a wide range of partner behaviors is directly applicable to LLM agents. LLMs could be trained to generate diverse dialogue or action sequences for other agents to train against, enabling robustness to different interaction styles.
- Human-Adaptive Sampling: The paper's technique of biasing the generative model towards human data is directly relevant to LLM-based systems. Even with limited human interaction data, it can steer generation towards more human-like partners for training (see the sketch after this list).
- Focus on Adaptability: GAMMA's emphasis on training adaptable agents is highly relevant to LLM applications. The ability to infer a partner's "latent variable" (representing their strategy or style) and adapt accordingly is key to building effective multi-agent LLM systems (see the sketch after this list). It also suggests a role for techniques like reinforcement learning from human feedback (RLHF) in multi-agent LLM training.
- Challenges of Data Diversity: The paper highlights the "garbage in, garbage out" problem: if the training data is not sufficiently diverse, even a powerful generative model might produce limited or non-representative partners. This is a crucial consideration for LLM-based agents, where the diversity and quality of training data are paramount.
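The sketch below, reusing the PartnerVAE from the earlier block, illustrates two of the points above in code: biasing partner sampling toward latents encoded from human trajectories (human-adaptive sampling) and inferring a partner's latent online so the Cooperator can adapt to it. The mixing probability and noise scale are illustrative assumptions, not the paper's exact procedure.

```python
# Illustrative companions to the bullets above; reuses PartnerVAE from the
# earlier sketch. Mixing probability and noise scale are assumptions.
import torch

LATENT_DIM = 8  # matches the earlier sketch

def sample_human_biased_latent(model, human_trajs, p_prior=0.3, noise=0.2):
    """Human-adaptive sampling: most draws perturb a latent encoded from a
    (limited) set of human trajectories; occasional prior draws keep diversity."""
    if torch.rand(()).item() < p_prior:
        return torch.randn(1, LATENT_DIM)
    idx = torch.randint(len(human_trajs), ()).item()
    mu, _ = model.encode(human_trajs[idx].unsqueeze(0))   # traj: (T, obs+act)
    return mu + noise * torch.randn_like(mu)

def infer_partner_latent(model, observed_traj):
    """Online adaptation: encode the partner's behavior so far into a latent z
    that a Cooperator policy can condition on and update as play continues."""
    mu, _ = model.encode(observed_traj.unsqueeze(0))
    return mu
```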