Can LLMs learn opponent costs in real-time games?
PACE: A Framework for Learning and Control in Linear Incomplete-Information Differential Games
April 25, 2025
https://arxiv.org/pdf/2504.17128

This paper proposes PACE (Peer-Aware Cost Estimation), a framework in which two AI agents learn each other's goals during a continuous interaction, even when neither starts out knowing the other's intentions. It focuses on scenarios where agents can observe the shared environment's state but not each other's actions. PACE models both agents as learners, with each simulating the other's learning process to avoid biased estimates. Theoretical guarantees of convergence and stability are provided under specific conditions.
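In the paper's setting (linear dynamics, quadratic costs), the problem can be written roughly as follows; the notation below is illustrative and may not match the paper's exactly:

```latex
% Illustrative notation (not necessarily the paper's): shared linear
% dynamics driven by both agents' controls, each with a quadratic cost.
\dot{x}(t) = A\, x(t) + B_1 u_1(t) + B_2 u_2(t), \qquad
J_i = \int_0^{T} \Big( x^\top Q_i\, x + u_i^\top R_i\, u_i \Big)\, dt .
% Agent 1 observes x(t) but not u_2(t); it maintains an estimate
% \hat{Q}_2 of the peer's cost and refines it by minimizing the
% state-prediction error
\min_{\hat{Q}_2}\; \big\| x(t+\delta) - \hat{x}\big(t+\delta;\, \hat{Q}_2\big) \big\|^2 ,
% where \hat{x} is propagated using the peer's predicted control
% \hat{u}_2 = \pi_2(\hat{Q}_2, x), itself obtained by simulating the
% peer's own learning process (the "peer-aware" step).
```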
Key points for LLM-based multi-agent systems:
- Incomplete Information: Addresses the realistic scenario of agents needing to infer each other's goals from limited observations.
- Focus on Shared State: Relies only on observing the shared environment state, which is often more practical than observing other agents' actions, particularly with LLMs where internal states are not always directly accessible.
- Model-Based Approach: Leverages a simplified model-based formulation (linear dynamics, quadratic costs), a tractable starting point for exploring and understanding more complex LLM interaction scenarios.
- Learning Dynamics: The core idea of modeling the learning dynamics of other agents could potentially be adapted to LLM-based systems by incorporating principles of how LLMs learn and adapt during interactions.
- Theoretical Foundation: Offers theoretical guarantees which could serve as a starting point for developing more robust and predictable LLM-based multi-agent systems.
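The model-based estimation idea above can be made concrete with a toy sketch. The following is my own scalar, discrete-time simplification, not the paper's algorithm: agent 1 never observes the peer's control, only the shared state, and recovers the peer's cost weight by gradient descent on the one-step state-prediction error. All constants and the policy form `gain(q) = q / (q + 1)` are hypothetical.

```python
import random

# Toy sketch (my simplification, not PACE itself): agent 1 observes only
# the shared state, never the peer's control u2, and infers the peer's
# cost weight q2 by matching one-step state predictions.

A, B1, B2 = 0.9, 0.5, 0.4     # shared linear dynamics (illustrative)
Q2_TRUE = 2.0                  # peer's cost weight, unknown to agent 1

def gain(q):
    """Hypothetical stand-in for the peer's cost-to-feedback-gain map."""
    return q / (q + 1.0)

def step(x, u1, q2):
    """True transition: the peer applies u2 = -gain(q2) * x."""
    return A * x + B1 * u1 - B2 * gain(q2) * x

# Collect a trajectory; agent 1 records only (x_t, u1_t, x_{t+1}).
random.seed(0)
data, x = [], 1.0
for _ in range(200):
    u1 = random.uniform(-1.0, 1.0)   # exploratory input from agent 1
    x_next = step(x, u1, Q2_TRUE)
    data.append((x, u1, x_next))
    x = x_next

# Gradient descent on the squared one-step prediction error w.r.t. q2_hat.
q2_hat, lr = 0.5, 25.0
for _ in range(3000):
    grad = 0.0
    for xt, u1, xn in data:
        pred = step(xt, u1, q2_hat)
        # d(pred)/d(q2_hat) = -B2 * xt * gain'(q2_hat),
        # with gain'(q) = 1 / (q + 1)^2
        grad += 2.0 * (pred - xn) * (-B2 * xt / (q2_hat + 1.0) ** 2)
    q2_hat -= lr * grad / len(data)

print(f"true q2 = {Q2_TRUE}, estimated q2 = {q2_hat:.3f}")
```

Because the hypothesized policy class here contains the true policy, the least-squares objective is minimized exactly at the true weight, so the estimate converges. In PACE's full setting both agents adapt simultaneously, which is precisely the bias that simulating the peer's own learning process is meant to remove.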