Can AI learn to play games *better* than Nash equilibrium?
Preference-CFR: Beyond Nash Equilibrium for Better Game Strategies
This paper introduces Preference-CFR (Pref-CFR), an algorithm that extends Counterfactual Regret Minimization (CFR), the workhorse of modern game AI. Rather than simply converging to a Nash Equilibrium (NE), Pref-CFR generates more diverse and customizable strategies: it adds preference and vulnerability parameters that control an agent's style, letting developers build AI agents with distinct characteristics, such as aggressive or passive play in poker.

For LLM-based multi-agent systems, the key takeaway is that Pref-CFR makes it practical to create diverse agent behaviors, overcoming a limitation of standard CFR, which tends to converge to a single, predictable strategy. That diversity and customization could prove invaluable for building more realistic and engaging multi-agent interactions in web applications that leverage LLMs. The ability to specify agent preferences also opens the door to fine-grained control over agent interactions and potentially new application areas for multi-agent systems in web development.
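To make the idea concrete, here is a minimal sketch of how a preference parameter could bias CFR's regret-matching step toward a style. The paper's exact update rules aren't reproduced in this summary, so treat this as an illustrative assumption rather than Pref-CFR itself; the vulnerability parameter, which in the paper bounds how far the styled strategy may drift from equilibrium, is omitted for brevity. The function and variable names (`preference_weighted_strategy`, `regrets`, `preference`) are hypothetical.

```python
import numpy as np

def preference_weighted_strategy(regrets, preference):
    """Regret matching with per-action preference weights.

    Standard CFR plays each action in proportion to its positive
    accumulated regret. Here each positive regret is scaled by a
    preference weight, biasing the resulting strategy toward the
    preferred actions. NOTE: an illustrative assumption, not the
    paper's exact Pref-CFR update rule.
    """
    weighted = np.maximum(regrets, 0.0) * preference
    total = weighted.sum()
    if total > 0:
        return weighted / total
    # No positive regret anywhere: fall back to the normalized
    # preferences instead of the usual uniform distribution.
    return preference / preference.sum()


# Example: three actions (fold, call, raise) at a poker-like decision point.
regrets = np.array([0.2, 1.0, 1.0])        # accumulated counterfactual regrets
aggressive = np.array([0.5, 0.5, 3.0])     # weights favoring "raise"
passive = np.array([2.0, 2.0, 0.5])        # weights favoring "fold"/"call"

print(preference_weighted_strategy(regrets, aggressive))  # mass shifts to raise
print(preference_weighted_strategy(regrets, passive))     # mass shifts to call
```

Running this with the same regrets but different preference vectors yields visibly different strategies, which is the intuition behind how a single CFR-style training loop could be steered toward aggressive or passive agents.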