How can LLMs learn fair resource allocation?
DECAF: Learning to be Fair in Multi-agent Resource Allocation
This paper introduces DECAF, a framework for training fairer resource-allocation policies in multi-agent reinforcement learning. It targets scenarios where agents evaluate actions individually (distributed evaluation) while a central controller allocates resources based on those evaluations and global constraints (centralized allocation), a structure that mirrors a large language model coordinating multiple agents. Key points for LLM-based multi-agent systems:

- Methods for incorporating fairness directly into the central allocation step.
- Learning separate utility and fairness estimators, so the utility-fairness trade-off can be adjusted online after training.
- Adjusting existing utility functions (such as those provided by an LLM) to improve fairness without retraining.

DECAF's adaptability to different fairness metrics and its handling of constrained resource allocation make it directly relevant to LLM-based multi-agent applications where fairness and resource management are critical.
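To make the "separate estimators plus online trade-off" idea concrete, here is a minimal sketch, not DECAF's actual algorithm or API: a central allocator scores each agent as utility plus a tunable weight `beta` times a fairness estimate, then fills a capacity constraint greedily. The function and variable names (`allocate`, `beta`) are illustrative assumptions; the key property shown is that `beta` can be changed at allocation time without retraining either estimator.

```python
# Hedged sketch of centralized allocation over distributed evaluations.
# Each agent reports a utility estimate; a separate fairness estimate
# (e.g. how under-served the agent has been) is combined via `beta`.

def allocate(utilities, fairness, capacity, beta=0.5):
    """Grant resources to up to `capacity` agents, ranked by
    utility + beta * fairness. `beta` is adjustable online."""
    scores = {a: u + beta * fairness[a] for a, u in utilities.items()}
    ranked = sorted(scores, key=scores.get, reverse=True)
    return set(ranked[:capacity])

# Illustrative numbers: agent "c" has low utility but a large fairness
# gain (it was under-served). Raising beta shifts allocation toward it.
utilities = {"a": 0.9, "b": 0.8, "c": 0.2}
fairness  = {"a": 0.0, "b": 0.1, "c": 1.0}

print(allocate(utilities, fairness, capacity=2, beta=0.1))  # utility-driven
print(allocate(utilities, fairness, capacity=2, beta=2.0))  # fairness-driven
```

Because utility and fairness are estimated separately, the same trained components serve any point on the trade-off curve; only the scalar `beta` changes between calls.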