How can PPO fairly distribute rewards in multi-agent games?
Fairness Aware Reinforcement Learning via Proximal Policy Optimization
This paper introduces fair-PPO, a modification of the Proximal Policy Optimization (PPO) reinforcement learning algorithm designed to address unfair reward distribution in multi-agent systems (MAS) where agents have sensitive attributes (e.g., race, gender). It incorporates penalties based on fairness metrics (demographic parity, counterfactual fairness, and conditional statistical parity) into the PPO objective function, encouraging agents to learn policies that balance reward maximization with fairness.
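Since the section only describes the mechanism in words, a minimal sketch may help. The snippet below illustrates, in PyTorch, one way a fairness penalty could be folded into the clipped PPO surrogate, assuming a single binary sensitive attribute and a demographic-parity-style gap in mean per-agent returns as the penalty. The names fair_ppo_loss, demographic_parity_gap, and fairness_weight are hypothetical, and the exact way the paper wires the penalty into the objective may differ.

import torch

def demographic_parity_gap(rewards, group_ids):
    # Absolute difference in mean reward between the two sensitive groups.
    # `rewards` and `group_ids` are 1-D tensors over agents; group_ids in {0, 1}.
    # (Hypothetical helper; the paper's exact metric definitions may differ.)
    r0 = rewards[group_ids == 0].mean()
    r1 = rewards[group_ids == 1].mean()
    return torch.abs(r0 - r1)

def fair_ppo_loss(ratio, advantages, rewards, group_ids,
                  clip_eps=0.2, fairness_weight=0.5):
    # Clipped PPO surrogate minus a weighted fairness penalty.
    # `fairness_weight` plays the role of a penalty parameter:
    # 0 recovers vanilla PPO, larger values trade reward for parity.

    # Standard clipped surrogate objective (to be maximized).
    unclipped = ratio * advantages
    clipped = torch.clamp(ratio, 1 - clip_eps, 1 + clip_eps) * advantages
    surrogate = torch.min(unclipped, clipped).mean()

    # Fairness penalty: here, the demographic-parity gap on per-agent returns.
    penalty = demographic_parity_gap(rewards, group_ids)

    # Negate so the result is a loss suitable for gradient descent.
    return -(surrogate - fairness_weight * penalty)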
For LLM-based multi-agent systems, this research is relevant because it provides a concrete mechanism for mitigating biases arising from sensitive attributes during training. By adjusting the penalty parameters within the fair-PPO algorithm, developers can control the trade-off between performance and fairness in LLM-driven interactions, promoting more equitable outcomes. This matters most in cooperative and competitive multi-agent applications where LLMs could perpetuate or amplify existing societal biases. The paper also shows that the penalty parameters must be tuned carefully and that fairness is hard to achieve when agent groups are isolated from one another.
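To make that trade-off concrete, the following continuation of the hypothetical sketch above sweeps the fairness_weight parameter on dummy data: a weight of zero recovers the vanilla PPO loss, and larger weights weight parity more heavily relative to the reward-maximizing surrogate. The values are placeholders, not results from the paper.

import torch

ratio = torch.ones(4)                      # dummy policy probability ratios
advantages = torch.tensor([1.0, 0.5, 0.8, 0.2])
rewards = torch.tensor([3.0, 2.5, 1.0, 0.8])
group_ids = torch.tensor([0, 0, 1, 1])     # binary sensitive attribute

for w in (0.0, 0.5, 2.0):                  # 0.0 recovers vanilla PPO
    loss = fair_ppo_loss(ratio, advantages, rewards, group_ids,
                         fairness_weight=w)
    print(f"fairness_weight={w}: loss={loss.item():.3f}")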