Can multi-agent HDRL improve portfolio optimization?
A novel multi-agent dynamic portfolio optimization learning system based on hierarchical deep reinforcement learning
January 14, 2025
https://arxiv.org/pdf/2501.06832

This paper introduces a novel multi-agent system for optimizing investment portfolios using hierarchical deep reinforcement learning (HDRL). It aims to improve profitability and manage risk by allocating funds across a basket of stocks (specifically the constituents of the Dow Jones Industrial Average).
Key points for LLM-based multi-agent systems:
- Hierarchical structure: The system uses HDRL, dividing the complex task of portfolio optimization into sub-tasks handled by different agents. This hierarchical approach carries over to LLM-based multi-agent systems, where complex tasks can likewise be decomposed and delegated (a minimal two-level sketch follows this list).
- Auxiliary agent: An auxiliary agent pre-trains and guides the main agent by finding baseline portfolio allocations, addressing the sparsity of positive rewards that often hampers reinforcement learning. This is analogous to using one LLM to bootstrap or provide initial guidance to other agents in a multi-agent system (see the pre-training sketch after this list).
- Addressing the curse of dimensionality: The hierarchical structure and auxiliary agent mitigate the curse of dimensionality, a common challenge in reinforcement learning with large action spaces like portfolio optimization. LLM-based systems operating in complex environments can exploit the same idea: decomposition shrinks each agent's effective search space.
- Policy learning: The system learns optimal policies for portfolio allocation, i.e., mappings from market state to allocation decisions. This parallels how LLMs can be used to generate and refine decision-making policies in multi-agent systems.
- Continuous action space: The system handles a continuous action space (the proportion of funds allocated to each stock), which is also relevant to LLM-based systems that must act in continuous environments (see the Dirichlet policy sketch at the end).
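The hierarchical decomposition in the first bullet can be pictured as two cooperating policies: a high-level agent that picks a coarse sub-goal, and a low-level agent that turns that sub-goal plus market features into concrete portfolio weights. The sketch below is illustrative only; the two-level split, the risk-budget sub-goals, and all names (`high_level`, `low_level`, `SUB_GOALS`) are assumptions for exposition, not the paper's actual architecture.

```python
import numpy as np

rng = np.random.default_rng(0)
N_STOCKS = 30                   # e.g., the DJIA constituents
SUB_GOALS = [0.05, 0.10, 0.20]  # hypothetical risk budgets the top level can pick

# Randomly initialized linear "policies" stand in for trained networks.
W_high = rng.normal(size=(8, len(SUB_GOALS)))  # features -> sub-goal scores
W_low = rng.normal(size=(8 + 1, N_STOCKS))     # features + sub-goal -> weight logits

def high_level(features):
    """High-level agent: choose a coarse sub-goal (here, a risk budget)."""
    return SUB_GOALS[int(np.argmax(features @ W_high))]

def low_level(features, sub_goal):
    """Low-level agent: map features + sub-goal to weights on the simplex."""
    logits = np.concatenate([features, [sub_goal]]) @ W_low
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()      # softmax -> nonnegative weights summing to 1

features = rng.normal(size=8)   # toy market features
goal = high_level(features)
weights = low_level(features, goal)
print(goal, weights.sum())      # chosen sub-goal; total allocation is 1.0
```

Because the top level only chooses among a handful of sub-goals and the bottom level conditions on that choice, neither agent ever searches the full joint space, which is the dimensionality-reduction point made in the third bullet.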
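The auxiliary-agent idea in the second bullet amounts to pre-training the main policy toward a known-good baseline before reinforcement learning begins, so early exploration starts from sensible allocations instead of noise. Below is a minimal behavior-cloning sketch, assuming an equal-weight portfolio as a stand-in for whatever baseline allocation the auxiliary agent finds; the loss, learning rate, and variable names are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(1)
N_STOCKS, N_FEAT, STEPS = 30, 8, 500

# Stand-in baseline: equal weight across all stocks.
baseline = np.full(N_STOCKS, 1.0 / N_STOCKS)

W = rng.normal(scale=0.1, size=(N_FEAT, N_STOCKS))  # main agent's policy weights

def policy(x, W):
    """Softmax policy: features -> portfolio weights on the simplex."""
    logits = x @ W
    e = np.exp(logits - logits.max())
    return e / e.sum()

# Behavior-cloning pre-training: pull the main policy toward the baseline.
lr = 0.5
for _ in range(STEPS):
    x = rng.normal(size=N_FEAT)
    p = policy(x, W)
    dp = p - baseline                      # gradient of MSE(p, baseline) w.r.t. p
    dlogits = p * (dp - (dp * p).sum())    # backprop through the softmax
    W -= lr * np.outer(x, dlogits)

# After pre-training, the policy's output stays close to the baseline.
print(np.abs(policy(rng.normal(size=N_FEAT), W) - baseline).max())
```

RL fine-tuning would then continue from these weights, which sidesteps the sparse-positive-reward problem the bullet describes: the agent already earns roughly baseline returns before it starts exploring.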
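For the continuous action space in the last bullet, one common parameterization (not necessarily the paper's) is a Dirichlet policy, which produces nonnegative allocation proportions summing to one by construction. The sketch below assumes a fixed concentration vector standing in for a policy network's output.

```python
import numpy as np
from math import lgamma

rng = np.random.default_rng(2)
N_STOCKS = 30

# A policy network would output this concentration vector; a constant stands in.
alpha = np.full(N_STOCKS, 2.0)

def dirichlet_logpdf(w, alpha):
    """Log-density of allocation w under Dirichlet(alpha)."""
    log_beta = sum(lgamma(a) for a in alpha) - lgamma(alpha.sum())
    return ((alpha - 1.0) * np.log(w)).sum() - log_beta

w = rng.dirichlet(alpha)   # one continuous action: a complete allocation
print(w.sum(), dirichlet_logpdf(w, alpha))

# A policy-gradient method (e.g., REINFORCE) would scale grad log pi(w|s)
# by the realized portfolio return to push the policy toward profitable
# allocations, which is the policy learning described in the fourth bullet.
```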