Can LLMs self-learn effective therapy?
Conversational Self-Play for Discovering and Understanding Psychotherapy Approaches
March 24, 2025
https://arxiv.org/pdf/2503.16521
This paper explores using two LLMs in a simulated therapy session, one as a therapist and the other as a client with varying levels of depression, to discover and understand different psychotherapy approaches. The LLM therapist was not pre-programmed with specific therapeutic techniques, allowing it to freely adapt its approach.
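The two-agent setup described above can be sketched as a simple turn-taking loop. This is a minimal illustration, not the paper's implementation: the `Agent` interface, `run_session`, and the stub agents are hypothetical names, and the stubs return canned strings where a real system would call an LLM.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]
# Hypothetical interface: an agent maps the transcript so far to its next utterance.
Agent = Callable[[List[Message]], str]


def run_session(therapist: Agent, client: Agent, opening: str, turns: int = 3) -> List[Message]:
    """Alternate therapist and client turns and return the full transcript."""
    transcript: List[Message] = [{"role": "client", "text": opening}]
    for _ in range(turns):
        transcript.append({"role": "therapist", "text": therapist(transcript)})
        transcript.append({"role": "client", "text": client(transcript)})
    return transcript


def make_stub_client(severity: str) -> Agent:
    """Stand-in for an LLM prompted to play a client with the given depression severity."""
    def client(transcript: List[Message]) -> str:
        return f"[{severity} client reply after {len(transcript)} messages]"
    return client


def stub_therapist(transcript: List[Message]) -> str:
    """Stand-in for an LLM therapist that receives no prescribed technique."""
    return f"Reflecting on: {transcript[-1]['text']}"


session = run_session(stub_therapist, make_stub_client("moderate"), "I feel tired all the time.")
for message in session:
    print(f"{message['role']}: {message['text']}")
```

Keeping the agents behind a plain callable makes it easy to swap in real model calls later, and to vary only the client's severity while holding the therapist fixed.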
Key points for LLM-based multi-agent systems:
- Self-play as a tool for policy discovery: This research highlights using self-play to uncover the implicit policies (strategies) embedded within LLMs, rather than for training via reinforcement learning. The LLM agents' interactions reveal how they respond to different situations.
- Emergent therapeutic techniques: This study mainly reproduced known therapeutic techniques, but the framework is designed so that future research (e.g., clustering or outlier analysis) could surface novel strategies and techniques.
- Adaptive agent behavior: The LLM therapist adapted to the client, choosing, for example, between solution-focused brief therapy (SFBT) and person-centered therapy (PCT) according to the client's depression severity. This points to the potential for multi-agent systems in which agents dynamically adjust their behavior based on context and on their interaction with other agents.
- Simulation for hypothesis generation: The simulated therapy sessions can be used to generate hypotheses about effective therapeutic techniques, which can then be validated in real-world settings. This emphasizes the value of multi-agent simulations in research.
- Future directions: The research suggests extending this work by (1) incorporating RL for policy improvement, (2) employing specialized, fine-tuned LLMs, (3) analyzing combinations and sequences of techniques, and (4) identifying core and adaptable components of therapies. These directions are highly relevant to the broader field of LLM-based multi-agent development.
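One way to examine the adaptive behavior noted in the key points is to tally which technique the therapist uses at each severity level. The sketch below assumes each therapist turn has already been labeled with a technique (by a human rater or a classifier); `technique_shares` and `toy_labels` are hypothetical, and the toy data is placeholder input with an arbitrary pairing, not a result from the paper.

```python
from collections import Counter, defaultdict
from typing import Dict, Iterable, Tuple


def technique_shares(labels: Iterable[Tuple[str, str]]) -> Dict[str, Dict[str, float]]:
    """Return, per severity level, the share of therapist turns using each technique."""
    counts: Dict[str, Counter] = defaultdict(Counter)
    for severity, technique in labels:
        counts[severity][technique] += 1
    return {
        severity: {tech: n / sum(c.values()) for tech, n in c.items()}
        for severity, c in counts.items()
    }


# Placeholder (severity, technique) labels; the pairing is arbitrary, not a finding.
toy_labels = [
    ("level_a", "SFBT"), ("level_a", "SFBT"), ("level_a", "PCT"), ("level_a", "SFBT"),
    ("level_b", "PCT"), ("level_b", "PCT"),
]

shares = technique_shares(toy_labels)
for severity, dist in shares.items():
    print(severity, dist)
```

Comparing these per-severity distributions is one simple way to turn simulated sessions into hypotheses that can then be checked in real-world settings.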