How can I build a scalable, reward-driven LLM multi-agent system?
ReSo: A Reward-driven Self-organizing LLM-based Multi-Agent System for Reasoning Tasks
ReSo is a multi-agent system designed to improve the reasoning abilities of Large Language Models (LLMs) by having several LLM agents work together as a team. It breaks a complex problem into smaller subtasks, assigns each subtask to the best-suited agent, and then combines the results. A key innovation is a “Collaborative Reward Model” that learns how well the agents are working together and uses that signal to improve team performance over time. The authors also create benchmark datasets for testing multi-agent reasoning. On complex reasoning tasks, ReSo achieves significantly better results than single LLMs or existing multi-agent systems. It assigns tasks dynamically, adapts to the strengths of different LLMs, and learns from data without hand-crafted instructions. The results also show that ReSo uses fewer tokens (pieces of text) than comparable systems.
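To make the decompose, assign, and combine loop concrete, here is a minimal Python sketch of reward-driven task assignment. It is a toy illustration, not ReSo's actual algorithm: the names Agent, run_task, and reward_fn are hypothetical, the "agents" are plain functions standing in for LLM calls, the subtasks are supplied already decomposed, and the hand-written reward_fn is only a placeholder for a learned reward model. Each agent keeps a running average reward per subtask kind, and every subtask goes to the agent with the highest current estimate.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

Subtask = Tuple[str, str]  # (kind, prompt), e.g. ("add", "2 + 3")


@dataclass
class Agent:
    name: str
    solve: Callable[[str], str]  # stand-in for an LLM call (assumption)
    total: Dict[str, float] = field(default_factory=dict)  # summed reward per kind
    tries: Dict[str, int] = field(default_factory=dict)    # attempts per kind

    def estimate(self, kind: str) -> float:
        n = self.tries.get(kind, 0)
        # Untried agents get an optimistic 1.0 so each can be tried at least once.
        return self.total.get(kind, 0.0) / n if n else 1.0

    def record(self, kind: str, reward: float) -> None:
        self.total[kind] = self.total.get(kind, 0.0) + reward
        self.tries[kind] = self.tries.get(kind, 0) + 1


def run_task(
    subtasks: List[Subtask],
    agents: List[Agent],
    reward_fn: Callable[[str, str], float],
) -> List[Tuple[str, str]]:
    """Assign each subtask to the agent with the best reward estimate."""
    trace = []
    for kind, prompt in subtasks:
        agent = max(agents, key=lambda a: a.estimate(kind))  # first agent wins ties
        answer = agent.solve(prompt)
        agent.record(kind, reward_fn(prompt, answer))  # feed the reward back
        trace.append((agent.name, answer))
    return trace


# Toy agents: each is only good at one kind of subtask.
def adder(prompt: str) -> str:
    a, _, b = prompt.split()
    return str(int(a) + int(b))  # always adds, so it is wrong on "*"


def multiplier(prompt: str) -> str:
    a, _, b = prompt.split()
    return str(int(a) * int(b))  # always multiplies, so it is wrong on "+"


def reward_fn(prompt: str, answer: str) -> float:
    # Placeholder for a learned reward model: 1.0 if the answer is correct.
    a, op, b = prompt.split()
    truth = int(a) + int(b) if op == "+" else int(a) * int(b)
    return 1.0 if answer == str(truth) else 0.0


agents = [Agent("multiplier", multiplier), Agent("adder", adder)]
subtasks = [("add", "2 + 3"), ("mul", "4 * 5")]

first = run_task(subtasks, agents, reward_fn)
# -> [("multiplier", "6"), ("multiplier", "20")]  (wrong agent on the "add" step)
second = run_task(subtasks, agents, reward_fn)
# -> [("adder", "5"), ("multiplier", "20")]       (assignment corrected by reward)
```

In the first pass no agent has a track record, so the "add" subtask goes to the wrong agent and earns zero reward. That feedback lowers the agent's estimate, and the second pass routes the subtask to the adder instead. A real system would replace the greedy choice with a selection rule that keeps exploring and would score answers with a trained model rather than a known ground truth.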