How can KIMAS improve LLM multi-agent app performance?
KIMAS: A Configurable Knowledge Integrated Multi-Agent System
This paper introduces KIMAS, a configurable multi-agent system designed to enhance retrieval-augmented generation (RAG) in real-world applications, particularly knowledge-intensive conversations. KIMAS coordinates multiple specialized agents that manage conversation context, retrieve information from diverse sources (local vector databases, online search engines, APIs), and summarize answers.

Key features include flexible query rewriting based on the conversation and knowledge context, efficient knowledge routing via embedding similarity with manual overrides, reranking of retrieved information to filter it before summarization, and a streamlined citation-generation process for greater trustworthiness and transparency. KIMAS also optimizes the agent pipeline for parallel execution to minimize latency, which is crucial for real-time applications.

The system is demonstrated on three use cases of increasing scale and complexity, highlighting its adaptability and practical value for building robust, efficient LLM-powered knowledge-intensive applications.
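The knowledge-routing idea described above can be sketched roughly as follows. This is a minimal illustration, not KIMAS's actual implementation: the function names, the similarity threshold, and the override mechanism are all assumptions; the real system routes queries among heterogeneous sources using embedding similarity plus manual configuration.

```python
# Hypothetical sketch of embedding-similarity knowledge routing with manual
# overrides. All names and the threshold value are illustrative assumptions.
import math


def cosine(a, b):
    """Cosine similarity between two embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0


def route_query(query_emb, sources, overrides=None, threshold=0.3):
    """Select knowledge sources whose description embedding matches the query.

    sources:   dict mapping source name -> description embedding
    overrides: optional set of source names to always include (manual routing)
    """
    selected = set(overrides or [])
    for name, emb in sources.items():
        if cosine(query_emb, emb) >= threshold:
            selected.add(name)
    return sorted(selected)


# Example: a query close to the local database's description embedding.
sources = {"local_db": [1.0, 0.0], "web_search": [0.0, 1.0]}
print(route_query([0.9, 0.1], sources))
```

In a real deployment the embeddings would come from an embedding model applied to source descriptions and the rewritten query, and the manual overrides would come from the system configuration.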
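The latency point can be illustrated with a small asyncio sketch: retrieval agents fan out in parallel, and their results are reranked and filtered before summarization. The agent functions, scores, and sleep-based stand-ins below are assumptions for illustration, not KIMAS's actual agents.

```python
# Hypothetical sketch of a parallel retrieval pipeline with reranking.
# The agents, scores, and delays are illustrative stand-ins.
import asyncio


async def search_local_db(query):
    await asyncio.sleep(0.01)  # stand-in for a vector-database lookup
    return [("local: " + query, 0.9)]


async def search_web(query):
    await asyncio.sleep(0.01)  # stand-in for an online search call
    return [("web: " + query, 0.6)]


async def retrieve(query, top_k=2):
    # Fan out to all retrieval agents concurrently instead of sequentially,
    # so total latency is bounded by the slowest agent, not the sum.
    results = await asyncio.gather(search_local_db(query), search_web(query))
    docs = [doc for batch in results for doc in batch]
    # Rerank (here simply by score) and keep only the top documents,
    # filtering what is passed to the summarization agent.
    docs.sort(key=lambda d: d[1], reverse=True)
    return [text for text, _ in docs[:top_k]]


print(asyncio.run(retrieve("What is KIMAS?")))
```

A production pipeline would replace the scoring step with a learned reranker and feed the surviving documents, with source identifiers for citation generation, to the summarization agent.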