How to build efficient, low-cost super agents?
Toward Super Agent System with Hybrid AI Routers
This paper proposes a "Super Agent System" architecture designed to improve the efficiency and scalability of AI agents powered by Large Language Models (LLMs). The system breaks complex user requests into smaller tasks handled by specialized agents, which use tools, memory, and external resources. A key aspect is a hybrid approach: on-device Small Language Models (SLMs) handle quick, privacy-preserving responses, while cloud-based LLMs handle more complex tasks. Two routing steps decide where work goes: "intent routing" directs each task to a suitable agent, and "model routing" selects an appropriate model based on task complexity and cost. Automated agentic workflows let multiple agents collaborate on complex tasks. The authors present this architecture as a blueprint for integrating super agents into everyday devices such as phones and robots.
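To make the two routing steps concrete, here is a minimal Python sketch. It is not the authors' implementation: the model names (`local-slm`, `cloud-llm`), the keyword-based intent matcher, the word-count complexity score, and the cost units are all hypothetical stand-ins chosen for illustration. A real system would likely use learned classifiers for both steps.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Model:
    name: str              # hypothetical model identifier
    on_device: bool        # True for a local SLM, False for a cloud LLM
    cost_per_call: float   # assumed relative cost units
    max_complexity: float  # highest complexity score this model is trusted with


# Hypothetical model pool: one on-device SLM and one cloud LLM.
MODELS = [
    Model("local-slm", on_device=True, cost_per_call=0.0, max_complexity=0.4),
    Model("cloud-llm", on_device=False, cost_per_call=1.0, max_complexity=1.0),
]

# Hypothetical intents; keyword matching stands in for a real intent classifier.
INTENT_KEYWORDS = {
    "calendar": ("meeting", "schedule", "remind"),
    "coding": ("bug", "function", "compile"),
}


def route_intent(prompt: str) -> str:
    """Intent routing: pick the specialized agent for a request."""
    text = prompt.lower()
    for intent, keywords in INTENT_KEYWORDS.items():
        if any(keyword in text for keyword in keywords):
            return intent
    return "general"


def estimate_complexity(prompt: str) -> float:
    """Toy complexity score in [0, 1] based on word count (an assumption)."""
    return min(len(prompt.split()) / 50.0, 1.0)


def route_model(prompt: str, budget: float) -> Model:
    """Model routing: cheapest model that is capable enough and within budget."""
    complexity = estimate_complexity(prompt)
    capable = [
        m for m in MODELS
        if m.max_complexity >= complexity and m.cost_per_call <= budget
    ]
    if capable:
        return min(capable, key=lambda m: m.cost_per_call)
    # No capable model fits the budget: fall back to the strongest affordable
    # model, or to the on-device model if nothing is affordable.
    affordable = [m for m in MODELS if m.cost_per_call <= budget] or [MODELS[0]]
    return max(affordable, key=lambda m: m.max_complexity)


def dispatch(prompt: str, budget: float = 1.0) -> tuple[str, str]:
    """Return (agent intent, model name) chosen for a request."""
    return route_intent(prompt), route_model(prompt, budget).name


if __name__ == "__main__":
    print(dispatch("Remind me about the meeting"))  # ('calendar', 'local-slm')
```

In this sketch a short request stays on the device at zero cost, while a long request is sent to the cloud model only if the budget allows it. Separating `route_intent` from `route_model` mirrors the summary's distinction between choosing an agent and choosing a model, so either step could be swapped for a learned router without touching the other.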