How can LLMs build better podcasting agents?
PodAgent: A Comprehensive Framework for Podcast Generation
March 4, 2025
https://arxiv.org/pdf/2503.00455
This paper introduces PodAgent, a framework for automatically generating podcast-like audio programs using a multi-agent system. It tackles content depth, natural dialogue flow, appropriate voice selection, and expressive speech synthesis.
Three components are most relevant to LLM-based multi-agent systems: a Host-Guest-Writer agent system that generates conversational scripts from a topic and guest profiles, voice-role matching that aligns voices with speaker characteristics, and LLM-driven speech synthesis that achieves expressiveness through specified speaking styles. The system is evaluated with both quantitative text metrics and qualitative LLM-based judging, and the authors report significant improvement over baseline approaches.
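To make the division of labor concrete, here is a minimal Python sketch of how a Host-Guest-Writer loop and a voice-role matching step could be wired together. It is not PodAgent's implementation: the `LLM` callable, the `Speaker` and `Line` types, the prompts, the fixed style labels, and the word-overlap matching heuristic are all assumptions made for illustration, and the paper's actual prompts and matching method may differ.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical interface: any function that maps a prompt to generated text.
LLM = Callable[[str], str]


@dataclass
class Speaker:
    name: str
    profile: str  # short description, e.g. "male scientist, calm"


@dataclass
class Line:
    speaker: str
    text: str
    style: str  # speaking-style label handed to the speech synthesis stage


def generate_script(llm: LLM, topic: str, host: Speaker, guests: List[Speaker]) -> List[Line]:
    """Host asks questions, guests answer, a writer pass polishes each line."""
    # Host agent: propose questions, one per line of the reply.
    reply = llm(f"You are {host.name}, a podcast host. List questions about: {topic}")
    questions = [q.strip() for q in reply.splitlines() if q.strip()]

    # Guest agents: each guest answers each question from their own profile.
    draft: List[Line] = []
    for question in questions:
        draft.append(Line(host.name, question, "curious"))
        for guest in guests:
            answer = llm(f"You are {guest.name} ({guest.profile}). Answer briefly: {question}")
            draft.append(Line(guest.name, answer.strip(), "neutral"))

    # Writer agent: rewrite each draft line as spoken dialogue,
    # falling back to the draft text if the model returns nothing.
    script: List[Line] = []
    for line in draft:
        polished = llm(f"Rewrite as natural spoken dialogue: {line.text}").strip()
        script.append(Line(line.speaker, polished or line.text, line.style))
    return script


def _words(text: str) -> set:
    return set(text.lower().replace(",", " ").split())


def match_voices(speakers: List[Speaker], voices: Dict[str, str]) -> Dict[str, str]:
    """Greedily assign each speaker the unused voice whose description
    shares the most words with the speaker profile (assumed heuristic)."""
    if len(voices) < len(speakers):
        raise ValueError("need at least one voice per speaker")
    remaining = dict(voices)
    assignment: Dict[str, str] = {}
    for speaker in speakers:
        profile_words = _words(speaker.profile)
        best = max(remaining, key=lambda v: len(profile_words & _words(remaining[v])))
        assignment[speaker.name] = best
        del remaining[best]
    return assignment
```

In this sketch the `style` field is where a speaking-style instruction would travel alongside each line to the synthesis stage. A real system would more plausibly have the LLM choose the style per line and compare richer voice descriptions rather than counting shared words.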