How to make consistent story videos with AI agents?
StoryAgent: Customized Storytelling Video Generation via Multi-Agent Collaboration
This paper introduces StoryAgent, a multi-agent system for creating customized storytelling videos. It uses several specialized AI agents working together: a story designer, a storyboard generator, a video creator, an agent manager, and an observer. The system takes a text prompt and reference videos of a subject as input and generates a video featuring that subject acting out the story. Of particular relevance to LLM-based multi-agent systems, StoryAgent relies on LLMs like GPT-4 for tasks such as story design, agent coordination (managing which agent acts when), and result evaluation. The paper also introduces techniques for maintaining subject consistency across video shots, which current methods handle poorly. One such technique, LoRA-BE, customizes an existing image-to-video model for improved subject fidelity. Storyboard generation uses a "remove and redraw" method to keep the subject consistent across storyboard frames.
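To make the division of labor concrete, here is a minimal Python sketch of how the five roles named above might be wired together. It is an illustration under stated assumptions, not the paper's implementation: every function name, the `Shot` type, the retry loop, and the assignment of result evaluation to the observer are hypothetical, and the LLM and generation-model calls are replaced with simple string stubs.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Shot:
    description: str      # text produced by the story designer
    storyboard: str = ""  # placeholder standing in for a storyboard frame
    video: str = ""       # placeholder standing in for a video clip


def story_designer(prompt: str, num_shots: int = 3) -> List[Shot]:
    """Hypothetical stub for an LLM call that splits a story into shots."""
    return [Shot(description=f"{prompt} (shot {i + 1})") for i in range(num_shots)]


def storyboard_generator(shot: Shot, subject: str) -> None:
    """Hypothetical stub for storyboard generation with the subject drawn in."""
    shot.storyboard = f"frame[{subject}]: {shot.description}"


def video_creator(shot: Shot, subject: str) -> None:
    """Hypothetical stub for a customized image-to-video model."""
    shot.video = f"clip[{subject}]: {shot.storyboard}"


def observer(output: str, subject: str) -> bool:
    """Hypothetical stub for LLM-based evaluation: accept if the subject appears."""
    return bool(output) and subject in output


def agent_manager(prompt: str, subject: str, max_retries: int = 2) -> List[Shot]:
    """Decide which agent acts when; rerun a stage if the observer rejects it.

    The retry policy is an assumption made for this sketch.
    """
    shots = story_designer(prompt)
    stages = ((storyboard_generator, "storyboard"), (video_creator, "video"))
    for shot in shots:
        for stage, attr in stages:
            for _ in range(max_retries + 1):
                stage(shot, subject)
                if observer(getattr(shot, attr), subject):
                    break
            else:
                raise RuntimeError(f"{attr} rejected for: {shot.description}")
    return shots


if __name__ == "__main__":
    for s in agent_manager("A dog explores a park", subject="dog"):
        print(s.video)
```

In a real system each stub would call an LLM or a generation model, and the observer's accept/reject decision would come from a model-based judgment rather than a substring check.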