Can AI agents hide messages beyond space-time?
Steganography Beyond Space-Time With Chain of Multimodal AI Agents
This paper proposes a steganography method for audiovisual media built on a chain of multimodal AI agents. It hides the message in the textual representation of the media, exploiting the orthogonality between the linguistic and audiovisual domains. The message is encoded by biasing a language model's word sampling during paraphrasing and decoded by analyzing word probabilities. The approach aims to be robust against manipulations such as compression and deepfakes, which could overwrite steganographic changes made directly to the visual or audio signals. Two points are most relevant to LLM-based multi-agent systems: an LLM serves as the core component for encoding and decoding hidden information through controlled text generation, and multiple specialized AI models (speech-to-text, text-to-speech, lip-sync, language model) are organized into a collaborative multi-agent system.
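To make the encode/decode idea concrete, here is a minimal Python sketch of hiding bits by biasing word choice and recovering them from word probabilities. It is an illustration under stated assumptions, not the paper's actual scheme: the lookup table `TOY_MODEL` stands in for a real language model, the rank-based rule (bit 0 picks the most probable word, bit 1 the second most probable) is one simple way to bias sampling, and the names `ranked_candidates`, `encode`, and `decode` are hypothetical.

```python
from typing import Dict, List, Tuple

# Hypothetical toy "language model": for each previous word, two candidate
# next words with probabilities. A real system would query an LLM instead.
TOY_MODEL: Dict[str, List[Tuple[str, float]]] = {
    "<s>": [("the", 0.6), ("a", 0.4)],
    "the": [("speaker", 0.7), ("presenter", 0.3)],
    "a": [("speaker", 0.6), ("presenter", 0.4)],
    "speaker": [("says", 0.8), ("states", 0.2)],
    "presenter": [("says", 0.65), ("states", 0.35)],
    "says": [("hello", 0.75), ("hi", 0.25)],
    "states": [("hello", 0.55), ("hi", 0.45)],
}


def ranked_candidates(prev: str) -> List[str]:
    """Return the candidate next words, most probable first."""
    candidates = TOY_MODEL[prev]
    ordered = sorted(candidates, key=lambda pair: pair[1], reverse=True)
    return [word for word, _ in ordered]


def encode(bits: List[int], start: str = "<s>") -> List[str]:
    """Hide one bit per word: the bit selects the rank of the word to emit."""
    words: List[str] = []
    prev = start
    for bit in bits:
        choice = ranked_candidates(prev)[bit]
        words.append(choice)
        prev = choice
    return words


def decode(words: List[str], start: str = "<s>") -> List[int]:
    """Recover the bits by re-ranking candidates and reading each word's rank."""
    bits: List[int] = []
    prev = start
    for word in words:
        bits.append(ranked_candidates(prev).index(word))
        prev = word
    return bits


if __name__ == "__main__":
    secret = [1, 0, 1, 1]
    stego_words = encode(secret)
    print(" ".join(stego_words))
    print(decode(stego_words))
```

Because the decoder only needs the words and the same model, the bits survive any processing of the audio or video that leaves the transcribed text intact. In this toy the table supports at most four bits per sentence; both sides must also share the identical model and starting context, or the ranks will not line up.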