Can agents improve VLM robustness and reduce hallucinations?
HYDRA: AN AGENTIC REASONING APPROACH FOR ENHANCING ADVERSARIAL ROBUSTNESS AND MITIGATING HALLUCINATIONS IN VISION-LANGUAGE MODELS
Hydra is a training-free, modular framework that improves the robustness of Vision-Language Models (VLMs) against both adversarial attacks and hallucinations (factually inaccurate generated content). It takes an agentic approach: an LLM-based agent queries multiple vision models and iteratively refines its outputs through reasoning and cross-verification. Structured critique loops, in-context learning, and chain-of-thought reasoning allow the agent to improve factual accuracy and mitigate both adversarial and intrinsic errors. Because no retraining is required, Hydra applies to diverse VLM architectures, and it outperforms existing dehallucination methods while also strengthening adversarial robustness. The core innovation is the integration of reasoning, external verification across multiple vision models, and iterative refinement within a single LLM agent.
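To make the agentic critique-and-refine idea concrete, below is a minimal sketch of such a loop. It is not the paper's actual implementation: every name here (`agentic_refine`, `vision_tools`, `llm_agent`, the `CONSISTENT` stop token) is a hypothetical stand-in, and the prompt wording is illustrative only. The sketch assumes the agent gathers claims from several vision models, critiques its current answer against that evidence with chain-of-thought style prompting, and stops once no inconsistencies remain.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Evidence:
    """One vision tool's claim about the image, tagged with its source."""
    source: str
    claim: str


def agentic_refine(
    image_path: str,
    question: str,
    vision_tools: dict[str, Callable[[str, str], str]],
    llm_agent: Callable[[str], str],
    max_rounds: int = 3,
) -> str:
    """Iteratively refine an answer by cross-verifying vision-tool outputs.

    Each round: (1) collect claims from every vision tool, (2) ask the LLM
    agent to critique the current answer against that evidence, (3) stop
    early if the agent reports no remaining inconsistencies.
    """
    answer = ""
    for round_idx in range(max_rounds):
        # 1. External verification: gather (possibly conflicting) claims
        #    from multiple vision models, e.g. a captioner and a detector.
        evidence = [
            Evidence(name, tool(image_path, question))
            for name, tool in vision_tools.items()
        ]
        evidence_text = "\n".join(f"- [{e.source}] {e.claim}" for e in evidence)

        # 2. Structured critique: the agent reasons over the evidence and
        #    either revises the answer or confirms it is fully supported.
        prompt = (
            f"Question: {question}\n"
            f"Current answer (round {round_idx}): {answer or 'none yet'}\n"
            f"Evidence from vision models:\n{evidence_text}\n"
            "Think step by step: flag any claim in the current answer that "
            "the evidence does not support, then output a revised answer. "
            "If the answer is already fully supported, end with CONSISTENT."
        )
        reply = llm_agent(prompt)
        answer = reply.removesuffix("CONSISTENT").strip()

        # 3. Early exit once the critique finds no hallucinated content.
        if reply.endswith("CONSISTENT"):
            break
    return answer
```

Note how the training-free, model-agnostic claim falls out of the structure: the loop never touches model weights, and swapping in a different captioner, detector, or LLM only changes the callables passed as `vision_tools` and `llm_agent`.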