AgentFoX: LLM Agent-Guided Fusion with eXplainability for AI-Generated Image Detection
Yangxin Yu 1, Yue Zhou 1, Bin Li 1, Kaiqing Lin 1, Haodong Li 1, Jiangqun Ni 2, Bo Cao 3
Published on arXiv
2603.23115
Output Integrity Attack
OWASP ML Top 10 — ML09
Key Finding
Produces explainable forensic reports that integrate multiple detector outputs through LLM-guided reasoning, resolving conflicting judgments from specialized detectors
AgentFoX
Novel technique introduced
The increasing realism of AI-Generated Images (AIGI) has created an urgent need for forensic tools capable of reliably distinguishing synthetic content from authentic imagery. Existing detectors are typically tailored to specific forgery artifacts, such as frequency-domain patterns or semantic inconsistencies, leading to specialized performance and, at times, conflicting judgments. To address these limitations, we present **AgentFoX**, a Large Language Model-driven framework that redefines AIGI detection as a dynamic, multi-phase analytical process. Our approach employs a quick-integration fusion mechanism guided by a curated knowledge base comprising calibrated Expert Profiles and contextual Clustering Profiles. During inference, the agent begins with high-level semantic assessment, then transitions to fine-grained, context-aware synthesis of signal-level expert evidence, resolving contradictions through structured reasoning. Instead of returning a coarse binary output, AgentFoX produces a detailed, human-readable forensic report that substantiates its verdict, enhancing interpretability and trustworthiness for real-world deployment. Beyond providing a novel detection solution, this work introduces a scalable agentic paradigm that facilitates intelligent integration of future and evolving forensic tools.
Key Contributions
- LLM-agent framework that orchestrates multiple specialized AIGI detectors through semantic reasoning and evidence synthesis
- Quick-integration fusion mechanism guided by Expert Profiles and Clustering Profiles for context-aware detector combination
- Human-readable forensic reports with structured reasoning chains that explain detection verdicts and resolve conflicting expert judgments
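The paper does not ship code, but the fusion idea above can be sketched in a few lines. The following is a minimal, hypothetical illustration (all names, weights, and thresholds are assumptions, not the authors' implementation): each specialized detector contributes a score weighted by a calibrated reliability taken from its Expert Profile, and experts whose judgment diverges sharply from the weighted consensus are flagged as conflicts to be handed to the LLM agent for structured reasoning.

```python
from dataclasses import dataclass

@dataclass
class ExpertVerdict:
    name: str           # detector identifier (e.g. a frequency-domain expert)
    score: float        # detector's estimate of P(AI-generated), in [0, 1]
    reliability: float  # calibrated weight from this expert's profile (assumed)

def fuse_verdicts(verdicts, conflict_gap=0.5):
    """Reliability-weighted fusion of specialized detector outputs.

    Returns a small report dict: the fused score, the binary verdict,
    and the experts that disagree strongly with the consensus -- these
    are the candidates for LLM-guided conflict resolution.
    """
    total_weight = sum(v.reliability for v in verdicts)
    fused = sum(v.score * v.reliability for v in verdicts) / total_weight
    verdict = "ai-generated" if fused >= 0.5 else "authentic"
    conflicts = [v.name for v in verdicts
                 if abs(v.score - fused) > conflict_gap]
    return {"score": fused, "verdict": verdict, "conflicts": conflicts}

report = fuse_verdicts([
    ExpertVerdict("frequency", 0.92, 0.8),
    ExpertVerdict("semantic",  0.15, 0.5),  # disagrees with the other experts
    ExpertVerdict("noise",     0.88, 0.7),
])
```

In AgentFoX proper, the flagged `conflicts` list would not end the analysis: the agent inspects the disagreeing evidence and resolves it through context-aware reasoning before writing the final forensic report.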
🛡️ Threat Analysis
The core contribution is detecting AI-generated images (synthetic content detection) and verifying content authenticity through forensic analysis. The framework integrates multiple forgery detection methods to determine whether an image is AI-generated, a quintessential form of output integrity verification.