AgentFoX: LLM Agent-Guided Fusion with eXplainability for AI-Generated Image Detection

Yangxin Yu 1, Yue Zhou 1, Bin Li 1, Kaiqing Lin 1, Haodong Li 1, Jiangqun Ni 2, Bo Cao 3

0 citations

Published on arXiv: 2603.23115

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Produces explainable forensic reports that integrate multiple detector outputs through LLM-guided reasoning, resolving conflicting judgments from specialized detectors

AgentFoX

Novel technique introduced


The increasing realism of AI-Generated Images (AIGI) has created an urgent need for forensic tools capable of reliably distinguishing synthetic content from authentic imagery. Existing detectors are typically tailored to specific forgery artifacts, such as frequency-domain patterns or semantic inconsistencies, leading to specialized performance and, at times, conflicting judgments. To address these limitations, we present AgentFoX, a Large Language Model-driven framework that redefines AIGI detection as a dynamic, multi-phase analytical process. Our approach employs a quick-integration fusion mechanism guided by a curated knowledge base comprising calibrated Expert Profiles and contextual Clustering Profiles. During inference, the agent begins with high-level semantic assessment, then transitions to fine-grained, context-aware synthesis of signal-level expert evidence, resolving contradictions through structured reasoning. Instead of returning a coarse binary output, AgentFoX produces a detailed, human-readable forensic report that substantiates its verdict, enhancing interpretability and trustworthiness for real-world deployment. Beyond providing a novel detection solution, this work introduces a scalable agentic paradigm that facilitates intelligent integration of future and evolving forensic tools.
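The abstract describes fusing conflicting detector outputs under calibrated Expert Profiles. A minimal sketch of what such reliability-weighted fusion could look like is below; `ExpertProfile`, `fuse_verdicts`, and the reliability values are hypothetical illustrations, not the paper's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class ExpertProfile:
    """Calibrated reliability of one specialized detector (assumed in [0, 1],
    e.g. estimated on a held-out validation set)."""
    name: str
    reliability: float

def fuse_verdicts(scores: dict[str, float],
                  profiles: dict[str, ExpertProfile],
                  threshold: float = 0.5) -> tuple[str, float]:
    """Reliability-weighted average of per-detector fake probabilities."""
    total_w = sum(profiles[n].reliability for n in scores)
    fused = sum(scores[n] * profiles[n].reliability for n in scores) / total_w
    verdict = "AI-generated" if fused >= threshold else "authentic"
    return verdict, fused

profiles = {
    "freq": ExpertProfile("freq", 0.9),       # frequency-domain expert
    "semantic": ExpertProfile("semantic", 0.6),  # semantic-consistency expert
}
# Conflicting judgments: the frequency expert says fake, the semantic one says real.
verdict, score = fuse_verdicts({"freq": 0.85, "semantic": 0.30}, profiles)
# Weighted fusion sides with the more reliable frequency expert.
```

The point of the sketch is the conflict-resolution behavior: a detector's vote counts in proportion to its calibrated reliability, so disagreeing experts are reconciled rather than averaged blindly.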


Key Contributions

  • LLM-agent framework that orchestrates multiple specialized AIGI detectors through semantic reasoning and evidence synthesis
  • Quick-integration fusion mechanism guided by Expert Profiles and Clustering Profiles for context-aware detector combination
  • Human-readable forensic reports with structured reasoning chains that explain detection verdicts and resolve conflicting expert judgments
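The third contribution, a human-readable report with a reasoning chain, can be sketched as a simple report builder. The report structure, field names, and the `build_report` helper are assumptions for illustration only, not the paper's output format.

```python
def build_report(image_id: str, evidence: dict[str, float], verdict: str) -> str:
    """Assemble a human-readable forensic report from per-detector evidence,
    noting which experts disagreed with the final verdict."""
    lines = [
        f"Forensic report for {image_id}",
        f"Verdict: {verdict}",
        "Evidence:",
    ]
    # List detectors from strongest to weakest fake probability.
    for name, prob in sorted(evidence.items(), key=lambda kv: -kv[1]):
        lines.append(f"  - {name}: fake probability {prob:.2f}")
    # Flag experts whose individual call conflicts with the fused verdict.
    conflicts = [n for n, p in evidence.items()
                 if (p >= 0.5) != (verdict == "AI-generated")]
    if conflicts:
        lines.append("Conflicting experts overruled: " + ", ".join(conflicts))
    return "\n".join(lines)

report = build_report("img_001", {"freq": 0.85, "semantic": 0.30}, "AI-generated")
```

Surfacing the overruled experts explicitly is what makes the verdict auditable: a reader can see not just the decision but which evidence argued against it.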

🛡️ Threat Analysis

Output Integrity Attack

Core contribution is detecting AI-generated images (synthetic content detection) and verifying content authenticity through forensic analysis. The framework integrates multiple forgery detection methods to determine if images are AI-generated, which is quintessential output integrity verification.


Details

Domains
vision, multimodal, nlp
Model Types
llm, multimodal, gan, diffusion
Threat Tags
inference_time
Applications
ai-generated image detection, deepfake detection, digital forensics, content authenticity verification