SPARK-IL: Spectral Retrieval-Augmented RAG for Knowledge-driven Deepfake Detection via Incremental Learning

Detecting AI-generated images remains a significant challenge because detectors trained on specific generators often fail to generalize to unseen models. Pixel-level artifacts vary across models, but frequency-domain signatures are more consistent, which makes them a promising foundation for cross-generator detection. We propose SPARK-IL, a retrieval-augmented framework that combines dual-path spectral analysis with incremental learning. One path uses a partially frozen ViT-L/14 encoder for semantic representations; a parallel path embeds raw RGB pixels. Both paths undergo multi-band Fourier decomposition into four frequency bands. Each band is processed by Kolmogorov-Arnold Networks (KAN) with mixture-of-experts for band-specific transformations, and the resulting spectral embeddings are fused via cross-attention with residual connections. During inference, the fused embedding retrieves the k nearest labeled signatures from a Milvus database using cosine similarity, and the prediction is made by majority voting. An incremental learning strategy expands the database and employs elastic weight consolidation to preserve previously learned transformations. Evaluated on the UniversalFakeDetect benchmark across 19 generative models (including GANs, face-swapping, and diffusion methods), SPARK-IL achieves 94.6% mean accuracy. The code is to be publicly released at https://github.com/HessenUPHF/SPARK-IL.
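
To make the multi-band Fourier decomposition step concrete, here is a minimal sketch of splitting a single embedding vector into four frequency bands. It is an illustration under stated assumptions, not the authors' implementation: the function names (`dft`, `idft`, `split_into_bands`), the equal-width band boundaries, and the choice to return each band as a band-limited signal are all hypothetical, and a naive O(n²) transform is used only to keep the example dependency-free (a real pipeline would more likely call an FFT routine).

```python
import cmath


def dft(signal):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(signal)
    return [
        sum(signal[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * cmath.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def split_into_bands(embedding, num_bands=4):
    """Split a real vector into `num_bands` band-limited components.

    Assumption: bands are equal-width ranges of the folded frequency index.
    The band masks partition the spectrum, so the components sum back to
    the original vector.
    """
    n = len(embedding)
    spectrum = dft(embedding)
    max_freq = n // 2
    bands = []
    for band in range(num_bands):
        masked = []
        for k in range(n):
            # Fold negative frequencies onto their positive counterparts so
            # that each mask is symmetric and the inverse transform is real.
            freq = min(k, n - k)
            band_index = freq * num_bands // (max_freq + 1)
            masked.append(spectrum[k] if band_index == band else 0)
        bands.append([value.real for value in idft(masked)])
    return bands
```

In this sketch each band would then be handed to its own band-specific transformation (the KAN modules in the abstract), which is outside the scope of the example.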

Key Contributions

Dual-path multi-band spectral architecture combining pixel-level and feature-level representations via FFT and KAN modules
Retrieval-augmented classification using cosine similarity search over spectral signatures in a Milvus database
Incremental learning via elastic weight consolidation enabling adaptation to new generators without catastrophic forgetting
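
The retrieval and incremental-learning contributions can be sketched in a few lines. The example below is a simplified stand-in rather than the paper's code: an in-memory list of `(embedding, label)` pairs replaces the Milvus database, and `ewc_penalty` uses the standard elastic weight consolidation form, (λ/2) · Σ Fᵢ (θᵢ − θᵢ*)², with hypothetical parameter names. How SPARK-IL estimates the importance weights or sets λ is not stated in the summary above.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # treat a zero vector as dissimilar to everything
    return dot / (norm_a * norm_b)


def knn_majority_vote(query, database, k=3):
    """Return the most common label among the k most similar entries.

    `database` is a list of (embedding, label) pairs; in the described
    system this lookup would be served by a vector database instead.
    """
    ranked = sorted(
        database,
        key=lambda entry: cosine_similarity(query, entry[0]),
        reverse=True,
    )
    top_labels = [label for _, label in ranked[:k]]
    # On a tie, Counter keeps first-seen order, so the label of the
    # nearest neighbor wins.
    return Counter(top_labels).most_common(1)[0][0]


def ewc_penalty(params, old_params, importance, lam=1.0):
    """Standard EWC regularizer: (lam / 2) * sum_i F_i * (theta_i - theta_i*)^2.

    `importance` holds per-parameter weights (typically a diagonal Fisher
    information estimate); large values discourage changing that parameter.
    """
    return 0.5 * lam * sum(
        f * (p - p_old) ** 2
        for p, p_old, f in zip(params, old_params, importance)
    )
```

A possible usage: adding a new generator amounts to appending its labeled signatures to `database`, while `ewc_penalty` would be added to the training loss when the transformation modules are fine-tuned on the new data. An odd `k` avoids ties in the two-class (real versus fake) setting.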

🛡️ Threat Analysis

Output Integrity Attack

AI-generated image detection (deepfake detection): verifying content authenticity and detecting synthetic images produced by generative models.

Details

Domains

vision, generative

Model Types

gan, diffusion, cnn, transformer

Threat Tags

inference_time

Datasets

UniversalFakeDetect

Applications

2025, 0 citations
