DNA: Uncovering Universal Latent Forgery Knowledge

As generative AI achieves hyper-realism, superficial artifact detection has become obsolete. While prevailing methods rely on resource-intensive fine-tuning of black-box backbones, we posit that forgery detection capability is already encoded within pre-trained models and does not require end-to-end retraining. To elicit this intrinsic capability, we propose the discriminative neural anchors (DNA) framework, which employs a coarse-to-fine excavation mechanism. First, by analyzing feature decoupling and attention distribution shifts, we pinpoint the critical intermediate layers where the model's focus transitions from global semantics to local anomalies. Subsequently, we introduce a triadic fusion scoring metric paired with a curvature-truncation strategy to strip away semantic redundancy, precisely isolating the forgery-discriminative units (FDUs) inherently imprinted with sensitivity to forgery traces. Moreover, we introduce HIFI-Gen, a high-fidelity synthetic benchmark built upon the very latest models, to address the lag in existing datasets. Experiments demonstrate that by relying solely on these anchors, DNA achieves superior detection performance even under few-shot conditions. Furthermore, it exhibits remarkable robustness across diverse architectures and against unseen generative models, validating that waking up latent neurons is more effective than extensive fine-tuning.
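The coarse layer-localization step can be illustrated with a small sketch. The abstract does not say how the attention distribution shift is measured, so the code below assumes one plausible proxy: mean attention entropy per layer, with the transition placed at the largest drop between consecutive layers. The function names `attention_entropy` and `locate_transition_layer` and the input layout are hypothetical, not taken from the paper.

```python
import numpy as np

def attention_entropy(attn):
    """Mean Shannon entropy of attention rows.

    attn: array of shape (heads, queries, keys); each row sums to 1.
    High entropy = diffuse (global) attention, low entropy = focused (local).
    """
    p = np.clip(attn, 1e-12, 1.0)  # avoid log(0)
    return float(-(p * np.log(p)).sum(axis=-1).mean())

def locate_transition_layer(attn_per_layer):
    """Return (layer index, per-layer entropies).

    ASSUMPTION: the global-to-local transition is taken to be the layer
    that follows the largest entropy drop. This is an illustrative proxy,
    not the criterion used by the DNA authors.
    """
    ent = np.array([attention_entropy(a) for a in attn_per_layer])
    drops = ent[:-1] - ent[1:]           # positive when attention sharpens
    return int(np.argmax(drops)) + 1, ent

# Toy input: two layers with uniform attention, then two with one-hot attention.
diffuse = np.full((2, 4, 4), 0.25)
focused = np.tile(np.eye(4), (2, 1, 1))
layer, entropies = locate_transition_layer([diffuse, diffuse, focused, focused])
```

On the toy input, the first two layers attend uniformly and the last two attend to a single key, so the sketch places the transition at the third layer (index 2). Real attention maps would come from a forward pass through the backbone, and the paper's feature-decoupling analysis is not covered here.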

Key Contributions

DNA framework that uses coarse-to-fine excavation (layer localization + triadic fusion scoring) to isolate sparse forgery-discriminative units (FDUs) already latent in pre-trained vision models, eliminating the need for full fine-tuning
Triadic fusion scoring metric with curvature-truncation strategy integrating gradient sensitivity, activation magnitude, and weight contribution to precisely pinpoint forensically relevant neurons
HIFI-Gen benchmark built on the latest high-fidelity generative models to address the recency lag in existing forgery detection datasets
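A minimal sketch of the fine-grained step, scoring units and then truncating the ranked list, might look as follows. The summary names the three signals (gradient sensitivity, activation magnitude, weight contribution) but not how they are fused or how the curvature cutoff is computed. The code therefore assumes a product of min-max-normalized terms and a knee defined by the largest discrete second difference of the sorted score curve. All function names are hypothetical.

```python
import numpy as np

def minmax(x):
    """Rescale a vector to [0, 1]; a constant vector maps to zeros."""
    x = np.asarray(x, dtype=float)
    span = x.max() - x.min()
    return np.zeros_like(x) if span == 0 else (x - x.min()) / span

def triadic_score(grad_sens, act_mag, weight_contrib):
    """Fuse three per-unit signals into one score.

    ASSUMPTION: the fusion is a product of normalized terms, so a unit
    must rank well on all three signals. The paper's rule may differ.
    """
    return minmax(grad_sens) * minmax(act_mag) * minmax(weight_contrib)

def curvature_truncate(scores):
    """Keep the units ranked above the knee of the sorted score curve.

    ASSUMPTION: the knee is the point of maximum discrete second
    difference, i.e. where the descending curve flattens out.
    """
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores)[::-1]      # unit indices, best first
    s = scores[order]
    if len(s) < 3:
        return order
    curv = s[:-2] - 2 * s[1:-1] + s[2:]   # second difference at interior points
    knee = int(np.argmax(curv)) + 1       # position of the knee in sorted order
    return order[:knee]

# Toy example: three strong units (indices 1, 3, 5) among six.
toy_scores = np.array([0.05, 1.0, 0.1, 0.9, 0.0, 0.8])
kept = curvature_truncate(toy_scores)
fused = triadic_score([0, 1, 2], [0, 1, 2], [0, 2, 4])
```

In the toy example the score curve falls sharply after the third-ranked unit, so only units 1, 3, and 5 are retained as candidate FDUs. In practice the three input vectors would be gathered per neuron or channel from the layers selected in the coarse step.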

🛡️ Threat Analysis

Output Integrity Attack

The primary contribution is a novel AI-generated image detection framework (DNA) that identifies forgery-discriminative units inside pre-trained backbones. It directly addresses AI-generated content detection and output authenticity, a core ML09 concern.

Details

Domains

vision

Model Types

transformer, diffusion

Threat Tags

inference_time, digital

Datasets

HIFI-Gen

Applications

2026 0 cit.
