Transferable Dual-Domain Feature Importance Attack against AI-Generated Image Detector
Weiheng Zhu 1,2, Gang Cao 1,2, Jing Liu 3, Lifang Yu 4, Shaowei Weng 5
1 Communication University of China
2 State Key Laboratory of Media Convergence and Communication
3 Hunan University of Information Technology
Published on arXiv
2511.15571
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
DuFIA consistently outperforms existing traditional and intermediate-level adversarial attacks in black-box transfer settings across diverse AIGI detectors, including CNN-based and CLIP-based architectures.
DuFIA (Dual-domain Feature Importance Attack)
Novel technique introduced
Recent AI-generated image (AIGI) detectors achieve impressive accuracy under clean conditions. From an anti-forensics perspective, it is important to develop advanced adversarial attacks for evaluating the security of such detectors, a problem that remains insufficiently explored. This letter proposes a Dual-domain Feature Importance Attack (DuFIA) scheme to invalidate AIGI detectors to some extent. Forensically important features are captured by the spatially interpolated gradient and frequency-aware perturbation. Adversarial transferability is enhanced by jointly modeling spatial-domain and frequency-domain feature importance; the two importance maps are fused to guide optimization-based adversarial example generation. Extensive experiments across various AIGI detectors verify the cross-model transferability, transparency and robustness of DuFIA.
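The idea of aggregating feature importance in two domains and then fusing it can be illustrated on a toy problem. The Python sketch below is not the authors' implementation, and every name in it is hypothetical. It assumes a 1-D signal in place of an image, a toy detector made of one linear feature layer followed by ReLU (so gradients are analytic), spatial interpolation modeled as scaling the input toward a zero baseline, frequency-aware perturbation modeled as random multiplicative noise on DCT coefficients, and fusion modeled as a convex combination of L2-normalized importance vectors with weight `alpha`. The letter's actual formulas may differ on each of these points.

```python
import math
import random

def dct(x):
    """Orthonormal DCT-II of a 1-D list."""
    n = len(x)
    out = []
    for k in range(n):
        s = sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n) for i in range(n))
        out.append(s * (math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)))
    return out

def idct(c):
    """Inverse of the orthonormal DCT-II above."""
    n = len(c)
    out = []
    for i in range(n):
        s = c[0] * math.sqrt(1.0 / n)
        s += sum(c[k] * math.sqrt(2.0 / n) * math.cos(math.pi * (i + 0.5) * k / n)
                 for k in range(1, n))
        out.append(s)
    return out

def features(W, x):
    """Toy intermediate layer: f = W x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def feature_grad(v, f):
    """Gradient of the toy logit sum_j v_j * relu(f_j) w.r.t. each feature."""
    return [vj if fj > 0 else 0.0 for vj, fj in zip(v, f)]

def mean_grad(W, v, inputs):
    """Average the feature gradients over a set of transformed inputs."""
    grads = [feature_grad(v, features(W, x)) for x in inputs]
    return [sum(col) / len(grads) for col in zip(*grads)]

def normalize(g):
    """Scale a vector to unit L2 norm (left unchanged if it is all zeros)."""
    norm = math.sqrt(sum(a * a for a in g))
    return [a / norm for a in g] if norm > 0 else list(g)

def dual_domain_importance(W, v, x, steps=8, copies=8, sigma=0.2, alpha=0.5, seed=0):
    rng = random.Random(seed)
    # Spatial branch (assumption): inputs interpolated between a zero baseline and x.
    spatial_inputs = [[(t / steps) * xi for xi in x] for t in range(1, steps + 1)]
    # Frequency branch (assumption): random rescaling of each DCT coefficient.
    coeffs = dct(x)
    freq_inputs = [idct([c * rng.gauss(1.0, sigma) for c in coeffs])
                   for _ in range(copies)]
    spatial = normalize(mean_grad(W, v, spatial_inputs))
    freq = normalize(mean_grad(W, v, freq_inputs))
    # Fusion (assumption): convex combination of the two normalized vectors.
    return [alpha * s + (1 - alpha) * f for s, f in zip(spatial, freq)]

# Toy data, invented for illustration only.
W = [[1, 0, 0, 0], [0, 1, 0, 0], [-1, -1, 0, 0]]  # 3 features, 4 "pixels"
v = [1.0, -0.5, 2.0]                              # logit weights
x = [0.2, 0.8, 0.5, 0.1]                          # input signal
importance = dual_domain_importance(W, v, x)
print(importance)
```

Averaging over many transformed copies is what makes the importance estimate less specific to one input and, in feature-importance attacks generally, less specific to the surrogate model. In a real setting the analytic gradient would be replaced by automatic differentiation through a surrogate detector.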
Key Contributions
- Dual-domain feature importance attack (DuFIA) that jointly leverages spatial (interpolated gradient) and frequency-domain perturbations to generate transferable adversarial examples against AIGI detectors
- Domain-aware fusion mechanism that combines spatial and frequency feature importance maps to selectively emphasize semantically critical and cross-model-transferable features
- Demonstrated superior cross-model black-box transferability over existing traditional and intermediate-level attack (ILA) methods across multiple AIGI detector architectures
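Black-box transferability is commonly quantified as an attack success rate on detectors that were not used to craft the perturbation. The sketch below shows one plausible way to compute it; the letter's exact metric is not given here, so the definition is an assumption. It assumes every input is an AI-generated image, that label `1` means "AI-generated", and that only images the detector classified correctly before the attack are counted. The function name, detector names and predictions are hypothetical and invented for illustration.

```python
def attack_success_rate(clean_preds, adv_preds, fake_label=1):
    """Fraction of correctly detected AI-generated images that the detector
    labels as real once the adversarial perturbation is applied."""
    eligible = [(c, a) for c, a in zip(clean_preds, adv_preds) if c == fake_label]
    if not eligible:
        return 0.0  # nothing was detected before the attack, so nothing to fool
    fooled = sum(1 for _, a in eligible if a != fake_label)
    return fooled / len(eligible)

# Hypothetical predictions from two unseen target detectors (made-up values).
targets = {
    "cnn_detector": ([1, 1, 1, 0, 1], [0, 1, 0, 0, 0]),
    "clip_detector": ([1, 1, 1, 1, 1], [1, 1, 0, 1, 0]),
}
for name, (clean, adv) in targets.items():
    print(name, attack_success_rate(clean, adv))
```

Restricting the denominator to images detected before the attack keeps the metric from crediting the attack with errors the detector would have made anyway.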
🛡️ Threat Analysis
DuFIA crafts optimization-based adversarial perturbations using spatially interpolated gradients and frequency-aware perturbations to cause AIGI detectors to misclassify AI-generated images as real at inference time — a direct adversarial evasion attack. The core contribution is the novel dual-domain feature importance mechanism for generating transferable adversarial examples, which is the defining characteristic of ML01.
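An importance-guided evasion loop of the kind described above can be sketched as an iterative sign-gradient update under an L-infinity budget. This is a generic illustration, not the authors' algorithm. It assumes a toy linear feature layer `W` (so the gradient of the importance-weighted feature sum is analytic and constant), a precomputed importance vector, pixel values in [0, 1], and hypothetical parameters `eps`, `step` and `iters`.

```python
def sign(a):
    """Return -1, 0 or 1 according to the sign of a."""
    return (a > 0) - (a < 0)

def score(W, importance, x):
    """Importance-weighted sum of the toy features f = W x."""
    feats = [sum(w * xi for w, xi in zip(row, x)) for row in W]
    return sum(d * f for d, f in zip(importance, feats))

def importance_guided_attack(W, importance, x, eps=8 / 255, step=2 / 255, iters=10):
    # Gradient of the score w.r.t. each input element: sum_j importance_j * W[j][i].
    grad = [sum(importance[j] * W[j][i] for j in range(len(W)))
            for i in range(len(x))]
    adv = list(x)
    for _ in range(iters):
        # Step against the gradient to suppress features with positive importance.
        adv = [a - step * sign(g) for a, g in zip(adv, grad)]
        # Project back into the eps-ball around x and the valid pixel range.
        adv = [min(max(a, xi - eps, 0.0), xi + eps, 1.0) for a, xi in zip(adv, x)]
    return adv

# Toy data, invented for illustration only.
W = [[1, 0, 0, 0], [0, 1, 0, 0], [-1, -1, 0, 0]]
importance = [0.9, -0.4, 0.0]
x = [0.2, 0.8, 0.5, 0.1]
eps = 8 / 255
adv = importance_guided_attack(W, importance, x, eps=eps)
print(score(W, importance, x), score(W, importance, adv))
```

With a real detector the gradient changes at every step and would be recomputed inside the loop with automatic differentiation. The projection step is what keeps the perturbation visually small.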