Frequency Bias Matters: Diving into Robust and Generalized Deep Image Forgery Detection
Chi Liu 1, Tianqing Zhu 1, Wanlei Zhou 1, Wei Zhao 2
Published on arXiv
2511.19886
Output Integrity Attack
OWASP ML Top 10 — ML09
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
A single frequency alignment method simultaneously serves as a transferable black-box attack that evades multiple GAN forgery detectors and as a universal defense that improves detector generalizability and robustness, demonstrated across 12 detectors and 8 GAN-based forgery models.
Frequency Alignment
Novel technique introduced
As deep image forgery powered by AI generative models, such as GANs, continues to challenge today's digital world, detecting AI-generated forgeries has become a vital security topic. Generalizability and robustness are two critical concerns of a forgery detector, determining its reliability when facing unknown GANs and noisy samples in an open world. Although many studies focus on improving these two properties, the root causes of these problems have not been fully explored, and it is unclear if there is a connection between them. Moreover, despite recent achievements in addressing these issues from image forensic or anti-forensic aspects, a universal method that can contribute to both sides simultaneously remains practically significant yet unavailable. In this paper, we provide a fundamental explanation of these problems from a frequency perspective. Our analysis reveals that the frequency bias of a DNN forgery detector is a possible cause of generalization and robustness issues. Based on this finding, we propose a two-step frequency alignment method to remove the frequency discrepancy between real and fake images, offering double-sided benefits: it can serve as a strong black-box attack against forgery detectors in the anti-forensic context or, conversely, as a universal defense to improve detector reliability in the forensic context. We also develop corresponding attack and defense implementations and demonstrate their effectiveness, as well as the effect of the frequency alignment method, in various experimental settings involving twelve detectors, eight forgery models, and five metrics.
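The paper does not spell out the two alignment steps in this summary, but the core idea, removing the frequency discrepancy between real and fake images, can be sketched with a simple spectral-matching procedure: estimate a reference magnitude spectrum from real images, then swap a fake image's high-frequency magnitudes for that reference while keeping its phase. This is a minimal illustrative sketch, not the authors' implementation; the `cutoff` parameter and the radial high-frequency mask are assumptions.

```python
import numpy as np

def avg_real_spectrum(real_images):
    """Step 1 (sketch): estimate a reference magnitude spectrum
    by averaging the FFT magnitudes of real images."""
    mags = [np.abs(np.fft.fft2(img)) for img in real_images]
    return np.mean(mags, axis=0)

def frequency_align(fake, ref_mag, cutoff=0.25):
    """Step 2 (sketch): replace the fake image's high-frequency
    magnitudes with the real reference, keeping the fake's phase,
    so the tell-tale spectral artifacts are suppressed."""
    F = np.fft.fft2(fake)
    mag, phase = np.abs(F), np.angle(F)
    h, w = fake.shape
    # Radial mask selecting high frequencies in the centered spectrum.
    yy, xx = np.ogrid[:h, :w]
    radius = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    hi = np.fft.ifftshift(radius > cutoff * min(h, w))
    aligned_mag = np.where(hi, ref_mag, mag)
    aligned = np.fft.ifft2(aligned_mag * np.exp(1j * phase)).real
    return np.clip(aligned, 0.0, 1.0)
```

In the anti-forensic role this alignment is applied to fake images before submitting them to a detector; in the forensic role the same operation can normalize training data so a detector cannot over-fit to GAN-specific spectral artifacts.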
Key Contributions
- Identifies DNN frequency bias — over-reliance on high-frequency spectral artifacts — as the shared root cause of both generalization failure (unknown GANs) and robustness failure (noisy/adversarial samples) in image forgery detectors
- Proposes a two-step frequency alignment method that removes frequency discrepancy between real and fake images, functioning as a dual-use tool: a strong black-box anti-forensic attack or a universal forensic defense
- Demonstrates effectiveness across 12 detectors, 8 forgery models, and 5 metrics, covering both attack and defense scenarios
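The frequency-bias claim in the first contribution can be probed with a simple diagnostic: if a detector's accuracy collapses once high-frequency content is removed, it is relying on high-frequency spectral artifacts rather than semantic cues. The sketch below, with an assumed radial `cutoff` and a hypothetical `detector` callable, is one way to measure that reliance; it is not from the paper.

```python
import numpy as np

def low_pass(img, cutoff=0.1):
    """Zero out frequency components beyond a radial cutoff."""
    F = np.fft.fftshift(np.fft.fft2(img))
    h, w = img.shape
    yy, xx = np.ogrid[:h, :w]
    radius = np.sqrt((yy - h // 2) ** 2 + (xx - w // 2) ** 2)
    F[radius > cutoff * min(h, w)] = 0
    return np.fft.ifft2(np.fft.ifftshift(F)).real

def frequency_bias_score(detector, images, labels, cutoff=0.1):
    """Accuracy drop after low-pass filtering. A large positive score
    suggests the detector depends on high-frequency artifacts."""
    acc_full = np.mean([detector(x) == y for x, y in zip(images, labels)])
    acc_lp = np.mean([detector(low_pass(x, cutoff)) == y
                      for x, y in zip(images, labels)])
    return acc_full - acc_lp
```

A detector with a high bias score is exactly the kind the paper identifies as fragile: it generalizes poorly to unseen GANs (whose high-frequency fingerprints differ) and breaks under noise or compression (which perturbs the same band).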
🛡️ Threat Analysis
The anti-forensic component is a black-box adversarial evasion attack: it crafts frequency-aligned fake images that cause forgery detectors to misclassify them as real at inference time, satisfying the definition of an adversarial input manipulation attack.
The core contribution sits within AI-generated content detection: the paper analyzes the root causes of failure in GAN forgery detectors and proposes defenses that improve their generalizability and robustness in the forensic direction.