defense 2026

SimLBR: Learning to Detect Fake Images by Learning to Detect Real Images

Aayush Dhakal 1, Subash Khanal 1, Srikumar Sastry 1, Jacob Arndt 2, Philipe Ambrozio Dias 2, Dalton Lunga 1, Nathan Jacobs 2

0 citations · 52 references · arXiv (Cornell University)


Published on arXiv · 2602.20412

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

SimLBR achieves up to +24.85% accuracy and +69.62% recall improvement over state-of-the-art detectors on the Chameleon hard benchmark under strong distribution shift.

SimLBR (Latent Blending Regularization)

Novel technique introduced


The rapid advancement of generative models has made the detection of AI-generated images a critical challenge for both research and society. Recent works have shown that most state-of-the-art fake image detection methods overfit to their training data and catastrophically fail when evaluated on curated hard test sets with strong distribution shifts. In this work, we argue that it is more principled to learn a tight decision boundary around the real image distribution and treat the fake category as a sink class. To this end, we propose SimLBR, a simple and efficient framework for fake image detection using Latent Blending Regularization (LBR). Our method significantly improves cross-generator generalization, achieving up to +24.85% accuracy and +69.62% recall on the challenging Chameleon benchmark. SimLBR is also highly efficient, training orders of magnitude faster than existing approaches. Furthermore, we emphasize the need for reliability-oriented evaluation in fake image detection, introducing risk-adjusted metrics and worst-case estimates to better assess model robustness. All code and models will be released on HuggingFace and GitHub.
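The summary does not spell out how Latent Blending Regularization works, but the name and the "tight boundary around real images, fakes as a sink class" framing suggest one plausible reading: synthesize near-boundary negatives by convexly blending the latents of real images, then train the detector to push those blends into the sink class. The sketch below illustrates that assumed mechanism only; `latent_blend` and its `alpha_range` parameter are hypothetical names, not the paper's actual API.

```python
import numpy as np

rng = np.random.default_rng(0)

def latent_blend(z_real, alpha_range=(0.2, 0.8), rng=rng):
    """Hypothetical sketch of Latent Blending Regularization (assumed reading).

    Blends each real-image latent with a randomly permuted partner latent,
    producing convex combinations that sit near the boundary of the real
    distribution. These blends would be labeled as the 'sink' (fake) class,
    encouraging a tight decision boundary around real images.
    """
    n = z_real.shape[0]
    perm = rng.permutation(n)                      # random pairing of real latents
    alpha = rng.uniform(*alpha_range, size=(n, 1)) # per-sample blend coefficient
    return alpha * z_real + (1.0 - alpha) * z_real[perm]
```

During training, real latents would keep the "real" label while blended latents receive the sink label; because the blends are built from real data alone, no generator-specific artifacts are learned, which is consistent with the cross-generator generalization the paper claims.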


Key Contributions

  • SimLBR framework using Latent Blending Regularization that models a tight decision boundary around real images, treating fakes as a sink class for improved cross-generator generalization
  • Achieves up to +24.85% accuracy and +69.62% recall on the challenging Chameleon benchmark with orders-of-magnitude faster training than prior methods
  • Introduces risk-adjusted metrics and worst-case estimates for reliability-oriented evaluation of fake image detectors under distribution shift
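The paper's exact risk-adjusted metrics are not described in this summary, but a "worst-case estimate" under distribution shift is commonly computed as the minimum of a metric over subgroups (here, over source generators). The following is a minimal sketch under that assumption; `worst_case_recall` and the group labels are illustrative, not the paper's definitions.

```python
import numpy as np

def worst_case_recall(y_true, y_pred, groups):
    """Minimum recall on the fake class (label 1) across generator subgroups.

    A hedged sketch of a worst-case estimate: instead of averaging recall
    over all fakes, report the weakest per-generator recall, which better
    reflects robustness under distribution shift.
    """
    per_group = {}
    for g in np.unique(groups):
        mask = (groups == g) & (y_true == 1)   # fakes produced by generator g
        if mask.sum() == 0:
            continue                           # skip groups with no fakes
        per_group[g] = float((y_pred[mask] == 1).mean())
    return min(per_group.values()), per_group
```

A detector with 0.95 average recall but 0.30 recall on one diffusion model would score 0.30 here, surfacing exactly the failure mode the Chameleon benchmark is designed to expose.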

🛡️ Threat Analysis

Output Integrity Attack

Directly addresses AI-generated content detection — specifically detecting fake images produced by generative models (GANs, diffusion models). The paper's primary contribution is a detection system (SimLBR) and reliability-oriented evaluation metrics for verifying content authenticity, which is the canonical ML09 use case.


Details

Domains
vision, generative
Model Types
diffusion, gan, cnn, transformer
Threat Tags
inference_time, digital
Datasets
Chameleon
Applications
ai-generated image detection, deepfake detection, content authenticity verification