Defense · 2025

Leveraging Failed Samples: A Few-Shot and Training-Free Framework for Generalized Deepfake Detection

Shibo Yao¹, Renshuai Tao¹, Xiaolong Zheng², Chao Liang³, Chunjie Zhang¹



Published on arXiv: 2508.09475

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

Achieves an average improvement of 8.7% over existing SOTA deepfake detection methods when using only one fake sample from the evaluation set, with no training or parameter updates

FTNet (Few-shot Training-free Network)

Novel technique introduced


Abstract

Recent deepfake detection studies often treat unseen-sample detection as a "zero-shot" task: training on images generated by known models and generalizing to unknown ones. A key real-world challenge arises when a model performs poorly on unknown samples, yet those samples remain available for analysis. This suggests the problem should instead be approached as a "few-shot" task, where effectively utilizing a small number of samples can yield significant improvement. Unlike typical few-shot tasks focused on semantic understanding, deepfake detection prioritizes image realism, which closely mirrors real-world distributions. In this work, we propose the Few-shot Training-free Network (FTNet) for real-world few-shot deepfake detection. Simple yet effective, FTNet differs from traditional methods that rely on large-scale known data for training. Instead, FTNet uses only one fake sample from an evaluation set, mimicking the scenario where new samples emerge in the real world and can be gathered for use, without any training or parameter updates. During evaluation, each test sample is compared to the known fake and real samples and is classified according to the category of the nearest sample. We conduct a comprehensive analysis of AI-generated images from 29 different generative models and achieve new SOTA performance, with an average improvement of 8.7% over existing methods. This work introduces a fresh perspective on real-world deepfake detection: when the model struggles to generalize on a few-shot sample, leveraging the failed samples leads to better performance.


Key Contributions

  • FTNet: a training-free, few-shot deepfake detection framework requiring only one fake sample from the evaluation set — no parameter updates needed
  • Nearest-neighbor classification paradigm that compares each test sample to known real and fake exemplars at inference time
  • Comprehensive evaluation across images from 29 generative models achieving 8.7% average improvement over existing state-of-the-art methods
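The nearest-neighbor decision rule described in the contributions above can be sketched in a few lines. This is an illustrative reconstruction, not the paper's code: the feature extractor is assumed to exist upstream (the backbone used by FTNet is not specified here), and `nearest_exemplar_classify`, its cosine-similarity metric, and all inputs are hypothetical names for the general pattern of training-free exemplar matching.

```python
import numpy as np

def nearest_exemplar_classify(test_feats, real_feats, fake_feats):
    """Label each test feature by the class of its nearest exemplar
    under cosine similarity — no training or parameter updates.

    All arguments are 2-D arrays of shape (n_samples, feat_dim),
    assumed to come from some frozen feature extractor.
    Returns an array of labels: 0 = real, 1 = fake.
    """
    def l2_normalize(x):
        return x / np.linalg.norm(x, axis=1, keepdims=True)

    test_n = l2_normalize(np.asarray(test_feats, dtype=float))
    exemplars = l2_normalize(
        np.vstack([real_feats, fake_feats]).astype(float)
    )
    # One label per exemplar row: real exemplars first, then fake.
    labels = np.array([0] * len(real_feats) + [1] * len(fake_feats))
    # Cosine similarity between every test sample and every exemplar.
    sims = test_n @ exemplars.T
    # Each test sample takes the label of its most similar exemplar.
    return labels[np.argmax(sims, axis=1)]
```

In the paper's few-shot setting the fake exemplar set can be as small as a single sample drawn from the evaluation distribution, which is what makes the approach training-free: adapting to a new generator only requires adding its failed samples to the exemplar pool.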

🛡️ Threat Analysis

Output Integrity Attack

Proposes a novel detection framework (FTNet) for identifying AI-generated images (deepfakes from GANs and diffusion models), directly addressing output integrity and content authenticity — the paper's entire contribution is a new AI-generated content detection method.


Details

Domains
vision, generative
Model Types
GAN, diffusion
Threat Tags
black_box, inference_time
Datasets
AI-generated images from 29 generative models
Applications
deepfake detection, AI-generated image detection