Handcrafted Feature Fusion for Reliable Detection of AI-Generated Images

The rapid progress of generative models has enabled the creation of highly realistic synthetic images, raising concerns about authenticity and trust in digital media. Detecting such fake content reliably is an urgent challenge. While deep learning approaches dominate the current literature, handcrafted features remain attractive for their interpretability, efficiency, and generalizability. In this paper, we conduct a systematic evaluation of handcrafted descriptors, including raw pixels, color histograms, Discrete Cosine Transform (DCT), Histogram of Oriented Gradients (HOG), Local Binary Patterns (LBP), Gray-Level Co-occurrence Matrix (GLCM), and wavelet features, on the CIFAKE dataset of real versus synthetic images. Using 50,000 training and 10,000 test samples, we benchmark seven classifiers ranging from Logistic Regression to advanced gradient-boosted ensembles (LightGBM, XGBoost, CatBoost). Results demonstrate that LightGBM consistently outperforms the alternatives, achieving a PR-AUC of 0.9879, a ROC-AUC of 0.9878, an F1 of 0.9447, and a Brier score of 0.0414 with mixed features, a strong gain in both calibration and discrimination over simpler descriptors. Across three configurations (baseline, advanced, mixed), performance improves monotonically, confirming that combining diverse handcrafted features yields substantial benefit. These findings highlight the continued relevance of carefully engineered features and ensemble learning for detecting synthetic images, particularly in contexts where interpretability and computational efficiency are critical.
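To make the idea of feature fusion concrete, the following is a minimal NumPy-only sketch that extracts three of the descriptor families named in the abstract (color histograms, DCT coefficients, and LBP codes) from a single 32x32 RGB image and concatenates them into one vector. It is not the authors' pipeline: the function names, the bin count, the number of retained DCT coefficients, the 8-neighbor LBP variant, and the random stand-in image are all assumptions chosen for illustration.

```python
import numpy as np

def color_histogram(img, bins=8):
    """Per-channel intensity histogram, each channel normalized to sum to 1."""
    feats = []
    for c in range(img.shape[2]):
        hist, _ = np.histogram(img[:, :, c], bins=bins, range=(0, 256))
        feats.append(hist / hist.sum())
    return np.concatenate(feats)

def dct_matrix(n):
    """Orthonormal DCT-II basis matrix of size n x n."""
    k = np.arange(n)[:, None]
    i = np.arange(n)[None, :]
    m = np.cos(np.pi * (2 * i + 1) * k / (2 * n)) * np.sqrt(2.0 / n)
    m[0, :] /= np.sqrt(2.0)
    return m

def dct_features(gray, keep=8):
    """Log-magnitude of the lowest keep x keep 2-D DCT coefficients."""
    h, w = gray.shape
    coeffs = dct_matrix(h) @ gray @ dct_matrix(w).T
    return np.log1p(np.abs(coeffs[:keep, :keep])).ravel()

def lbp_histogram(gray):
    """Histogram of basic 8-neighbor LBP codes over interior pixels."""
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    codes = np.zeros(center.shape, dtype=np.int32)
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1),
               (1, 1), (1, 0), (1, -1), (0, -1)]
    for bit, (dy, dx) in enumerate(offsets):
        neighbor = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbor >= center).astype(np.int32) << bit
    hist = np.bincount(codes.ravel(), minlength=256)
    return hist / hist.sum()

def fuse_features(img):
    """Concatenate the three descriptors into a single feature vector."""
    gray = img.astype(np.float64).mean(axis=2)
    return np.concatenate([
        color_histogram(img),
        dct_features(gray),
        lbp_histogram(gray),
    ])

# Random stand-in for a 32x32 RGB image (hypothetical input, not CIFAKE data).
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
vec = fuse_features(img)
print(vec.shape)
```

With these assumed settings the fused vector has 24 + 64 + 256 = 344 entries. In a pipeline like the one the abstract describes, one such vector per image would be stacked into a matrix and passed to a classifier such as a gradient-boosted ensemble.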

Key Contributions

Benchmarks seven handcrafted feature descriptors (raw pixels, color histograms, DCT, HOG, LBP, GLCM, wavelets) systematically across three configurations (baseline, advanced, mixed) on the CIFAKE dataset
Demonstrates that fusing diverse handcrafted features monotonically improves detection, with LightGBM achieving PR-AUC 0.9879 and F1 0.9447
Highlights interpretability and computational efficiency advantages of handcrafted feature pipelines over deep learning approaches for synthetic image detection
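Two of the reported metrics, the F1 score (discrimination at a fixed threshold) and the Brier score (calibration of predicted probabilities), are simple enough to compute by hand. The sketch below shows both on a toy set of labels and probabilities; the 0.5 decision threshold, the function names, and the toy values are assumptions for illustration and are not taken from the paper.

```python
import numpy as np

def brier_score(y_true, p):
    """Mean squared error between predicted probabilities and 0/1 labels.
    Lower is better; 0 means perfectly confident and correct."""
    y_true = np.asarray(y_true, dtype=float)
    p = np.asarray(p, dtype=float)
    return float(np.mean((p - y_true) ** 2))

def f1_score(y_true, p, threshold=0.5):
    """Harmonic mean of precision and recall after thresholding p.
    The 0.5 threshold is an assumption, not a value from the paper."""
    y = np.asarray(y_true).astype(bool)
    pred = np.asarray(p) >= threshold
    tp = np.sum(pred & y)
    fp = np.sum(pred & ~y)
    fn = np.sum(~pred & y)
    denom = 2 * tp + fp + fn
    return float(2 * tp / denom) if denom else 0.0

# Toy labels (1 = synthetic, 0 = real) and predicted probabilities.
y = [1, 0, 1, 1, 0]
p = [0.9, 0.2, 0.6, 0.4, 0.1]

print(brier_score(y, p))  # squared errors 0.01, 0.04, 0.16, 0.36, 0.01 average to about 0.116
print(f1_score(y, p))     # tp=2, fp=0, fn=1 gives 0.8
```

The two metrics answer different questions: a model can rank images well (high F1 or AUC) while still producing poorly calibrated probabilities, which is why the Brier score is reported alongside the discrimination metrics.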

Threat Analysis

Output Integrity Attack

Directly addresses AI-generated image detection (synthetic image forensics), a canonical ML09 problem. Evaluates spectral and texture handcrafted descriptors specifically chosen to expose generative model artifacts, which constitutes forensic technique analysis rather than mere domain application of an off-the-shelf detector.

Details

Domains

vision

Model Types

traditional_ml, gan, diffusion

Threat Tags

inference_time

Datasets

CIFAKE

Applications

2025 1 cit.
