benchmark 2026

Handcrafted Feature Fusion for Reliable Detection of AI-Generated Images

Syed Mehedi Hasan Nirob 1, Moqsadur Rahman 1, Shamim Ehsan 2,1, Summit Haque 1

0 citations · 20 references · arXiv


Published on arXiv

2601.19262

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

LightGBM with mixed handcrafted features achieves PR-AUC 0.9879, ROC-AUC 0.9878, and F1 0.9447 on CIFAKE, outperforming all simpler descriptor configurations.

Handcrafted Feature Fusion

Novel technique introduced


The rapid progress of generative models has enabled the creation of highly realistic synthetic images, raising concerns about authenticity and trust in digital media. Detecting such fake content reliably is an urgent challenge. While deep learning approaches dominate current literature, handcrafted features remain attractive for their interpretability, efficiency, and generalizability. In this paper, we conduct a systematic evaluation of handcrafted descriptors, including raw pixels, color histograms, Discrete Cosine Transform (DCT), Histogram of Oriented Gradients (HOG), Local Binary Patterns (LBP), Gray-Level Co-occurrence Matrix (GLCM), and wavelet features, on the CIFAKE dataset of real versus synthetic images. Using 50,000 training and 10,000 test samples, we benchmark seven classifiers ranging from Logistic Regression to advanced gradient-boosted ensembles (LightGBM, XGBoost, CatBoost). Results demonstrate that LightGBM consistently outperforms alternatives, achieving PR-AUC 0.9879, ROC-AUC 0.9878, F1 0.9447, and a Brier score of 0.0414 with mixed features, representing strong gains in calibration and discrimination over simpler descriptors. Across three configurations (baseline, advanced, mixed), performance improves monotonically, confirming that combining diverse handcrafted features yields substantial benefit. These findings highlight the continued relevance of carefully engineered features and ensemble learning for detecting synthetic images, particularly in contexts where interpretability and computational efficiency are critical.
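The abstract reports both discrimination metrics (ROC-AUC, F1) and a calibration metric (Brier score). As a rough illustration of what these numbers measure, the sketch below computes three of them in plain Python on a small made-up set of labels and predicted probabilities. The toy data, the 0.5 decision threshold, and the choice of label 1 as the positive class are assumptions made for illustration; they are not taken from the paper, and PR-AUC is omitted for brevity.

```python
def brier_score(y_true, y_prob):
    """Mean squared gap between predicted probability and the 0/1 label (lower is better)."""
    return sum((p - y) ** 2 for y, p in zip(y_true, y_prob)) / len(y_true)


def f1_at_threshold(y_true, y_prob, threshold=0.5):
    """F1 of the positive class after thresholding the probabilities."""
    tp = fp = fn = 0
    for y, p in zip(y_true, y_prob):
        pred = 1 if p >= threshold else 0
        if pred == 1 and y == 1:
            tp += 1
        elif pred == 1 and y == 0:
            fp += 1
        elif pred == 0 and y == 1:
            fn += 1
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def roc_auc(y_true, y_prob):
    """Chance that a random positive outscores a random negative (ties count half)."""
    pos = [p for y, p in zip(y_true, y_prob) if y == 1]
    neg = [p for y, p in zip(y_true, y_prob) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: six samples, label 1 assumed to be the positive class.
y_true = [1, 1, 0, 0, 1, 0]
y_prob = [0.9, 0.7, 0.4, 0.2, 0.3, 0.6]

print("Brier score:", round(brier_score(y_true, y_prob), 4))
print("F1 @ 0.5:   ", round(f1_at_threshold(y_true, y_prob), 4))
print("ROC-AUC:    ", round(roc_auc(y_true, y_prob), 4))
```

A model can rank samples well (high ROC-AUC) and still be poorly calibrated (high Brier score), which is why the two kinds of metric are reported together.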


Key Contributions

  • Systematic benchmark of seven handcrafted feature descriptors (raw pixels, color histograms, DCT, HOG, LBP, GLCM, wavelets) across three configuration settings on the CIFAKE dataset
  • Demonstrates that fusing diverse handcrafted features monotonically improves detection, with LightGBM achieving PR-AUC 0.9879 and F1 0.9447
  • Highlights interpretability and computational efficiency advantages of handcrafted feature pipelines over deep learning approaches for synthetic image detection
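The feature fusion described in the contributions above amounts to computing several descriptors per image and concatenating them into one vector for the classifier. The sketch below illustrates this with three simplified NumPy descriptors: a per-channel color histogram, a basic 8-neighbor LBP histogram, and one-level Haar wavelet sub-band energies. The function names, bin counts, 32×32 RGB input size, and the random test image are all assumptions for illustration. The paper's actual descriptor parameters and implementations are not given in this summary, and HOG, DCT, and GLCM are left out to keep the example short.

```python
import numpy as np


def color_histogram(img, bins=8):
    """Per-channel intensity histogram, each channel normalised to sum to 1."""
    feats = []
    for c in range(img.shape[2]):
        hist, _ = np.histogram(img[:, :, c], bins=bins, range=(0, 256))
        feats.append(hist / hist.sum())
    return np.concatenate(feats)


def lbp_histogram(gray):
    """Histogram of basic 8-neighbour LBP codes (256 bins), interior pixels only."""
    h, w = gray.shape
    center = gray[1:-1, 1:-1]
    offsets = [(-1, -1), (-1, 0), (-1, 1), (0, 1), (1, 1), (1, 0), (1, -1), (0, -1)]
    codes = np.zeros(center.shape, dtype=np.int64)
    for bit, (dy, dx) in enumerate(offsets):
        # Shifted view of the image aligned with the interior pixels.
        neighbour = gray[1 + dy:h - 1 + dy, 1 + dx:w - 1 + dx]
        codes |= (neighbour >= center).astype(np.int64) << bit
    hist = np.bincount(codes.ravel(), minlength=256)
    return hist / hist.sum()


def haar_energies(gray):
    """Mean energy of the four sub-bands of a one-level 2D Haar transform.

    Assumes even height and width.
    """
    g = gray.astype(np.float64)
    a, b = g[0::2, 0::2], g[0::2, 1::2]
    c, d = g[1::2, 0::2], g[1::2, 1::2]
    ll = (a + b + c + d) / 2
    lh = (a - b + c - d) / 2
    hl = (a + b - c - d) / 2
    hh = (a - b - c + d) / 2
    return np.array([np.mean(band ** 2) for band in (ll, lh, hl, hh)])


def fused_features(img):
    """Concatenate all descriptors into a single feature vector."""
    gray = img.mean(axis=2)
    return np.concatenate(
        [color_histogram(img), lbp_histogram(gray), haar_energies(gray)]
    )


# Hypothetical stand-in for one 32x32 RGB image.
rng = np.random.default_rng(0)
image = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
vector = fused_features(image)
print(vector.shape)  # 3*8 colour bins + 256 LBP bins + 4 wavelet energies
```

Stacking such vectors row by row gives a feature matrix that can be passed to a tabular classifier such as a gradient-boosted ensemble. In practice the descriptors have very different scales, so tree-based models, which are insensitive to feature scaling, are a natural fit.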

🛡️ Threat Analysis

Output Integrity Attack

Directly addresses AI-generated image detection (synthetic image forensics), a canonical ML09 problem. Evaluates spectral and texture handcrafted descriptors specifically chosen to expose generative model artifacts, which constitutes forensic technique analysis rather than mere domain application of an off-the-shelf detector.


Details

Domains
vision
Model Types
traditional_ml, gan, diffusion
Threat Tags
inference_time
Datasets
CIFAKE
Applications
ai-generated image detection, deepfake detection, digital media forensics