
Adversarial Vulnerability Transcends Computational Paradigms: Feature Engineering Provides No Defense Against Neural Adversarial Transfer

Achraf Hsain, Ahmed Abdelkader, Emmanuel Baldwin Mbaya, Hamoud Aljamaan

0 citations · 19 references · arXiv

Published on arXiv · 2601.21323

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

All HOG-based classifiers suffer 16.6–59.1% relative accuracy drops under neural adversarial transfer, with FGSM outperforming iterative PGD in 100% of classical ML cases due to PGD overfitting to surrogate-specific features that don't survive HOG extraction.


Deep neural networks are vulnerable to adversarial examples: inputs with imperceptible perturbations that cause misclassification. While adversarial transfer within neural networks is well documented, whether classical ML pipelines built on handcrafted features inherit this vulnerability when attacked via neural surrogates has remained unexplored. Feature engineering creates information bottlenecks through gradient quantization and spatial binning, potentially filtering out high-frequency adversarial signals. We evaluate this hypothesis through the first comprehensive study of adversarial transfer from DNNs to HOG-based classifiers. Using VGG16 as a surrogate, we generate FGSM and PGD adversarial examples and test transfer to four classical classifiers (KNN, Decision Tree, Linear SVM, Kernel SVM) and a shallow neural network across eight HOG configurations on CIFAR-10. Our results strongly refute the protective hypothesis: all classifiers suffer 16.6–59.1% relative accuracy drops, comparable to neural-to-neural transfer. More surprisingly, we discover an attack hierarchy reversal: contrary to the usual pattern in which iterative PGD dominates FGSM within neural networks, FGSM causes greater degradation than PGD in 100% of classical ML cases, suggesting that iterative attacks overfit to surrogate-specific features that do not survive feature extraction. Block normalization provides partial but insufficient mitigation. These findings demonstrate that adversarial vulnerability is not an artifact of end-to-end differentiability but a fundamental property of image classification systems, with implications for security-critical deployments across computational paradigms.
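The two attacks compared in the abstract can be sketched in a few lines. This is a minimal NumPy illustration of the L∞-bounded update rules only, not the paper's implementation: the real study computes gradients through VGG16, whereas here `grad` / `grad_fn` stand in for a surrogate's input gradient, and the parameter names are illustrative.

```python
import numpy as np

def fgsm_perturb(x, grad, eps):
    """One-step FGSM: move each pixel eps in the sign direction of the
    surrogate's loss gradient, then clip back to the valid [0, 1] range."""
    x_adv = x + eps * np.sign(grad)
    return np.clip(x_adv, 0.0, 1.0)

def pgd_perturb(x, grad_fn, eps, alpha, steps):
    """Iterative PGD: repeated small FGSM-style steps of size alpha,
    projected back into the L-inf eps-ball around the clean input."""
    x_adv = x.copy()
    for _ in range(steps):
        x_adv = x_adv + alpha * np.sign(grad_fn(x_adv))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project to eps-ball
        x_adv = np.clip(x_adv, 0.0, 1.0)          # stay in pixel range
    return x_adv
```

PGD's extra iterations let it follow the surrogate's loss surface more closely, which is exactly why, per the paper's reversal finding, its perturbations can become surrogate-specific and transfer worse through HOG extraction than single-step FGSM.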


Key Contributions

  • First systematic evaluation of L∞-bounded adversarial transfer from CNNs (VGG16) to HOG-based classifiers across four classical ML models and eight HOG configurations, showing 16.6–59.1% relative accuracy drops
  • Discovery of 'attack hierarchy reversal': FGSM causes greater degradation than PGD in 100% of classical ML transfer cases, the inverse of patterns observed in neural-to-neural transfer
  • Systematic HOG parameter sensitivity analysis (cell size, orientation bins, block normalization) showing block normalization offers partial but insufficient mitigation against transferred perturbations
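The HOG parameters varied above (cell size, orientation bins, block normalization) correspond to three concrete stages of the descriptor. The sketch below is a minimal NumPy rendition of those stages, assuming a grayscale image and L2 block normalization; it is illustrative only and not the configuration or library used in the paper.

```python
import numpy as np

def hog_descriptor(img, cell=8, bins=9, block=2, eps=1e-6):
    """Minimal HOG sketch: gradient quantization into orientation bins,
    spatial binning over cells, then L2 normalization over overlapping
    blocks. `img` is 2-D grayscale with sides divisible by `cell`."""
    gy, gx = np.gradient(img.astype(float))
    mag = np.hypot(gx, gy)
    ang = np.rad2deg(np.arctan2(gy, gx)) % 180.0        # unsigned orientation
    bin_idx = np.minimum((ang / (180.0 / bins)).astype(int), bins - 1)

    ch, cw = img.shape[0] // cell, img.shape[1] // cell
    hist = np.zeros((ch, cw, bins))
    for i in range(ch):                                  # spatial binning
        for j in range(cw):
            sl = np.s_[i*cell:(i+1)*cell, j*cell:(j+1)*cell]
            np.add.at(hist[i, j], bin_idx[sl].ravel(), mag[sl].ravel())

    feats = []
    for i in range(ch - block + 1):                      # block normalization
        for j in range(cw - block + 1):
            v = hist[i:i+block, j:j+block].ravel()
            feats.append(v / np.sqrt(np.sum(v**2) + eps**2))
    return np.concatenate(feats)
```

Quantizing orientations and pooling over cells discards fine-grained pixel detail, which is the "information bottleneck" the paper hypothesized might filter adversarial signal; block normalization is the one stage the results show offers even partial mitigation.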

🛡️ Threat Analysis

Input Manipulation Attack

The paper studies the transferability of gradient-based adversarial examples (FGSM and PGD, generated on VGG16) to HOG-based classical ML classifiers at inference time. This is squarely an input manipulation / adversarial transfer study, with the novel finding that feature engineering provides no defense.


Details

Domains: vision
Model Types: cnn, traditional_ml
Threat Tags: white_box, black_box, inference_time, untargeted, digital
Datasets: CIFAR-10
Applications: image classification