
TriQDef: Disrupting Semantic and Gradient Alignment to Prevent Adversarial Patch Transferability in Quantized Neural Networks

Amira Guesmi 1, Bassem Ouni 2, Muhammad Shafique 1



Published on arXiv (2508.12132)

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

TriQDef reduces adversarial patch Attack Success Rate by over 40% on unseen patch and quantization combinations while preserving clean accuracy.

TriQDef

Novel technique introduced


Quantized Neural Networks (QNNs) are increasingly deployed in edge and resource-constrained environments due to their efficiency in computation and memory usage. While quantization has been shown to distort the gradient landscape and weaken conventional pixel-level attacks, it provides limited robustness against patch-based adversarial attacks: localized, high-saliency perturbations that remain surprisingly transferable across bit-widths. Existing defenses either overfit to fixed quantization settings or fail to address this cross-bit generalization vulnerability. We introduce TriQDef, a tri-level quantization-aware defense framework designed to disrupt the transferability of patch-based adversarial attacks across QNNs. TriQDef consists of: (1) a Feature Disalignment Penalty (FDP) that enforces semantic inconsistency by penalizing perceptual similarity in intermediate representations; (2) a Gradient Perceptual Dissonance Penalty (GPDP) that explicitly misaligns input gradients across bit-widths by minimizing structural and directional agreement via Edge IoU and HOG Cosine metrics; and (3) a Joint Quantization-Aware Training Protocol that unifies these penalties within a shared-weight training scheme across multiple quantization levels. Extensive experiments on CIFAR-10 and ImageNet demonstrate that TriQDef reduces Attack Success Rates (ASR) by over 40% on unseen patch and quantization combinations, while preserving high clean accuracy. Our findings underscore the importance of disrupting both semantic and perceptual gradient alignment to mitigate patch transferability in QNNs.
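The abstract names two gradient-agreement metrics used by the GPDP term, Edge IoU and HOG Cosine. The paper does not spell out their computation here, so the following is only a minimal sketch of how such metrics might be measured on input-gradient maps: edges are approximated by thresholding normalized gradient magnitude (a stand-in for a proper edge detector), and the HOG step is reduced to a single global orientation histogram. Function names (`edge_iou`, `hog_cosine`, `gpdp_penalty`) and the pairwise averaging are illustrative assumptions, not the authors' implementation.

```python
import numpy as np

def edge_iou(g1, g2, thresh=0.5):
    """IoU of binary edge maps derived from two input-gradient maps.
    Edges are approximated by thresholding the normalized magnitude
    (illustrative stand-in for a Sobel/Canny edge map)."""
    def edges(g):
        m = np.abs(g)
        return m / (m.max() + 1e-8) > thresh
    e1, e2 = edges(g1), edges(g2)
    inter = np.logical_and(e1, e2).sum()
    union = np.logical_or(e1, e2).sum()
    return inter / (union + 1e-8)

def hog_cosine(g1, g2, n_bins=9):
    """Cosine similarity between magnitude-weighted orientation
    histograms of two gradient maps (global HOG simplification)."""
    def hog(g):
        gy, gx = np.gradient(g)
        mag = np.hypot(gx, gy)
        ang = np.arctan2(gy, gx) % np.pi  # unsigned orientation in [0, pi)
        bins = np.minimum((ang / np.pi * n_bins).astype(int), n_bins - 1)
        h = np.bincount(bins.ravel(), weights=mag.ravel(), minlength=n_bins)
        return h / (np.linalg.norm(h) + 1e-8)
    return float(hog(g1) @ hog(g2))

def gpdp_penalty(grads_by_bitwidth):
    """Average pairwise structural + directional agreement across
    bit-widths; minimizing this during training would push the
    gradients out of alignment (hypothetical aggregation)."""
    gs = list(grads_by_bitwidth)
    total, pairs = 0.0, 0
    for i in range(len(gs)):
        for j in range(i + 1, len(gs)):
            total += edge_iou(gs[i], gs[j]) + hog_cosine(gs[i], gs[j])
            pairs += 1
    return total / max(pairs, 1)
```

By construction, two identical gradient maps score 1.0 on both metrics, so the penalty is maximal exactly when gradients agree across bit-widths, which is the alignment TriQDef aims to break.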


Key Contributions

  • Feature Disalignment Penalty (FDP) that enforces semantic inconsistency between clean and perturbed intermediate representations to block patch transferability
  • Gradient Perceptual Dissonance Penalty (GPDP) that minimizes structural and directional gradient agreement across bit-widths using Edge IoU and HOG Cosine metrics
  • Joint Quantization-Aware Training Protocol that unifies both penalties in a shared-weight scheme across multiple quantization levels to block cross-bit-width generalization
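The three contributions above combine into a single training objective: a task loss shared across quantization levels plus the FDP and GPDP penalties. The sketch below shows one plausible shape for that objective, assuming FDP is a cosine similarity between clean and patched intermediate features and GPDP arrives as a precomputed gradient-agreement score; the function names and the weights `lam_fdp`/`lam_gpdp` are illustrative, not taken from the paper.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def cross_entropy(logits, label):
    """Negative log-likelihood of the true class."""
    return -np.log(softmax(logits)[label] + 1e-12)

def fdp_penalty(feat_clean, feat_patched):
    """Hypothetical FDP form: cosine similarity between clean and
    patched intermediate features; minimizing it drives the two
    representations apart (semantic inconsistency)."""
    a, b = feat_clean.ravel(), feat_patched.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-8))

def triqdef_loss(logits_by_bw, label, feats_clean, feats_patched,
                 gpdp, lam_fdp=0.1, lam_gpdp=0.1):
    """Joint quantization-aware objective (sketch): task loss averaged
    over bit-widths of the shared-weight model, plus weighted FDP and
    GPDP penalties. `gpdp` is a precomputed cross-bit gradient
    agreement score."""
    task = float(np.mean([cross_entropy(l, label) for l in logits_by_bw]))
    fdp = float(np.mean([fdp_penalty(c, p)
                         for c, p in zip(feats_clean, feats_patched)]))
    return task + lam_fdp * fdp + lam_gpdp * gpdp
```

Averaging the task loss over bit-widths reflects the shared-weight scheme: one set of weights must stay accurate at every quantization level while the penalty terms decorrelate what an attacker could transfer between them.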

🛡️ Threat Analysis

Input Manipulation Attack

The paper directly defends against adversarial patch attacks — localized high-saliency perturbations that cause misclassification at inference time. TriQDef is a defense that reduces Attack Success Rate of patch-based adversarial examples by over 40% across unseen quantization levels.


Details

Domains
vision
Model Types
cnn
Threat Tags
black_box, inference_time, digital, physical
Datasets
CIFAR-10, ImageNet
Applications
image classification, edge inference on quantized models