
Trans-defense: Transformer-based Denoiser for Adversarial Defense with Spatial-Frequency Domain Representation

Alik Pramanick, Mayank Bansal, Utkarsh Srivastava, Suklav Ghosh, Arijit Sur

1 citations · 40 references · arXiv


Published on arXiv · 2510.27245

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Trans-defense substantially exceeds standalone denoising networks and adversarial training baselines in classification accuracy under adversarial attacks on MNIST, CIFAR-10, and Fashion-MNIST.

Trans-defense

Novel technique introduced


In recent times, deep neural networks (DNNs) have been successfully adopted for various applications. Despite their notable achievements, it has become evident that DNNs are vulnerable to sophisticated adversarial attacks, restricting their use in security-critical systems. In this paper, we present a two-phase training method to counter such attacks: first, training the denoising network, and second, the deep classifier model. We propose a novel denoising strategy that integrates both spatial- and frequency-domain representations to defend against adversarial attacks on images. Our analysis reveals that the high-frequency components of attacked images are more severely corrupted than their low-frequency counterparts. To address this, we leverage the Discrete Wavelet Transform (DWT) for frequency analysis and develop a denoising network that fuses spatial image features with wavelet sub-bands through a transformer layer. We then retrain the classifier on the denoised images, which enhances its robustness against adversarial attacks. Experimental results on MNIST, CIFAR-10, and Fashion-MNIST show that the proposed method markedly improves classification accuracy, substantially exceeding both a standalone denoising network and adversarial training approaches. The code is available at https://github.com/Mayank94/Trans-Defense.
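The frequency observation in the abstract can be illustrated with a minimal NumPy sketch (this is not the authors' implementation; the hand-rolled single-level Haar DWT and the smooth toy image are assumptions for illustration). An FGSM-style ±ε perturbation spreads its energy roughly evenly across the four sub-bands, while a natural image concentrates its energy in the low-frequency LL band, so the high-frequency bands are proportionally far more corrupted:

```python
import numpy as np

def haar_dwt2(img):
    """Single-level 2-D Haar DWT: returns the LL, LH, HL, HH sub-bands."""
    a = (img[0::2, :] + img[1::2, :]) / 2.0   # vertical average
    d = (img[0::2, :] - img[1::2, :]) / 2.0   # vertical detail
    return {
        "LL": (a[:, 0::2] + a[:, 1::2]) / 2.0,
        "LH": (a[:, 0::2] - a[:, 1::2]) / 2.0,
        "HL": (d[:, 0::2] + d[:, 1::2]) / 2.0,
        "HH": (d[:, 0::2] - d[:, 1::2]) / 2.0,
    }

rng = np.random.default_rng(0)
clean = np.tile(np.linspace(0.0, 1.0, 32), (32, 1))             # smooth toy "image"
attacked = clean + 0.05 * rng.choice([-1.0, 1.0], clean.shape)  # FGSM-style +/-eps noise

bands_c, bands_a = haar_dwt2(clean), haar_dwt2(attacked)
signal = {k: np.mean(v ** 2) for k, v in bands_c.items()}
noise = {k: np.mean((bands_a[k] - bands_c[k]) ** 2) for k in bands_c}
for k in ("LL", "LH", "HL", "HH"):
    print(f"{k}: signal energy {signal[k]:.2e}, perturbation energy {noise[k]:.2e}")
```

The per-band perturbation energy is essentially uniform, whereas the clean signal's energy sits almost entirely in LL; the resulting noise-to-signal ratio is therefore worst in the high-frequency bands, which is the asymmetry the frequency-aware denoiser exploits.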


Key Contributions

  • Transformer-based denoising network that combines spatial image features with DWT wavelet sub-bands via cross-attention to suppress adversarial perturbations as a model-agnostic pre-processing step.
  • Observation that high-frequency components of adversarially perturbed images are more severely corrupted than low-frequency components, motivating frequency-aware defense.
  • Two-phase training strategy: (1) train the denoiser on adversarial examples, then (2) retrain the downstream classifier on denoised images for compounded robustness.
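The two-phase strategy can be sketched on a toy problem (everything here — the linear least-squares "denoiser", the nearest-centroid "classifier", and the synthetic push-toward-the-wrong-class perturbation — is an illustrative assumption, not the paper's transformer pipeline): phase 1 fits the denoiser on (adversarial, clean) pairs, phase 2 refits the classifier on denoised images.

```python
import numpy as np

rng = np.random.default_rng(1)
d, n = 16, 400
templates = np.stack([np.zeros(d), np.ones(d)])  # class 0 / class 1 "images"

def make_split(n):
    y = rng.integers(0, 2, n)
    clean = templates[y] + 0.1 * rng.standard_normal((n, d))
    adv = clean + 0.6 * (templates[1 - y] - templates[y])  # push toward wrong class
    return clean, adv, y

clean_tr, adv_tr, y_tr = make_split(n)
clean_te, adv_te, y_te = make_split(n)

# Phase 1: fit the denoiser on (adversarial, clean) training pairs.
aug = lambda X: np.hstack([X, np.ones((len(X), 1))])  # append a bias column
W, *_ = np.linalg.lstsq(aug(adv_tr), clean_tr, rcond=None)
denoise = lambda X: aug(X) @ W

# Phase 2: retrain the classifier (nearest centroid) on denoised images.
den_tr = denoise(adv_tr)
centroids = np.stack([den_tr[y_tr == c].mean(0) for c in (0, 1)])
classify = lambda X, C: np.argmin(((X[:, None] - C[None]) ** 2).sum(-1), 1)

acc_raw = (classify(adv_te, templates) == y_te).mean()        # no defense
acc_def = (classify(denoise(adv_te), centroids) == y_te).mean()
print(f"attacked accuracy: {acc_raw:.2f} -> with denoiser: {acc_def:.2f}")
```

The point of phase 2 is that the classifier sees the denoiser's output distribution during training, so residual denoising artifacts are absorbed rather than treated as out-of-distribution input.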

🛡️ Threat Analysis

Input Manipulation Attack

The paper directly defends against adversarial input manipulation attacks (FGSM, PGD, C&W-style perturbations) on image classifiers. The proposed Trans-defense denoiser is an input purification defense — a pre-processing step that removes adversarial perturbations before the image reaches the classifier.
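For concreteness, a minimal FGSM-style attack against a hypothetical linear classifier (an illustrative stand-in, not the paper's CNN targets): FGSM sets x_adv = x + ε·sign(∇ₓL), and for the logistic loss L = log(1 + exp(−y·w·x)) the input gradient is −c·y·w with c > 0, so its sign is −y·sign(w) elementwise.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.standard_normal(64)        # weights of a toy linear classifier f(x) = sign(w.x)
x = rng.standard_normal(64)        # "clean" input
y = np.sign(w @ x)                 # use the model's own prediction as the label

# Smallest L-inf budget that is guaranteed to cross the decision boundary:
margin = abs(w @ x)
eps = (margin + 1.0) / np.abs(w).sum()

# One FGSM step: sign of the logistic-loss gradient w.r.t. x is -y * sign(w).
x_adv = x + eps * (-y * np.sign(w))
print("clean score:", w @ x, " adversarial score:", w @ x_adv)
```

An input-purification defense such as Trans-defense sits in front of the classifier and tries to map x_adv back toward x before the score is ever computed, which is why it is model-agnostic at inference time.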


Details

Domains
vision
Model Types
cnn, transformer
Threat Tags
white_box, inference_time, digital
Datasets
MNIST, CIFAR-10, Fashion-MNIST
Applications
image classification