Defense · 2025

Parameter Interpolation Adversarial Training for Robust Image Classification

Xin Liu 1, Yichen Yang 1, Kun He 1, John E. Hopcroft 2

9 citations · 1 influential · 47 references · TIFS


Published on arXiv: 2511.00836

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Parameter interpolation between epochs produces smoother decision boundary changes, alleviating overfitting and achieving higher adversarial robustness on both CNNs and ViTs compared to standard adversarial training baselines.

PIAT (Parameter Interpolation Adversarial Training)

Novel technique introduced


Though deep neural networks exhibit superior performance on various tasks, they remain plagued by adversarial examples. Adversarial training has been demonstrated to be the most effective defense against adversarial attacks. However, with existing adversarial training methods, model robustness oscillates noticeably and overfits during training, degrading defense efficacy. To address these issues, we propose a novel framework called Parameter Interpolation Adversarial Training (PIAT). PIAT tunes the model parameters between epochs by interpolating the parameters of the previous and current epochs. This makes the model's decision boundary change more gradually and alleviates overfitting, helping the model converge better and achieve higher robustness. In addition, we suggest using the Normalized Mean Square Error (NMSE) to further improve robustness by aligning the relative, rather than absolute, magnitude of the logits of clean and adversarial examples. Extensive experiments on several benchmark datasets demonstrate that our framework prominently improves the robustness of both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs).
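The core interpolation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the parameter dictionaries, the interpolation coefficient `gamma`, and its default value are all hypothetical, since the paper's exact schedule is not reproduced here.

```python
import numpy as np

def interpolate_params(prev_params, curr_params, gamma=0.6):
    """Blend previous-epoch and current-epoch parameters.

    gamma is a hypothetical interpolation coefficient: gamma=1 keeps
    the freshly trained parameters unchanged, gamma=0 reverts fully
    to the previous epoch. Intermediate values smooth how far the
    decision boundary can move in a single epoch.
    """
    return {
        name: (1.0 - gamma) * prev_params[name] + gamma * param
        for name, param in curr_params.items()
    }

# Toy usage: a single scalar "weight" moving from 0.0 to 1.0 in one epoch
prev = {"w": np.array([0.0])}
curr = {"w": np.array([1.0])}
smoothed = interpolate_params(prev, curr, gamma=0.6)
```

In a training loop, the smoothed parameters would be loaded back into the model at the end of each epoch before the next round of adversarial training, damping epoch-to-epoch oscillation in robustness.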


Key Contributions

  • PIAT framework that interpolates model parameters between consecutive training epochs to smooth decision boundary evolution and reduce robustness oscillation
  • Normalized Mean Square Error (NMSE) loss that aligns the relative magnitude of logits between clean and adversarial examples rather than absolute magnitude
  • Empirical demonstration that PIAT improves adversarial robustness for both CNNs and Vision Transformers across multiple benchmark datasets

🛡️ Threat Analysis

Input Manipulation Attack

Proposes PIAT, an adversarial training defense framework against adversarial examples (input manipulation at inference time); directly addresses oscillation and overfitting in adversarial training to improve the empirical robustness of image classifiers.


Details

Domains
vision
Model Types
CNN, Transformer
Threat Tags
white_box, inference_time, digital
Datasets
CIFAR-10, CIFAR-100, Tiny ImageNet
Applications
image classification