Defense · 2025

Parameter Interpolation Adversarial Training for Robust Image Classification

Xin Liu 1, Yichen Yang 1, Kun He 1, John E. Hopcroft 2

9 citations · 1 influential · 47 references · TIFS


Published on arXiv: 2511.00836

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Parameter interpolation between epochs produces smoother decision boundary changes, alleviating overfitting and achieving higher adversarial robustness on both CNNs and ViTs compared to standard adversarial training baselines.

PIAT (Parameter Interpolation Adversarial Training)

Novel technique introduced


Though deep neural networks exhibit superior performance on various tasks, they remain plagued by adversarial examples. Adversarial training has been demonstrated to be the most effective defense against adversarial attacks. However, with existing adversarial training methods, model robustness oscillates noticeably and overfits during training, degrading defense efficacy. To address these issues, we propose a novel framework called Parameter Interpolation Adversarial Training (PIAT). PIAT tunes the model parameters between epochs by interpolating the parameters of the previous and current epochs. This makes the model's decision boundary change more gradually and alleviates overfitting, helping the model converge better and achieve higher robustness. In addition, we suggest using the Normalized Mean Square Error (NMSE) to further improve robustness by aligning the relative, rather than absolute, magnitude of the logits of clean and adversarial examples. Extensive experiments on several benchmark datasets demonstrate that our framework prominently improves the robustness of both Convolutional Neural Networks (CNNs) and Vision Transformers (ViTs).
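The core interpolation step described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the parameter dictionaries, the interpolation coefficient `gamma`, and its default value are all hypothetical, since the paper's exact schedule is not reproduced here.

```python
import numpy as np

def interpolate_params(prev_params, curr_params, gamma=0.6):
    """Blend previous-epoch and current-epoch parameters.

    gamma is a hypothetical interpolation coefficient: gamma=1 keeps
    the freshly trained parameters unchanged, gamma=0 reverts fully
    to the previous epoch. Intermediate values smooth how far the
    decision boundary can move in a single epoch.
    """
    return {
        name: (1.0 - gamma) * prev_params[name] + gamma * param
        for name, param in curr_params.items()
    }

# Toy usage: a single scalar "weight" moving from 0.0 to 1.0 in one epoch
prev = {"w": np.array([0.0])}
curr = {"w": np.array([1.0])}
smoothed = interpolate_params(prev, curr, gamma=0.6)
```

In a training loop, the smoothed parameters would be loaded back into the model at the end of each epoch before the next round of adversarial training, damping epoch-to-epoch oscillation in robustness.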


Key Contributions

  • PIAT framework that interpolates model parameters between consecutive training epochs to smooth decision boundary evolution and reduce robustness oscillation
  • Normalized Mean Square Error (NMSE) loss that aligns the relative magnitude of logits between clean and adversarial examples rather than absolute magnitude
  • Empirical demonstration that PIAT improves adversarial robustness for both CNNs and Vision Transformers across multiple benchmark datasets

🛡️ Threat Analysis

Input Manipulation Attack

Proposes PIAT, an adversarial training defense framework against adversarial examples (input manipulation at inference time); directly addresses oscillation and overfitting in adversarial training to improve the empirical robustness of image classifiers.


Details

Domains
vision
Model Types
CNN, Transformer
Threat Tags
white_box, inference_time, digital
Datasets
CIFAR-10, CIFAR-100, Tiny ImageNet
Applications
image classification