defense 2025

MemLoss: Enhancing Adversarial Training with Recycling Adversarial Examples

Soroush Mahdi 1, Maryam Amirmazlaghani 1, Saeed Saravani 1, Zahra Dehghanian 2

0 citations · 19 references · arXiv

Published on arXiv

2510.09105

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

MemLoss achieves higher clean accuracy than existing adversarial training methods (TRADES, HAT, AWP) while maintaining competitive adversarial robustness on CIFAR-10.

MemLoss

Novel technique introduced


In this paper, we propose MemLoss, a new approach to improving the adversarial training of machine learning models. MemLoss reuses previously generated adversarial examples, referred to as 'Memory Adversarial Examples,' to enhance model robustness and accuracy without compromising performance on clean data. By reusing these examples across training epochs, MemLoss improves natural accuracy and adversarial robustness together rather than trading one for the other. Experimental results on multiple datasets, including CIFAR-10, show that our method achieves higher accuracy than existing adversarial training methods while maintaining strong robustness against attacks.


Key Contributions

  • Introduces MemLoss, a method that retains adversarial examples from prior training epochs ('Memory Adversarial Examples') and reuses them as an additional data source in subsequent epochs.
  • Addresses the accuracy-robustness trade-off without relying on external datasets, by treating stored adversarial examples as a cost-free augmentation source.
  • Demonstrates plug-in compatibility with existing adversarial training frameworks (e.g., TRADES, HAT) and evaluates the method on CIFAR-10, showing improved clean accuracy with adversarial robustness retained.
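
The reuse idea in the contributions above can be illustrated with a minimal sketch. This summary does not give the actual MemLoss formulation, so everything here is an assumption made for illustration: the names AdversarialMemory, memory_augmented_loss, and mem_weight are hypothetical, the memory keeps only the most recent adversarial example per training sample, and the memory term is simply averaged and added to whatever base objective (for example a TRADES-style loss) the caller supplies.

```python
from typing import Callable, Dict, List, Optional, Sequence


class AdversarialMemory:
    """Hypothetical store: latest adversarial example per training-sample index."""

    def __init__(self) -> None:
        self._store: Dict[int, object] = {}

    def lookup(self, indices: Sequence[int]) -> List[Optional[object]]:
        # None marks samples with no stored example yet (e.g. first epoch).
        return [self._store.get(i) for i in indices]

    def update(self, indices: Sequence[int], adv_examples: Sequence[object]) -> None:
        # Overwrite the stored example with the one generated this epoch.
        for i, adv in zip(indices, adv_examples):
            self._store[i] = adv

    def __len__(self) -> int:
        return len(self._store)


def memory_augmented_loss(
    base_loss: Callable[[Sequence, Sequence, Sequence], float],
    per_example_loss: Callable[[object, object], float],
    memory: AdversarialMemory,
    indices: Sequence[int],
    clean: Sequence,
    adv: Sequence,
    labels: Sequence,
    mem_weight: float = 1.0,  # assumed weighting, not taken from the paper
) -> float:
    """Base adversarial-training loss plus an averaged term on stored examples."""
    total = base_loss(clean, adv, labels)

    # Extra term: loss on adversarial examples remembered from earlier epochs.
    stored = memory.lookup(indices)
    mem_terms = [
        per_example_loss(x_mem, y)
        for x_mem, y in zip(stored, labels)
        if x_mem is not None
    ]
    if mem_terms:
        total += mem_weight * sum(mem_terms) / len(mem_terms)

    # Remember this epoch's adversarial examples for later reuse.
    memory.update(indices, adv)
    return total
```

In a real training loop the stored items would be tensors and the two loss callables would come from the chosen framework; because the base objective is passed in as a function, the same wrapper could sit on top of different adversarial training losses, which is one way to read the plug-in claim.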

🛡️ Threat Analysis

Input Manipulation Attack

Proposes MemLoss, an adversarial training defense specifically designed to improve model robustness against adversarial input manipulation attacks by retaining and reusing previously generated adversarial examples across training epochs.


Details

Domains
vision
Model Types
cnn
Threat Tags
white_box, training_time, digital
Datasets
CIFAR-10
Applications
image classification