MemLoss: Enhancing Adversarial Training with Recycling Adversarial Examples
Soroush Mahdi 1, Maryam Amirmazlaghani 1, Saeed Saravani 1, Zahra Dehghanian 2
Published on arXiv: 2510.09105
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
MemLoss achieves higher clean accuracy than existing adversarial training methods (TRADES, HAT, AWP) while maintaining competitive adversarial robustness on CIFAR-10.
Novel technique introduced
MemLoss
In this paper, we propose MemLoss, a new approach to improving the adversarial training of machine learning models. MemLoss reuses previously generated adversarial examples, referred to as 'Memory Adversarial Examples,' to enhance model robustness without compromising performance on clean data. By reusing these examples across training epochs, MemLoss provides a balanced improvement in both natural accuracy and adversarial robustness. Experimental results on multiple datasets, including CIFAR-10, show that our method achieves higher accuracy than existing adversarial training methods while maintaining strong robustness against attacks.
Key Contributions
- Introduces MemLoss, a method that retains adversarial examples from prior training epochs ('Memory Adversarial Examples') and reuses them as an additional data source in subsequent epochs.
- Addresses the accuracy-robustness trade-off without relying on external datasets: stored adversarial examples serve as an augmentation source that requires no additional attack computation.
- Demonstrates plug-in compatibility with existing adversarial training frameworks (e.g., TRADES, HAT) and evaluates the method on CIFAR-10, where it improves clean accuracy while retaining adversarial robustness.
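
The recycling idea in the contributions above can be illustrated with a toy sketch. This is not the authors' implementation: the summary does not state the MemLoss objective, so the weighted sum of a fresh adversarial loss and a memory loss, the weight `lam`, the single-step FGSM attack, the logistic-regression model, and every function name below are assumptions chosen only to keep the example small and dependency-free.

```python
import math
import random

def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)

def fgsm(w, b, x, y, eps):
    # For logistic regression, d(loss)/dx = (p - y) * w; step along its sign.
    g = predict(w, b, x) - y
    return [xi + eps * math.copysign(1.0, g * wi) if g * wi != 0 else xi
            for xi, wi in zip(x, w)]

def grad_step(w, b, x, y, weight, lr):
    # One SGD step on weight * cross_entropy(predict(x), y).
    g = weight * (predict(w, b, x) - y)
    new_w = [wi - lr * g * xi for wi, xi in zip(w, x)]
    return new_w, b - lr * g

def train_with_memory(data, epochs=20, eps=0.1, lam=0.5, lr=0.1, seed=0):
    # Assumed objective: fresh adversarial loss + lam * loss on the stored example.
    rng = random.Random(seed)
    dim = len(data[0][0])
    w, b = [0.0] * dim, 0.0
    memory = {}  # sample index -> adversarial example from an earlier epoch
    order = list(range(len(data)))
    for _ in range(epochs):
        rng.shuffle(order)
        for i in order:
            x, y = data[i]
            x_adv = fgsm(w, b, x, y, eps)   # fresh adversarial example
            w, b = grad_step(w, b, x_adv, y, 1.0, lr)
            if i in memory:                 # recycled example, no new attack needed
                w, b = grad_step(w, b, memory[i], y, lam, lr)
            memory[i] = x_adv               # keep for later epochs
    return w, b, memory

if __name__ == "__main__":
    toy = [([1.0, 1.0], 1), ([1.5, 0.5], 1), ([-1.0, -1.0], 0), ([-0.5, -1.5], 0)]
    w, b, memory = train_with_memory(toy)
    print("weights:", w, "bias:", b, "stored examples:", len(memory))
```

In this sketch the memory holds one adversarial example per training sample and is overwritten each epoch, so the recycled example always comes from the previous epoch's model. Other retention policies (several epochs, random subsets) would fit the same loop. In a TRADES- or HAT-style setup, the memory term would presumably be added to that framework's existing loss rather than to plain cross-entropy.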
🛡️ Threat Analysis
Proposes MemLoss, an adversarial training defense specifically designed to improve model robustness against adversarial input manipulation attacks by retaining and reusing previously generated adversarial examples across training epochs.