MemLoss: Enhancing Adversarial Training with Recycling Adversarial Examples

In this paper, we propose MemLoss, a new approach to improving the adversarial training of machine learning models. MemLoss reuses adversarial examples generated in earlier training epochs, referred to as 'Memory Adversarial Examples,' to improve model robustness without sacrificing accuracy on clean data. Reusing these examples across training epochs yields a balanced improvement in both natural accuracy and adversarial robustness. Experimental results on multiple datasets, including CIFAR-10, show that our method achieves higher natural accuracy than existing adversarial training methods while maintaining strong robustness against attacks.

Key Contributions

Introduces MemLoss, a method that retains adversarial examples from prior training epochs ('Memory Adversarial Examples') and reuses them as an additional data source in subsequent epochs.
Addresses the accuracy-robustness trade-off without relying on external datasets, by treating stored adversarial examples as a cost-free augmentation source.
Demonstrates plug-in compatibility with existing adversarial training frameworks (e.g., TRADES, HAT) and evaluates on CIFAR-10, showing improved clean accuracy with adversarial robustness retained.
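
The idea of carrying adversarial examples from one epoch into the next can be illustrated with a minimal sketch. This is not the paper's implementation: the summary above does not give the exact MemLoss formulation, so the sketch assumes a simple weighted sum of the loss on fresh adversarial examples and the loss on stored ones. The logistic-regression model, the one-step FGSM attack, and the names `train_with_memory` and `mem_weight` are all hypothetical stand-ins chosen to keep the example self-contained.

```python
import math
import random


def sigmoid(z):
    # Numerically stable logistic function.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(w, b, x):
    return sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b)


def sign(v):
    return (v > 0) - (v < 0)


def fgsm(w, b, x, y, eps):
    """One-step L-infinity attack: x + eps * sign(dL/dx) for cross-entropy loss."""
    g = predict(w, b, x) - y  # dL/dlogit, so dL/dx_i = g * w_i
    return [xi + eps * sign(g * wi) for xi, wi in zip(x, w)]


def train_with_memory(data, epochs=5, eps=0.1, lr=0.1, mem_weight=0.5, seed=0):
    """Adversarial training that also reuses last epoch's adversarial examples.

    ASSUMPTION: per-sample loss = loss(fresh adv) + mem_weight * loss(stored adv).
    The actual MemLoss objective may differ.
    """
    rng = random.Random(seed)
    dim = len(data[0][0])
    w = [rng.uniform(-0.1, 0.1) for _ in range(dim)]
    b = 0.0
    memory = {}  # sample index -> adversarial example from the previous epoch
    for _ in range(epochs):
        new_memory = {}
        for i, (x, y) in enumerate(data):
            x_adv = fgsm(w, b, x, y, eps)
            new_memory[i] = x_adv  # stored at no extra attack cost
            terms = [(x_adv, 1.0)]
            if i in memory:
                terms.append((memory[i], mem_weight))
            # Accumulate the gradient of the weighted sum, then take one step.
            grad_w = [0.0] * dim
            grad_b = 0.0
            for x_in, weight in terms:
                g = weight * (predict(w, b, x_in) - y)
                grad_w = [gw + g * xi for gw, xi in zip(grad_w, x_in)]
                grad_b += g
            w = [wi - lr * gw for wi, gw in zip(w, grad_w)]
            b -= lr * grad_b
        memory = new_memory
    return w, b, memory
```

Calling `train_with_memory` on a list of `(features, label)` pairs with labels in {0, 1} returns the learned weights, the bias, and the memory from the final epoch. The stored examples are a by-product of the attack that adversarial training already runs, which is why the sketch needs no extra attack calls and no external data.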

🛡️ Threat Analysis

Input Manipulation Attack

Proposes MemLoss, an adversarial training defense that improves model robustness against adversarial input manipulation attacks by retaining previously generated adversarial examples and reusing them across training epochs.