Does simple trump complex? Comparing strategies for adversarial robustness in DNNs
William Brooks 1,2,3, Marelie H. Davel 1,2,4, Coenraad Mouton 1,2
2 Centre for Artificial Intelligence Research
3 South African National Space Agency
4 National Institute for Theoretical and Computational Sciences
Published on arXiv: 2508.18019
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Component-wise ablation on CIFAR-10 with VGG-16 reveals which elements of margin-based adversarial training (DyART vs. a large-margin loss) most effectively enhance robustness under AutoAttack and PGD evaluation.
Dynamics-Aware Robust Training (DyART)
State-of-the-art technique evaluated
Deep Neural Networks (DNNs) have shown substantial success in various applications but remain vulnerable to adversarial attacks. This study aims to identify and isolate the components of two different adversarial training techniques that contribute most to increased adversarial robustness, particularly through the lens of margins in the input space -- the minimal distance between data points and decision boundaries. Specifically, we compare two methods that maximize margins: a simple approach which modifies the loss function to increase an approximation of the margin, and a more complex state-of-the-art method (Dynamics-Aware Robust Training) which builds upon this approach. Using a VGG-16 model as our base, we systematically isolate and evaluate individual components from these methods to determine their relative impact on adversarial robustness. We assess the effect of each component on the model's performance under various adversarial attacks, including AutoAttack and Projected Gradient Descent (PGD). Our analysis on the CIFAR-10 dataset reveals which elements most effectively enhance adversarial robustness, providing insights for designing more robust DNNs.
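The simpler of the two compared approaches approximates the input-space margin with a first-order expansion, as in Elsayed et al.: the logit difference between the true class and a competing class, divided by the norm of that difference's input gradient. For a linear classifier this approximation is exact, which makes it easy to illustrate. The sketch below is a hypothetical minimal example, not the paper's code; the function name and the toy classifier are assumptions for illustration:

```python
import numpy as np

def margin_approximation(W, b, x, y):
    """First-order approximation of the input-space margin:
    min over j != y of (f_y(x) - f_j(x)) / ||grad_x (f_y - f_j)||_2.
    For a linear model f(x) = W @ x + b, the gradient of (f_y - f_j)
    with respect to x is simply W[y] - W[j], so the formula is exact."""
    logits = W @ x + b
    margins = []
    for j in range(len(logits)):
        if j == y:
            continue
        diff = logits[y] - logits[j]
        grad = W[y] - W[j]  # input gradient of (f_y - f_j) for a linear model
        margins.append(diff / np.linalg.norm(grad))
    return min(margins)     # distance to the nearest decision boundary

# Toy 3-class linear classifier on 2-D inputs (illustrative values).
W = np.array([[ 2.0,  0.0],
              [-1.0,  1.0],
              [ 0.0, -1.0]])
b = np.zeros(3)
x = np.array([1.0, 0.5])
m = margin_approximation(W, b, x, y=0)
```

A margin-maximizing loss then penalizes small values of this quantity; for a deep network the gradient is taken through the whole model, and the approximation is only first-order accurate.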
Key Contributions
- Systematic component-level evaluation of DyART (Xu et al.) and large-margin loss (Elsayed et al.) adversarial training techniques to isolate which elements drive robustness
- Identification of the most impactful components of margin-based adversarial training on CIFAR-10 using VGG-16
- Practical insights into which aspects of margin maximization matter most for designing adversarially robust classifiers
🛡️ Threat Analysis
The paper analyzes defenses against adversarial-example attacks: margin-based adversarial training techniques evaluated against AutoAttack and PGD, the canonical input-manipulation attacks for image classifiers.
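PGD, one of the attacks used in the paper's evaluation, repeatedly steps in the sign of the loss gradient and projects the perturbed input back onto an L-infinity ball around the original. A minimal sketch on a binary logistic model, where the input gradient is analytic (the paper attacks a VGG-16; this toy model and the function name are assumptions for illustration):

```python
import numpy as np

def pgd_linf(w, b, x, y, eps=0.1, alpha=0.02, steps=20):
    """Projected Gradient Descent under an L-inf constraint for a binary
    logistic model p = sigmoid(w @ x + b), label y in {0, 1}.
    The cross-entropy loss gradient w.r.t. the input is (p - y) * w."""
    x_adv = x.copy()
    for _ in range(steps):
        p = 1.0 / (1.0 + np.exp(-(w @ x_adv + b)))
        grad = (p - y) * w                        # analytic input gradient
        x_adv = x_adv + alpha * np.sign(grad)     # ascent step on the loss
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project onto L-inf ball
    return x_adv

w = np.array([1.0, -2.0])
b = 0.0
x = np.array([0.5, -0.5])       # clean point, confidently in class 1
x_adv = pgd_linf(w, b, x, y=1)  # perturbation pushes toward the boundary
```

AutoAttack wraps a parameter-free variant of this loop together with complementary attacks, which is why the paper uses both for evaluation.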