
Does simple trump complex? Comparing strategies for adversarial robustness in DNNs

William Brooks 1,2,3, Marelie H. Davel 1,2,4, Coenraad Mouton 1,2


Published on arXiv (2508.18019)

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Component-wise ablation on CIFAR-10 with VGG-16 reveals which elements of margin-based adversarial training (DyART vs. a large-margin loss) contribute most to robustness under AutoAttack and PGD evaluation.

Dynamics-Aware Robust Training (DyART)

Existing technique evaluated (introduced in prior work, ablated here)


Deep Neural Networks (DNNs) have shown substantial success in various applications but remain vulnerable to adversarial attacks. This study aims to identify and isolate the components of two different adversarial training techniques that contribute most to increased adversarial robustness, particularly through the lens of margins in the input space -- the minimal distance between data points and decision boundaries. Specifically, we compare two methods that maximize margins: a simple approach which modifies the loss function to increase an approximation of the margin, and a more complex state-of-the-art method (Dynamics-Aware Robust Training) which builds upon this approach. Using a VGG-16 model as our base, we systematically isolate and evaluate individual components from these methods to determine their relative impact on adversarial robustness. We assess the effect of each component on the model's performance under various adversarial attacks, including AutoAttack and Projected Gradient Descent (PGD). Our analysis on the CIFAR-10 dataset reveals which elements most effectively enhance adversarial robustness, providing insights for designing more robust DNNs.
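To make the notion of an input-space margin concrete, the sketch below computes it exactly for a toy linear classifier, where the distance from a point to the decision boundary between classes has a closed form, and penalizes small margins with a hinge-style loss. This is an illustrative simplification, not the paper's method: for a deep network the margin must be approximated (e.g. via a first-order Taylor expansion, as in the large-margin loss the paper compares), and the model, names, and `gamma` parameter here are hypothetical.

```python
import numpy as np

def input_margin(W, b, x, y):
    """Signed distance from x to the nearest decision boundary of class y
    for the linear classifier f(x) = W @ x + b (exact in the linear case)."""
    logits = W @ x + b
    margins = []
    for k in range(W.shape[0]):
        if k == y:
            continue
        # distance to the boundary separating class y from class k
        margins.append((logits[y] - logits[k]) / np.linalg.norm(W[y] - W[k]))
    return min(margins)  # negative => x is already misclassified

def margin_loss(W, b, x, y, gamma=1.0):
    """Hinge-style penalty: zero once the margin exceeds the target gamma."""
    return max(0.0, gamma - input_margin(W, b, x, y))

# Usage: a 2-class linear model in 2-D.
W = np.array([[1.0, 0.0], [-1.0, 0.0]])  # class 0 prefers x[0] > 0
b = np.zeros(2)
x = np.array([0.5, 0.3])
m = input_margin(W, b, x, 0)   # distance of x from the boundary x[0] = 0
```

Maximizing this quantity during training pushes the decision boundary away from the data, which is why both methods the paper compares target it, directly or via an approximation.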


Key Contributions

  • Systematic component-level evaluation of DyART (Xu et al.) and large-margin loss (Elsayed et al.) adversarial training techniques to isolate which elements drive robustness
  • Identification of the most impactful components of margin-based adversarial training on CIFAR-10 using VGG-16
  • Practical insights into which aspects of margin maximization matter most for designing adversarially robust classifiers

🛡️ Threat Analysis

Input Manipulation Attack

Paper analyzes defenses against adversarial example attacks — specifically margin-based adversarial training techniques evaluated against AutoAttack and PGD, the canonical input manipulation attacks for image classifiers.
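For readers unfamiliar with the attack side of this evaluation, the following is a minimal sketch of l_inf Projected Gradient Descent. A real robustness evaluation would attack the trained VGG-16 through autograd; here a linear softmax model stands in so the input gradient has a closed form, and all names and parameter values (`eps`, `alpha`, `steps`) are illustrative assumptions, not the paper's settings.

```python
import numpy as np

def softmax(z):
    z = z - z.max()  # stabilize before exponentiating
    e = np.exp(z)
    return e / e.sum()

def pgd_linf(W, b, x, y, eps=0.3, alpha=0.1, steps=10):
    """Ascend the cross-entropy loss in sign-of-gradient steps, projecting
    back into the l_inf ball of radius eps around the clean input x.
    For the linear model f(x) = W @ x + b the input gradient of the
    cross-entropy loss is W.T @ (softmax(f(x)) - onehot(y))."""
    x_adv = x.copy()
    onehot = np.zeros(W.shape[0])
    onehot[y] = 1.0
    for _ in range(steps):
        grad = W.T @ (softmax(W @ x_adv + b) - onehot)
        x_adv = x_adv + alpha * np.sign(grad)     # gradient-sign ascent step
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project into the eps-ball
    return x_adv

# Usage: the clean point is classified as class 0; PGD pushes it across
# the boundary x[0] = 0 while staying within eps of the original input.
W = np.array([[1.0, 0.0], [-1.0, 0.0]])
b = np.zeros(2)
x = np.array([0.2, 0.0])
x_adv = pgd_linf(W, b, x, y=0)
```

A larger input-space margin means the boundary lies further than `eps` from the data point, so no perturbation inside the ball can flip the label; this is the geometric link between the margin-maximizing defenses above and the PGD/AutoAttack evaluation.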


Details

Domains
vision
Model Types
cnn
Threat Tags
white_box, inference_time, digital
Datasets
CIFAR-10
Applications
image classification