defense 2026

A Geometric Probe of the Accuracy-Robustness Trade-off: Sharp Boundaries in Symmetry-Breaking Dimensional Expansion

Yu Bai, Zhe Wang, Jiarui Zhang, Dong-Xiao Zhang, Yinjun Gao, Jun-Jie Zhang

0 citations · 32 references · arXiv (Cornell University)

Published on arXiv

2602.17948

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Mask projection restores the robustness lost to SBDE by neutralizing adversarial perturbations concentrated in the auxiliary dimensions (e.g., on CIFAR-10 with ResNet-18, accuracy under PGD attack recovers from ~90% to near the clean accuracy of 95.63%).

Mask Projection

Novel technique introduced


The trade-off between clean accuracy and adversarial robustness is a pervasive phenomenon in deep learning, yet its geometric origin remains elusive. In this work, we utilize Symmetry-Breaking Dimensional Expansion (SBDE) as a controlled probe to investigate the mechanism underlying this trade-off. SBDE expands input images by inserting constant-valued pixels, which breaks translational symmetry and consistently improves clean accuracy (e.g., from 90.47% to 95.63% on CIFAR-10 with ResNet-18) by reducing parameter degeneracy. However, this accuracy gain comes at the cost of reduced robustness against iterative white-box attacks. By employing a test-time mask projection that resets the inserted auxiliary pixels to their training values, we demonstrate that the vulnerability stems almost entirely from the inserted dimensions. The projection effectively neutralizes the attacks and restores robustness, revealing that the model achieves high accuracy by creating sharp boundaries (steep loss gradients) specifically along the auxiliary axes. Our findings provide a concrete geometric explanation for the accuracy-robustness paradox: the optimization landscape deepens the basin of attraction to improve accuracy but inevitably erects steep walls along the auxiliary degrees of freedom, creating a fragile sensitivity to off-manifold perturbations.
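The two operations named in the abstract, expansion by constant-valued pixels and the test-time mask projection, can be illustrated with a minimal NumPy sketch. The abstract does not say where the constant pixels are inserted, so the interleaved layout below (original pixels at every `factor`-th grid position) is purely an assumption, as are the function names `sbde_expand` and `mask_projection`, the `factor` parameter, and the default fill value.

```python
import numpy as np


def sbde_expand(image, fill_value=0.0, factor=2):
    """Expand an (H, W, C) image to (factor*H, factor*W, C).

    ASSUMPTION: original pixels are placed at every `factor`-th row and
    column; all other positions are auxiliary pixels set to `fill_value`.
    The source does not specify the insertion pattern.
    Returns the expanded image and a boolean mask that is True on
    auxiliary (inserted) pixels.
    """
    h, w, c = image.shape
    expanded = np.full((factor * h, factor * w, c), fill_value, dtype=image.dtype)
    aux_mask = np.ones((factor * h, factor * w), dtype=bool)
    expanded[::factor, ::factor, :] = image
    aux_mask[::factor, ::factor] = False
    return expanded, aux_mask


def mask_projection(expanded, aux_mask, fill_value=0.0):
    """Reset every auxiliary pixel to its training-time constant.

    Original-image pixels are left untouched, so any perturbation that
    lives only in the auxiliary dimensions is removed.
    """
    projected = expanded.copy()
    projected[aux_mask] = fill_value
    return projected
```

In this sketch the projection would be applied to each (possibly perturbed) input immediately before it is passed to the classifier; because it only overwrites the inserted positions, a clean expanded input passes through unchanged.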


Key Contributions

  • Identifies that SBDE-induced accuracy gains create sharp loss boundaries along auxiliary pixel dimensions that are disproportionately exploited by white-box attacks
  • Proposes test-time mask projection that resets inserted auxiliary pixels to training constants, effectively neutralizing white-box adversarial perturbations
  • Provides a concrete geometric explanation for the accuracy-robustness paradox: deeper basins of attraction on the signal manifold are accompanied by steep walls in auxiliary directions
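One way to write the contributions above in symbols is sketched here. The notation is an assumption introduced for illustration and is not taken from the paper: $x \in \mathbb{R}^D$ is an expanded input, $m \in \{0,1\}^D$ equals 1 on original pixels and 0 on inserted pixels, $c$ is the training constant, and $\delta_{\mathrm{sig}} = m \odot \delta$, $\delta_{\mathrm{aux}} = (1-m) \odot \delta$ split a perturbation $\delta$ into its signal and auxiliary parts.

```latex
% Mask projection P, its effect on a perturbed clean input, and the
% first-order change in the loss L (assumed notation, see lead-in).
\begin{align}
  P(x) &= m \odot x + (1 - m)\, c, \\
  P(x + \delta) &= x + m \odot \delta
      \quad \text{whenever } (1 - m) \odot x = (1 - m)\, c, \\
  L(x + \delta) - L(x) &\approx
      \nabla_{\mathrm{sig}} L \cdot \delta_{\mathrm{sig}}
      + \nabla_{\mathrm{aux}} L \cdot \delta_{\mathrm{aux}} .
\end{align}
```

Read this way, "sharp boundaries along the auxiliary axes" corresponds to the auxiliary gradient term dominating the first-order loss change, and the second line shows why the projection removes exactly that term: only the signal part of the perturbation survives.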

🛡️ Threat Analysis

Input Manipulation Attack

The paper investigates vulnerability to iterative white-box adversarial attacks (PGD) and proposes mask projection, a test-time defense that restores robustness by resetting the adversarially exploited auxiliary dimensions to their training values.
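The threat model can be made concrete with a toy sketch. This is not the paper's experiment: it uses a hand-built linear logistic classifier instead of a CNN, and every number (the weights, the constant 0.5, `eps`, `step`, `n_steps`) is an illustrative assumption chosen so that the weights are steep on three hypothetical auxiliary coordinates. An L-infinity PGD attack then flips the prediction mostly through those coordinates, and resetting them undoes the flip.

```python
import numpy as np


def predict(x, w, b):
    """Return the predicted label (+1 or -1) of a linear classifier."""
    return 1 if np.dot(w, x) + b > 0 else -1


def loss_grad(x, y, w, b):
    """Gradient w.r.t. x of the logistic loss log(1 + exp(-y * (w.x + b)))."""
    margin = y * (np.dot(w, x) + b)
    return -y * w / (1.0 + np.exp(margin))


def pgd_linf(x, y, w, b, eps, step, n_steps):
    """Iterative signed-gradient ascent on the loss, clipped to an eps-ball."""
    x_adv = x.copy()
    for _ in range(n_steps):
        x_adv = x_adv + step * np.sign(loss_grad(x_adv, y, w, b))
        x_adv = np.clip(x_adv, x - eps, x + eps)  # project back into the ball
    return x_adv


def reset_aux(x, aux_mask, fill_value):
    """Mask projection: overwrite auxiliary coordinates with the constant."""
    return np.where(aux_mask, fill_value, x)


# Toy setup (all values hypothetical): 3 signal dims, 3 auxiliary dims.
aux_mask = np.array([False, False, False, True, True, True])
w = np.array([0.5, -0.3, 0.2, 4.0, -5.0, 3.0])  # steep along auxiliary axes
b = 0.0
x_clean = np.array([1.0, 0.2, 0.5, 0.5, 0.5, 0.5])  # aux dims hold constant 0.5
y = 1

x_adv = pgd_linf(x_clean, y, w, b, eps=0.2, step=0.05, n_steps=10)
x_proj = reset_aux(x_adv, aux_mask, fill_value=0.5)

if __name__ == "__main__":
    print("clean:", predict(x_clean, w, b))
    print("attacked:", predict(x_adv, w, b))
    print("attacked + mask projection:", predict(x_proj, w, b))
```

The attack here is computed without knowledge of the projection, so the sketch says nothing about an adaptive attacker who restricts the perturbation to the signal coordinates.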


Details

Domains
vision
Model Types
cnn
Threat Tags
white_box, inference_time, untargeted, digital
Datasets
CIFAR-10
Applications
image classification