Localizing Adversarial Attacks To Produce More Imperceptible Noise
Pavan Reddy, Aditya Sanjay Gujral
Published on arXiv: 2509.22710
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Localized attacks with γ=0.25 reduce mean pixel perturbation by ~93.7% and improve SSIM by ~8.7% over global FGSM, with iterative methods (PGD, C&W) maintaining higher ASR under localization constraints
Binary-mask localized adversarial attack
Novel technique introduced
Adversarial attacks in machine learning traditionally focus on global perturbations to input data, yet the potential of localized adversarial noise remains underexplored. This study systematically evaluates localized adversarial attacks across widely used methods, including FGSM, PGD, and C&W, to quantify their effectiveness, imperceptibility, and computational efficiency. By introducing a binary mask to constrain noise to specific regions, localized attacks achieve significantly lower mean pixel perturbations, higher Peak Signal-to-Noise Ratios (PSNR), and improved Structural Similarity Index (SSIM) compared to global attacks. However, these benefits come at the cost of increased computational effort and a modest reduction in Attack Success Rate (ASR). Our results highlight that iterative methods, such as PGD and C&W, are more robust to localization constraints than single-step methods like FGSM, maintaining higher ASR and imperceptibility metrics. This work provides a comprehensive analysis of localized adversarial attacks, offering practical insights for advancing attack strategies and designing robust defensive systems.
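As a rough illustration of the binary-mask idea described above, the sketch below confines a single-step FGSM perturbation to a centered square covering roughly a γ fraction of the image. It assumes a PyTorch image classifier with inputs in [0, 1]; the helper names (`square_mask`, `localized_fgsm`), the centered-square mask placement, and the ε value are illustrative assumptions, not the authors' exact implementation.

```python
import torch
import torch.nn.functional as F


def square_mask(x, gamma=0.25):
    """Hypothetical mask builder: a centered square covering roughly a gamma
    fraction of the image area (the paper's exact mask placement may differ)."""
    _, _, h, w = x.shape
    side_h, side_w = int(h * gamma ** 0.5), int(w * gamma ** 0.5)
    top, left = (h - side_h) // 2, (w - side_w) // 2
    mask = torch.zeros_like(x)
    mask[:, :, top:top + side_h, left:left + side_w] = 1.0
    return mask


def localized_fgsm(model, x, y, mask, epsilon=8 / 255):
    """Single-step FGSM with the perturbation zeroed outside the binary mask."""
    x = x.clone().detach().requires_grad_(True)
    loss = F.cross_entropy(model(x), y)
    loss.backward()
    # Sign of the input gradient, restricted to the masked region.
    x_adv = x + epsilon * x.grad.sign() * mask
    return x_adv.clamp(0.0, 1.0).detach()
```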
Key Contributions
- Systematic comparative evaluation of localized (binary-mask-constrained) vs. global variants of FGSM, PGD, and C&W attacks across ASR, PSNR, SSIM, and mean pixel perturbation (a metric sketch follows this list)
- Demonstrates localization reduces mean pixel perturbation by up to 93.71% and improves PSNR/SSIM at the cost of modest ASR reduction
- Shows iterative methods (PGD, C&W) are more robust to localization constraints than single-step FGSM
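One plausible way to compute the imperceptibility metrics named in the first contribution (mean pixel perturbation, PSNR, SSIM) is sketched below with NumPy and scikit-image. The function name and the [0, 1] data range are assumptions; the authors' exact evaluation code is not given in this summary.

```python
import numpy as np
from skimage.metrics import peak_signal_noise_ratio, structural_similarity


def imperceptibility_metrics(x_clean, x_adv):
    """Mean pixel perturbation, PSNR, and SSIM between clean and adversarial
    images given as float arrays of shape (H, W, C) with values in [0, 1]."""
    mean_pixel_perturbation = float(np.abs(x_adv - x_clean).mean())
    psnr = float(peak_signal_noise_ratio(x_clean, x_adv, data_range=1.0))
    # channel_axis=-1 handles HWC color images (older scikit-image releases
    # use multichannel=True instead).
    ssim = float(structural_similarity(x_clean, x_adv,
                                       data_range=1.0, channel_axis=-1))
    return {"mean_pixel_perturbation": mean_pixel_perturbation,
            "psnr": psnr, "ssim": ssim}
```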
🛡️ Threat Analysis
Directly evaluates gradient-based adversarial perturbation attacks (FGSM, PGD, C&W) at inference time, constrained by binary masks to specific image regions to improve imperceptibility while causing misclassification.
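For the iterative case emphasized above, a mask-constrained PGD loop might look like the following sketch, again assuming a PyTorch classifier and inputs in [0, 1]. The step size, iteration count, and ℓ∞ projection radius are illustrative choices rather than the paper's reported configuration.

```python
import torch
import torch.nn.functional as F


def localized_pgd(model, x, y, mask, epsilon=8 / 255, alpha=2 / 255, steps=40):
    """Iterative (PGD-style) evasion attack with noise confined to a binary mask."""
    x = x.detach()
    delta = torch.zeros_like(x)
    for _ in range(steps):
        delta.requires_grad_(True)
        loss = F.cross_entropy(model(x + delta), y)
        grad = torch.autograd.grad(loss, delta)[0]
        # Gradient-ascent step inside the mask, then project onto the
        # epsilon ball and back into the valid pixel range.
        delta = (delta.detach() + alpha * grad.sign() * mask).clamp(-epsilon, epsilon)
        delta = ((x + delta).clamp(0.0, 1.0) - x).detach()
    return x + delta
```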