Erosion Attack for Adversarial Training to Enhance Semantic Segmentation Robustness
Yufei Song 1, Ziqi Zhou 1, Menghao Deng 2, Yifan Hu 1, Shengshan Hu 1, Minghui Li 1, Leo Yu Zhang 3
Published on arXiv
2601.14950
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
EroSeg-AT achieves significantly higher attack effectiveness and model robustness under adversarial training compared to existing segmentation adversarial training methods.
EroSeg-AT
Novel technique introduced
Existing segmentation models exhibit significant vulnerability to adversarial attacks. To improve robustness, adversarial training incorporates adversarial examples into model training. However, existing attack methods consider only global semantic information and ignore contextual semantic relationships within the samples, limiting the effectiveness of adversarial training. To address this issue, we propose EroSeg-AT, a vulnerability-aware adversarial training framework that leverages EroSeg to generate adversarial examples. EroSeg first selects sensitive pixels based on pixel-level confidence and then progressively propagates perturbations to higher-confidence pixels, effectively disrupting the semantic consistency of the samples. Experimental results show that, compared to existing methods, our approach significantly improves attack effectiveness and enhances model robustness under adversarial training.
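The core "erosion" idea from the abstract — perturb the least-confident pixels first, then progressively grow the perturbed region toward higher-confidence pixels — can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the function name, the mask-growth schedule, and the random-sign step (a stand-in for a real signed-gradient step computed from model gradients) are all assumptions.

```python
import numpy as np

def erosion_attack_sketch(probs, image, eps=8 / 255, steps=10, grow=0.1):
    """Hypothetical sketch of a confidence-guided erosion-style attack.

    probs: (H, W) per-pixel confidence of the predicted class.
    image: (H, W, C) input scaled to [0, 1].
    The perturbation mask starts at the lowest-confidence pixels and
    expands toward higher-confidence pixels over the iterations.
    """
    h, w = probs.shape
    order = np.argsort(probs, axis=None)  # flat pixel indices, least confident first
    adv = image.copy()
    rng = np.random.default_rng(0)
    for t in range(1, steps + 1):
        # Grow the perturbed region each step (illustrative linear schedule).
        k = int(min(1.0, t * grow) * h * w)
        mask = np.zeros(h * w, dtype=bool)
        mask[order[:k]] = True
        mask = mask.reshape(h, w, 1)  # broadcast over channels
        # Stand-in for a signed-gradient step; the real attack would use
        # gradients of the segmentation loss w.r.t. the input.
        step = (eps / steps) * rng.choice([-1.0, 1.0], size=adv.shape)
        adv = np.where(mask, adv + step, adv)
        # Project back into the eps-ball and the valid pixel range.
        adv = np.clip(adv, image - eps, image + eps)
        adv = np.clip(adv, 0.0, 1.0)
    return adv
```

Under this reading, the progressive mask is what distinguishes the approach from attacks that perturb all pixels uniformly: early iterations concentrate on already-fragile (low-confidence) pixels, and later iterations "erode" into confident regions to break semantic consistency.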
Key Contributions
- EroSeg: a pixel-confidence-aware adversarial attack that progressively propagates perturbations from low-confidence to high-confidence pixels to disrupt semantic consistency in segmentation models
- EroSeg-AT: a vulnerability-aware adversarial training framework that integrates EroSeg-generated examples to improve segmentation model robustness
- Demonstrates superior attack effectiveness over existing methods and improved post-training robustness on semantic segmentation benchmarks
🛡️ Threat Analysis
EroSeg generates adversarial examples at inference time by selectively perturbing sensitive pixels and propagating perturbations to disrupt semantic consistency; EroSeg-AT then uses these adversarial examples to harden segmentation models via adversarial training. Both the attack generation and the adversarial-training defense fall squarely within ML01 (Input Manipulation Attack).