Erosion Attack for Adversarial Training to Enhance Semantic Segmentation Robustness
Yufei Song 1, Ziqi Zhou 1, Menghao Deng 2, Yifan Hu 1, Shengshan Hu 1, Minghui Li 1, Leo Yu Zhang 3
Published on arXiv
2601.14950
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
EroSeg-AT achieves significantly higher attack effectiveness and model robustness under adversarial training compared to existing segmentation adversarial training methods.
EroSeg-AT
Novel technique introduced
Existing segmentation models exhibit significant vulnerability to adversarial attacks. To improve robustness, adversarial training incorporates adversarial examples into model training. However, existing attack methods consider only global semantic information and ignore contextual semantic relationships within the samples, limiting the effectiveness of adversarial training. To address this issue, we propose EroSeg-AT, a vulnerability-aware adversarial training framework that leverages EroSeg to generate adversarial examples. EroSeg first selects sensitive pixels based on pixel-level confidence and then progressively propagates perturbations to higher-confidence pixels, effectively disrupting the semantic consistency of the samples. Experimental results show that, compared to existing methods, our approach significantly improves attack effectiveness and enhances model robustness under adversarial training.
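The core "erosion" idea from the abstract — perturb the least-confident pixels first, then progressively grow the perturbed region toward higher-confidence pixels — can be sketched as follows. This is a minimal, hypothetical illustration, not the paper's implementation: the function name, the mask-growth schedule, and the random-sign step (a stand-in for a real signed-gradient step computed from model gradients) are all assumptions.

```python
import numpy as np

def erosion_attack_sketch(probs, image, eps=8 / 255, steps=10, grow=0.1):
    """Hypothetical sketch of a confidence-guided erosion-style attack.

    probs: (H, W) per-pixel confidence of the predicted class.
    image: (H, W, C) input scaled to [0, 1].
    The perturbation mask starts at the lowest-confidence pixels and
    expands toward higher-confidence pixels over the iterations.
    """
    h, w = probs.shape
    order = np.argsort(probs, axis=None)  # flat pixel indices, least confident first
    adv = image.copy()
    rng = np.random.default_rng(0)
    for t in range(1, steps + 1):
        # Grow the perturbed region each step (illustrative linear schedule).
        k = int(min(1.0, t * grow) * h * w)
        mask = np.zeros(h * w, dtype=bool)
        mask[order[:k]] = True
        mask = mask.reshape(h, w, 1)  # broadcast over channels
        # Stand-in for a signed-gradient step; the real attack would use
        # gradients of the segmentation loss w.r.t. the input.
        step = (eps / steps) * rng.choice([-1.0, 1.0], size=adv.shape)
        adv = np.where(mask, adv + step, adv)
        # Project back into the eps-ball and the valid pixel range.
        adv = np.clip(adv, image - eps, image + eps)
        adv = np.clip(adv, 0.0, 1.0)
    return adv
```

Under this reading, the progressive mask is what distinguishes the approach from attacks that perturb all pixels uniformly: early iterations concentrate on already-fragile (low-confidence) pixels, and later iterations "erode" into confident regions to break semantic consistency.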
Key Contributions
- EroSeg: a pixel-confidence-aware adversarial attack that progressively propagates perturbations from low-confidence to high-confidence pixels to disrupt semantic consistency in segmentation models
- EroSeg-AT: a vulnerability-aware adversarial training framework that integrates EroSeg-generated examples to improve segmentation model robustness
- Demonstrates superior attack effectiveness over existing methods and improved post-training robustness on semantic segmentation benchmarks
🛡️ Threat Analysis
EroSeg generates adversarial examples at inference time by selectively perturbing sensitive pixels and propagating perturbations to disrupt semantic consistency; EroSeg-AT then uses these adversarial examples to harden segmentation models via adversarial training. Both the attack generation and the adversarial-training defense fall squarely within ML01 (Input Manipulation Attack).