Transferable Physical-World Adversarial Patches Against Pedestrian Detection Models

Physical adversarial patch attacks pose a critical threat to pedestrian detection: they can cause surveillance and autonomous driving systems to miss pedestrians, creating severe safety risks. Despite their effectiveness in controlled settings, existing physical attacks face two major limitations in practice. First, they do not systematically disrupt the multi-stage decision pipeline, so the remaining modules can offset the perturbation. Second, they fail to model complex physical variations, which leads to poor robustness. To overcome these limitations, we propose TriPatch, a novel pedestrian adversarial patch generation method that combines multi-stage collaborative attacks with robustness enhancement under physical diversity. Specifically, we design a triplet loss consisting of detection confidence suppression, bounding-box offset amplification, and non-maximum suppression (NMS) disruption, which jointly act on different stages of the detection pipeline. In addition, we introduce an appearance consistency loss that constrains the color distribution of the patch, improving its adaptability under diverse imaging conditions, and we incorporate data augmentation to further enhance robustness against complex physical perturbations. Extensive experiments demonstrate that TriPatch achieves a higher attack success rate than existing approaches across multiple detector models.
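The abstract names the three terms of the triplet loss but not their formulas. The sketch below is one possible reading, written in NumPy purely for illustration: every function name, weight, and threshold is an assumption, and each term is a simple stand-in rather than the paper's formulation. It scores a set of candidate pedestrian boxes so that a lower value means a more successful attack at all three stages.

```python
import numpy as np

def iou_one_to_many(box, boxes):
    """IoU between one box and an (N, 4) array, all as [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area = (box[2] - box[0]) * (box[3] - box[1])
    areas = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area + areas - inter + 1e-9)

def greedy_nms(boxes, scores, iou_thr=0.5):
    """Standard greedy NMS; returns indices of the boxes that survive."""
    order = np.argsort(-scores)
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        if order.size == 1:
            break
        rest = order[1:]
        order = rest[iou_one_to_many(boxes[i], boxes[rest]) <= iou_thr]
    return keep

def triplet_attack_loss(boxes, scores, gt_box,
                        weights=(1.0, 1.0, 1.0), score_thr=0.25):
    """Hypothetical three-term objective; lower means a stronger attack."""
    w_conf, w_box, w_nms = weights
    # Stand-in 1 (confidence suppression): the strongest pedestrian score.
    l_conf = float(scores.max())
    # Stand-in 2 (box offset amplification): score-weighted IoU with the
    # true box, so minimising it pushes predictions away from the person.
    overlap = iou_one_to_many(gt_box, boxes)
    l_box = float(np.sum(scores * overlap) / (np.sum(scores) + 1e-9))
    # Stand-in 3 (NMS disruption): total score of detections that survive
    # NMS and clear the reporting threshold.
    kept = scores[greedy_nms(boxes, scores)]
    l_nms = float(np.sum(kept[kept >= score_thr]))
    return w_conf * l_conf + w_box * l_box + w_nms * l_nms
```

In a real attack these terms would be computed on differentiable detector outputs (for example in an autodiff framework) so that gradients can flow back to the patch pixels; the hard NMS selection used here is not differentiable and only illustrates which pipeline stage each term targets.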

Key Contributions

Multi-stage collaborative attack (TriPatch) combining detection confidence suppression, bounding-box offset amplification, and NMS disruption via a triplet loss
Appearance consistency loss constraining patch color distribution for robustness under diverse physical imaging conditions
Transferable physical adversarial patches achieving higher attack success rates across multiple pedestrian detector models
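The appearance consistency loss and the data augmentation listed above could look roughly like the following NumPy sketch. The summary does not specify either component, so the choices here are assumptions: the loss matches per-channel mean and standard deviation to hypothetical reference statistics (`ref_mean`, `ref_std`), and the augmentation applies random contrast, brightness, and sensor noise with illustrative ranges.

```python
import numpy as np

def appearance_consistency_loss(patch, ref_mean, ref_std):
    """Penalise drift of the patch colour statistics from a reference.

    patch: (H, W, 3) float array in [0, 1].
    ref_mean, ref_std: length-3 arrays of target per-channel statistics
    (hypothetical; e.g. measured from printable, natural-looking textures).
    """
    mean = patch.mean(axis=(0, 1))
    std = patch.std(axis=(0, 1))
    return float(np.sum((mean - ref_mean) ** 2) + np.sum((std - ref_std) ** 2))

def random_physical_augment(patch, rng):
    """Apply one random imaging perturbation (ranges are assumptions)."""
    contrast = rng.uniform(0.8, 1.2)       # lighting / camera response
    brightness = rng.uniform(-0.1, 0.1)    # exposure shift
    noise = rng.normal(0.0, 0.02, size=patch.shape)  # sensor noise
    out = contrast * (patch - 0.5) + 0.5 + brightness + noise
    return np.clip(out, 0.0, 1.0)

# Usage: average an attack objective over several augmented copies,
# in the spirit of expectation-over-transformation training.
rng = np.random.default_rng(0)
patch = rng.uniform(size=(32, 32, 3))
augmented = [random_physical_augment(patch, rng) for _ in range(4)]
```

Averaging the attack loss over such augmented copies is a common way to make a patch less sensitive to any single imaging condition; whether TriPatch uses these particular transformations is not stated in this summary.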

🛡️ Threat Analysis

Input Manipulation Attack

Creates adversarial patches that cause missed detections in pedestrian detection systems at inference time through input manipulation. The attack targets the detection pipeline with a triplet loss (confidence suppression, bounding-box offset amplification, NMS disruption) and is designed for physical-world deployment with robustness constraints.
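To make the notion of a missed detection measurable, the sketch below computes an attack success rate under one common convention, assumed here rather than taken from the paper: among pedestrians the detector finds in the clean image, the fraction it no longer finds once the patch is present. The function name and the detector labels in the usage example are hypothetical.

```python
def attack_success_rate(detected_clean, detected_patched):
    """Fraction of cleanly detected pedestrians that the patch hides.

    Both arguments are equal-length sequences of booleans, one entry per
    ground-truth pedestrian. Pedestrians missed even without the patch
    are excluded, since the attack cannot take credit for them.
    """
    attackable = [p for c, p in zip(detected_clean, detected_patched) if c]
    if not attackable:
        return 0.0
    missed = sum(1 for p in attackable if not p)
    return missed / len(attackable)

# Usage: evaluate the same patch against several detectors to gauge
# transferability (detector names and outcomes are made up).
results = {
    "detector_a": ([True, True, True, False], [False, True, False, False]),
    "detector_b": ([True, True, True, True], [False, False, False, True]),
}
asr = {name: attack_success_rate(c, p) for name, (c, p) in results.items()}
```

Reporting the rate per detector, rather than pooled, keeps a patch that only fools its source model from looking transferable.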

Details

Domains

vision

Model Types

cnn

Threat Tags

inference_time, untargeted, physical, black_box

Applications

2026 0 cit.
