
UV-Attack: Physical-World Adversarial Attacks for Person Detection via Dynamic-NeRF-based UV Mapping

Yanjie Li, Kaisheng Liang, Bin Xiao


Published on arXiv: 2501.05783

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

UV-Attack achieves 92.7% ASR against FastRCNN across varied dynamic poses, outperforming the prior state-of-the-art AdvCamou (28.5% ASR), and 49.5% ASR against YOLOv8 in black-box settings.

UV-Attack / EoPT

Novel technique introduced


In recent research, adversarial attacks on person detectors using patches or static 3D model-based texture modifications have struggled with low success rates due to the flexible nature of human movement. Modeling the 3D deformations caused by various actions has been a major challenge. Fortunately, advancements in Neural Radiance Fields (NeRF) for dynamic human modeling offer new possibilities. In this paper, we introduce UV-Attack, a groundbreaking approach that achieves high success rates even with extensive and unseen human actions. We address this challenge by leveraging dynamic-NeRF-based UV mapping. UV-Attack can generate human images across diverse actions and viewpoints, and even create novel actions by sampling from the SMPL parameter space. While dynamic NeRF models are capable of modeling human bodies, modifying clothing textures is challenging because the textures are embedded in the neural network's parameters. To tackle this, UV-Attack generates UV maps instead of RGB images and modifies the texture stacks. This approach enables real-time texture edits and makes the attack more practical. We also propose a novel Expectation over Pose Transformation (EoPT) loss to improve the evasion success rate on unseen poses and views. Our experiments show that UV-Attack achieves a 92.7% attack success rate against the FastRCNN model across varied poses in dynamic video settings, significantly outperforming the state-of-the-art AdvCamou attack, which achieves only a 28.5% ASR. Moreover, we achieve a 49.5% ASR against the latest YOLOv8 detector in black-box settings. This work highlights the potential of dynamic NeRF-based UV mapping for creating more effective adversarial attacks on person detectors, addressing key challenges in modeling human movement and texture modification. The code is available at https://github.com/PolyLiYJ/UV-Attack.
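The core texture trick in the abstract is that the attack edits a UV texture stack rather than the NeRF's RGB output: the model produces per-pixel UV coordinates, and the clothing pattern is applied by sampling a separately stored texture at those coordinates, so the texture stays editable in real time. The sketch below illustrates that lookup step only, with nearest-neighbor sampling in NumPy; the function name and array layouts are illustrative, not the paper's implementation.

```python
import numpy as np

def apply_uv_texture(uv_map, texture):
    """Render an image by sampling a texture at per-pixel UV coordinates.

    uv_map:  (H, W, 2) array of UV coordinates in [0, 1] -- a stand-in for
             the dynamic-NeRF UV-mapping stage's output.
    texture: (Th, Tw, 3) editable texture stack, e.g. the adversarial
             clothing pattern being optimized.
    Returns an (H, W, 3) rendered image.
    """
    th, tw = texture.shape[:2]
    # Map continuous UVs to integer texel indices (nearest neighbor).
    u = np.clip((uv_map[..., 0] * (tw - 1)).round().astype(int), 0, tw - 1)
    v = np.clip((uv_map[..., 1] * (th - 1)).round().astype(int), 0, th - 1)
    return texture[v, u]
```

Because the texture enters only through this lookup, swapping in a new adversarial pattern never requires touching the NeRF's weights, which is what makes the edit cheap.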


Key Contributions

  • UV-Attack: adversarial clothing texture generation via dynamic-NeRF-based UV mapping that handles arbitrary human poses without direct NeRF backpropagation
  • Expectation over Pose Transformation (EoPT) loss to improve evasion success on unseen poses and viewpoints by sampling novel poses from SMPL parameter space
  • Physical attack demonstration via printed Bohemian-style adversarial clothing, with 92.7% ASR against FastRCNN in dynamic video settings and 49.5% ASR against YOLOv8 in black-box settings
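The EoPT loss in the second contribution can be read as a Monte-Carlo expectation: sample poses from the SMPL parameter space and viewpoints, render the person wearing the current texture, and average the detector's person confidence across samples. A minimal sketch of that estimate, with stub `render` and `detector` callables standing in for the paper's differentiable renderer and target model:

```python
import numpy as np

rng = np.random.default_rng(0)

def eopt_loss(texture, render, detector, n_samples=8):
    """Monte-Carlo estimate of an Expectation-over-Pose-Transformation loss.

    Averages the detector's person-confidence score over randomly sampled
    poses and viewpoints. The Gaussian pose draw is a simplified stand-in
    for sampling the 72-dim SMPL axis-angle pose space.
    """
    scores = []
    for _ in range(n_samples):
        pose = rng.normal(size=72)        # stand-in SMPL pose parameters
        view = rng.uniform(0, 2 * np.pi)  # camera azimuth
        img = render(texture, pose, view)
        scores.append(detector(img))
    # Minimizing this expectation drives evasion on unseen poses/views.
    return float(np.mean(scores))
```

Averaging over transformations rather than optimizing for a single pose is what lets the texture generalize to actions never seen during optimization.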

🛡️ Threat Analysis

Input Manipulation Attack

Proposes gradient-optimized adversarial clothing textures (physical adversarial patches) that cause person detectors to fail at inference time across diverse poses and viewpoints — a direct physical-world input manipulation attack.
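The "gradient-optimized" part of this threat means the texture is updated by descending a detection loss, with the result clipped to a printable range. The toy loop below uses central finite differences in place of backpropagation through a renderer and detector, purely to make the optimization pattern concrete; it is a sketch, not the attack's actual optimizer.

```python
import numpy as np

def optimize_texture(texture, loss_fn, steps=20, lr=0.5, eps=1e-4):
    """Gradient-descent sketch for an adversarial texture.

    loss_fn maps a texture array to a scalar detection loss (e.g. an
    EoPT-style expectation). Finite differences substitute here for
    backpropagating through a differentiable render + detect pipeline.
    """
    tex = texture.astype(float).copy()
    for _ in range(steps):
        grad = np.zeros_like(tex)
        for idx in np.ndindex(tex.shape):
            orig = tex[idx]
            tex[idx] = orig + eps
            up = loss_fn(tex)
            tex[idx] = orig - eps
            down = loss_fn(tex)
            tex[idx] = orig
            grad[idx] = (up - down) / (2 * eps)
        # Descend the loss and keep texel values printable.
        tex = np.clip(tex - lr * grad, 0.0, 1.0)
    return tex
```

In the physical setting the clipped result is what gets printed onto fabric, which is why the projection back into a valid color range sits inside the loop.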


Details

Domains
vision
Model Types
cnn
Threat Tags
white_box, black_box, inference_time, physical, untargeted
Datasets
dynamic video (custom), SMPL parameter space samples
Applications
person detection