SilentDrift: Exploiting Action Chunking for Stealthy Backdoor Attacks on Vision-Language-Action Models
Bingxin Xu¹, Yuzhang Shang², Binghui Wang³, Emilio Ferrara¹
Published on arXiv
2601.14323
Model Poisoning
OWASP ML Top 10 — ML10
Data Poisoning Attack
OWASP ML Top 10 — ML02
Key Finding
Achieves 93.2% Attack Success Rate with a poisoning rate under 2% while preserving 95.3% Clean Task Success Rate on the LIBERO robotic manipulation benchmark.
SILENTDRIFT
Novel technique introduced
Vision-Language-Action (VLA) models are increasingly deployed in safety-critical robotic applications, yet their security vulnerabilities remain underexplored. We identify a fundamental security flaw in modern VLA systems: the combination of action chunking and delta pose representations creates an intra-chunk visual open-loop. This mechanism forces the robot to execute K-step action sequences without re-observing the scene, allowing per-step perturbations to accumulate through integration. We propose SILENTDRIFT, a stealthy black-box backdoor attack exploiting this vulnerability. Our method employs the Smootherstep function to construct perturbations with guaranteed C2 continuity, ensuring zero velocity and acceleration at trajectory boundaries to satisfy strict kinematic consistency constraints. Furthermore, our keyframe attack strategy selectively poisons only the critical approach phase, maximizing impact while minimizing trigger exposure. The resulting poisoned trajectories are visually indistinguishable from successful demonstrations. Evaluated on the LIBERO benchmark, SILENTDRIFT achieves a 93.2% Attack Success Rate with a poisoning rate under 2%, while maintaining a 95.3% Clean Task Success Rate.
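The intra-chunk open-loop mechanism described in the abstract can be illustrated with a minimal sketch. The chunk length `K` and the bias magnitude below are hypothetical values chosen for illustration, not the paper's parameters; the point is only that delta-pose commands integrated without visual feedback accumulate a per-step bias linearly.

```python
import numpy as np

# Illustrative sketch of intra-chunk open-loop drift (values are hypothetical):
# during a K-step action chunk, the robot integrates K delta-pose commands
# without re-observing the scene, so a small per-step bias compounds.
K = 8                                  # chunk length (assumed for illustration)
bias = np.array([0.002, 0.0, 0.0])     # 2 mm per-step perturbation (assumed)

pose = np.zeros(3)
for _ in range(K):
    delta = np.zeros(3) + bias         # clean delta action plus injected bias
    pose = pose + delta                # open-loop integration within the chunk

# After the chunk, the end-effector has drifted by K * bias,
# even though each individual step's perturbation is tiny.
print(pose)
```

Because the policy only re-observes the scene between chunks, none of the intermediate poses is checked against vision, which is why per-step perturbations well below perceptual thresholds can still produce a task-relevant deviation by chunk's end.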
Key Contributions
- Identifies a novel vulnerability class in VLA architectures: action chunking combined with delta pose representations creates an intra-chunk visual open-loop that allows per-step perturbations to accumulate through integration.
- Proposes the Smootherstep-based perturbation construction with guaranteed C2 continuity (zero velocity and acceleration at trajectory boundaries) to produce kinematically consistent, visually indistinguishable poisoned demonstrations.
- Introduces a keyframe attack strategy that selectively poisons only the critical approach phase, minimizing trigger exposure while maximizing attack impact — achieving 93.2% ASR at under 2% poisoning rate on LIBERO.
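The Smootherstep polynomial named in the contributions is a standard interpolant, s(t) = 6t⁵ − 15t⁴ + 10t³, whose first and second derivatives vanish at t = 0 and t = 1; that property is exactly the "zero velocity and acceleration at trajectory boundaries" the paper relies on. The blending function `perturb_trajectory` below is our own illustrative sketch of how such a ramp could splice a drift offset into a trajectory segment, not the authors' implementation.

```python
import numpy as np

def smootherstep(t):
    """Smootherstep polynomial 6t^5 - 15t^4 + 10t^3 on [0, 1].
    Both s'(t) and s''(t) are zero at t=0 and t=1, so a perturbation
    ramped with it has zero velocity and acceleration at its boundaries."""
    t = np.clip(t, 0.0, 1.0)
    return 6 * t**5 - 15 * t**4 + 10 * t**3

def perturb_trajectory(traj, drift, start, end):
    """Illustrative sketch (not the paper's code): blend a constant drift
    vector into traj[start:end] with a smootherstep ramp-up / ramp-down,
    keeping the poisoned segment kinematically consistent at its edges."""
    traj = traj.copy()
    t = np.linspace(0.0, 1.0, end - start)
    # Rise over the first half of the window, fall over the second half;
    # the weight and its first two derivatives vanish at both boundaries.
    weight = smootherstep(2 * t) * smootherstep(2 * (1 - t))
    traj[start:end] += weight[:, None] * drift
    return traj
```

For example, poisoning only a 10-step approach window of a 20-step demonstration leaves every pose outside the window untouched, which matches the keyframe strategy of confining the trigger to the critical approach phase.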
🛡️ Threat Analysis
The attack vector is training data corruption: the authors inject poisoned demonstration trajectories at under 2% poisoning rate into the training dataset, making data poisoning an explicit and evaluated component of the attack (not merely implicit to backdoor injection).
SILENTDRIFT is a targeted backdoor attack that poisons VLA training demonstrations to embed hidden malicious behavior (trajectory deviation) that activates on a specific trigger while maintaining 95.3% clean task success — textbook backdoor/trojan injection.