AdaptDel: Adaptable Deletion Rate Randomized Smoothing for Certified Robustness

We consider the problem of certified robustness for sequence classification against edit distance perturbations. Naturally occurring inputs vary in length (e.g., sentences in natural language processing tasks), which poses a challenge for current methods: they employ fixed-rate deletion mechanisms, and this leads to suboptimal performance. To address this, we introduce AdaptDel methods with adaptable deletion rates that dynamically adjust based on input properties. We extend the theoretical framework of randomized smoothing to variable-rate deletion, ensuring sound certification with respect to edit distance. We achieve strong empirical results on natural language tasks, observing an improvement of up to 30 orders of magnitude in the median cardinality of the certified region over state-of-the-art certifications.
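To make the idea of an input-dependent deletion rate concrete, the following is a minimal sketch of deletion-based smoothing in which the deletion probability grows with input length. The schedule in `deletion_rate`, its parameters `target_kept` and `max_rate`, and the helper names are assumptions chosen for illustration; they are not the schedule or API from the paper, and the sketch does not compute a certificate.

```python
import random
from collections import Counter


def deletion_rate(length, target_kept=8, max_rate=0.95):
    """Hypothetical schedule (an assumption, not the paper's): delete more
    from longer inputs so that roughly `target_kept` tokens survive."""
    if length <= target_kept:
        return 0.0
    return min(max_rate, 1.0 - target_kept / length)


def delete_tokens(tokens, rate, rng):
    """Drop each token independently with probability `rate`."""
    return [t for t in tokens if rng.random() >= rate]


def smoothed_predict(base_classifier, tokens, n_samples=200, seed=0):
    """Monte Carlo estimate of the smoothed classifier's majority vote.

    Returns the most frequent label and its empirical vote share.
    """
    rng = random.Random(seed)
    rate = deletion_rate(len(tokens))
    votes = Counter(
        base_classifier(delete_tokens(tokens, rate, rng))
        for _ in range(n_samples)
    )
    label, count = votes.most_common(1)[0]
    return label, count / n_samples
```

A fixed-rate mechanism corresponds to replacing `deletion_rate` with a constant. Here `base_classifier` is any callable that maps a token list to a label; turning the vote share into a sound edit distance certificate requires the paper's analysis and is outside the scope of this sketch.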

Key Contributions

AdaptDel: a randomized smoothing framework with input-adaptive deletion rates for certified robustness against edit distance perturbations
Theoretical extension of randomized smoothing to variable-rate deletion with sound edit distance certification guarantees
Up to 30 orders of magnitude improvement in median cardinality of the certified region over state-of-the-art fixed-rate methods on NLP tasks
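The "cardinality of the certified region" can be read as the number of distinct sequences within the certified edit distance radius of an input. As a rough illustration only, the sketch below enumerates that set by brute force for a tiny alphabet and expresses a ratio of two cardinalities in orders of magnitude. The function names are hypothetical, and how the paper actually computes or bounds cardinality is not stated in this summary; enumeration is feasible only for very small inputs.

```python
import math


def edit_ball(seq, radius, alphabet):
    """Return all strings within Levenshtein distance `radius` of `seq`.

    Brute-force breadth-first expansion over single edits; exponential
    in `radius`, so suitable only for toy inputs.
    """
    ball = {seq}
    frontier = {seq}
    for _ in range(radius):
        reached = set()
        for s in frontier:
            for i in range(len(s) + 1):
                for a in alphabet:
                    reached.add(s[:i] + a + s[i:])  # insertion
            for i in range(len(s)):
                reached.add(s[:i] + s[i + 1:])  # deletion
                for a in alphabet:
                    reached.add(s[:i] + a + s[i + 1:])  # substitution
        frontier = reached - ball
        ball |= frontier
    return ball


def orders_of_magnitude(card_a, card_b):
    """Difference in log10 cardinality between two certified regions."""
    return math.log10(card_a) - math.log10(card_b)
```

For the string "ab" over the alphabet {a, b}, `len(edit_ball("ab", 1, "ab"))` is 9: the string itself, 2 deletions, 2 substitutions, and 4 distinct insertions. Because cardinality grows rapidly with both radius and vocabulary size, modest gains in certified radius can translate into very large ratios when measured this way.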

🛡️ Threat Analysis

Input Manipulation Attack

The paper defends against adversarial perturbations (edit distance attacks) on text sequence inputs at inference time and provides certified robustness guarantees via randomized smoothing, a canonical defense scenario for ML01 (Input Manipulation Attack).