AdaptDel: Adaptable Deletion Rate Randomized Smoothing for Certified Robustness
Zhuoqun Huang, Neil G. Marchant, Olga Ohrimenko, Benjamin I. P. Rubinstein
Published on arXiv
2511.09316
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
AdaptDel achieves up to 30 orders of magnitude improvement in median certified region cardinality over state-of-the-art fixed-rate deletion smoothing methods on natural language tasks.
AdaptDel
Novel technique introduced
We consider the problem of certified robustness for sequence classification against edit distance perturbations. Naturally occurring inputs vary in length (e.g., sentences in natural language processing tasks), which challenges current methods that employ fixed-rate deletion mechanisms and leads to suboptimal performance. To this end, we introduce AdaptDel methods with adaptable deletion rates that dynamically adjust based on input properties. We extend the theoretical framework of randomized smoothing to variable-rate deletion, ensuring sound certification with respect to edit distance. We achieve strong empirical results in natural language tasks, observing up to 30 orders of magnitude improvement in the median cardinality of the certified region over state-of-the-art certifications.
Key Contributions
- AdaptDel: a randomized smoothing framework with input-adaptive deletion rates for certified robustness against edit distance perturbations
- Theoretical extension of randomized smoothing to variable-rate deletion with sound edit distance certification guarantees
- Up to 30 orders of magnitude improvement in median cardinality of the certified region over state-of-the-art fixed-rate methods on NLP tasks
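To make the idea of an input-adaptive deletion rate concrete, the following is a minimal Python sketch of deletion-based smoothing with a rate that depends on input length. It is an illustration under stated assumptions, not the paper's method: the function names (`deletion_rate`, `delete_tokens`, `smoothed_predict`), the parameters `base_rate` and `min_keep`, and the rate schedule itself are all hypothetical. The sketch covers only the prediction step; it does not compute the certified radius that the paper derives.

```python
import random
from collections import Counter
from typing import Callable, Hashable, List, Sequence


def deletion_rate(length: int, base_rate: float = 0.9, min_keep: int = 4) -> float:
    """Hypothetical length-dependent deletion rate (not the paper's schedule).

    Short inputs get a lower rate so that roughly `min_keep` tokens survive
    in expectation; long inputs are capped at `base_rate`.
    """
    if length <= 0:
        return 0.0
    return max(0.0, min(base_rate, 1.0 - min_keep / length))


def delete_tokens(tokens: Sequence[str], rate: float, rng: random.Random) -> List[str]:
    """Drop each token independently with probability `rate`."""
    return [t for t in tokens if rng.random() >= rate]


def smoothed_predict(
    classify: Callable[[List[str]], Hashable],
    tokens: Sequence[str],
    n_samples: int = 200,
    seed: int = 0,
) -> Hashable:
    """Majority vote of the base classifier over randomly deleted copies."""
    rng = random.Random(seed)
    rate = deletion_rate(len(tokens))  # the rate adapts to this input's length
    votes = Counter(
        classify(delete_tokens(tokens, rate, rng)) for _ in range(n_samples)
    )
    return votes.most_common(1)[0][0]
```

A fixed-rate scheme would correspond to `deletion_rate` returning the same value for every input. With a high fixed rate, a short sentence would often be deleted almost entirely, which is the kind of length sensitivity the adaptive variant is meant to address.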
🛡️ Threat Analysis
The paper defends against adversarial perturbations (edit distance attacks) on text sequence inputs at inference time, providing certified robustness guarantees via randomized smoothing — a canonical ML01 defense scenario.
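For readers unfamiliar with the threat model, the sketch below computes edit distance between two token sequences, assuming the standard Levenshtein definition (unit-cost insertions, deletions, and substitutions). The function name `edit_distance` is chosen for illustration. A certificate of radius r then states that the smoothed prediction is unchanged for every input within distance r of the original.

```python
from typing import Sequence


def edit_distance(a: Sequence[str], b: Sequence[str]) -> int:
    """Levenshtein distance via dynamic programming, one row at a time."""
    # prev[j] holds the distance between the first i-1 items of a
    # and the first j items of b.
    prev = list(range(len(b) + 1))
    for i, x in enumerate(a, start=1):
        curr = [i]  # deleting all i items of a to reach an empty prefix of b
        for j, y in enumerate(b, start=1):
            curr.append(
                min(
                    prev[j] + 1,             # delete x
                    curr[j - 1] + 1,         # insert y
                    prev[j - 1] + (x != y),  # substitute x with y, or match
                )
            )
        prev = curr
    return prev[-1]
```

For example, turning `["the", "film", "was", "good"]` into `["the", "movie", "was", "not", "good"]` takes one substitution and one insertion, so the distance is 2.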