CertMask: Certifiable Defense Against Adversarial Patches via Theoretically Optimal Mask Coverage
Xuntao Lyu 1, Ching-Chi Lin 2, Abdullah Al Arafat 1,3, Georg von der Brüggen 2, Jian-Jia Chen 2,4, Zhishan Guo 1,2
1 North Carolina State University
2 Technische Universität Dortmund
Published on arXiv: 2511.09834
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
CertMask improves certified robust accuracy by up to +13.4% over PatchCleanser while reducing inference complexity from O(n²) to O(n) using a single round of k-fold coverage masking.
CertMask
Novel technique introduced
Adversarial patch attacks inject localized perturbations into images to mislead deep vision models. These attacks can be physically deployed, posing serious risks to real-world applications. In this paper, we propose CertMask, a certifiably robust defense that constructs a provably sufficient set of binary masks to neutralize patch effects with strong theoretical guarantees. While the state-of-the-art approach (PatchCleanser) requires two rounds of masking and incurs $O(n^2)$ inference cost, CertMask performs only a single round of masking with $O(n)$ time complexity, where $n$ is the cardinality of the mask set needed to cover an input image. Our proposed mask set is computed using a mathematically rigorous coverage strategy that ensures each possible patch location is covered at least $k$ times, providing both efficiency and robustness. We offer a theoretical analysis of the coverage condition and prove its sufficiency for certification. Experiments on ImageNet, ImageNette, and CIFAR-10 show that CertMask improves certified robust accuracy by up to +13.4% over PatchCleanser, while maintaining clean accuracy nearly identical to the vanilla model.
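To make the coverage idea concrete, here is a minimal 1-D sketch of how a mask set can be placed so that every possible patch location is fully covered at least once (the 1-fold case; the paper generalizes this to k-fold coverage in 2-D). The function names and the stride derivation below are illustrative assumptions, not the authors' exact algorithm.

```python
def mask_positions(image_len: int, mask_len: int, patch_len: int) -> list:
    """Start offsets of masks of length mask_len such that every patch of
    length patch_len is fully covered by at least one mask (1-fold case).
    Illustrative sketch; not the paper's exact construction."""
    assert mask_len >= patch_len, "mask must be at least as large as the patch"
    # A mask starting at s fully covers a patch starting at p iff
    # s <= p <= s + mask_len - patch_len, so consecutive mask starts may be
    # spaced by at most mask_len - patch_len + 1 without leaving a gap.
    stride = mask_len - patch_len + 1
    last_start = image_len - mask_len
    positions = list(range(0, last_start + 1, stride))
    if positions[-1] != last_start:  # make sure the right image edge is covered
        positions.append(last_start)
    return positions

def coverage_counts(image_len: int, mask_len: int, patch_len: int,
                    positions: list) -> list:
    """For each possible patch start, count how many masks fully cover it."""
    return [sum(1 for s in positions if s <= p <= s + mask_len - patch_len)
            for p in range(image_len - patch_len + 1)]
```

For example, `mask_positions(10, 4, 2)` returns `[0, 3, 6]`, and `coverage_counts` confirms every patch start is covered at least once. The number of such positions grows linearly in the image size, which is where the $O(n)$ mask-set cardinality comes from.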
Key Contributions
- CertMask: a single-round masking defense achieving O(n) inference cost versus O(n²) for PatchCleanser, while remaining model-agnostic and requiring no retraining
- Theoretical k-fold coverage framework proving that a mask set covering every possible adversarial patch location at least k times is sufficient for certification, with a formal analysis of the coverage condition
- Empirical improvement of certified robust accuracy by up to +13.4% over PatchCleanser on ImageNet, ImageNette, and CIFAR-10 while maintaining near-vanilla clean accuracy
🛡️ Threat Analysis
Adversarial patches are localized perturbations injected at inference time to mislead vision models — a canonical input manipulation attack. CertMask provides a certifiable defense by constructing binary mask sets that guarantee k-fold coverage of any patch location, with provable robustness bounds against worst-case adversarial patches.
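The single-round certification logic can be sketched as follows: run the classifier once per masked image (O(n) model calls for n masks) and certify the label only if all masked predictions agree, since at least one mask fully occludes any admissible patch and thus a worst-case patch cannot flip a unanimous outcome. This is a hedged 1-D toy, assuming a `(start, length)` mask representation and an arbitrary `predict` callable; it is not the paper's implementation.

```python
from typing import Callable, List, Tuple

Mask = Tuple[int, int]  # (start, length) in 1-D, for illustration only

def apply_mask(image: List[float], mask: Mask) -> List[float]:
    """Zero out the masked region (one common choice of mask value)."""
    start, length = mask
    out = list(image)
    out[start:start + length] = [0.0] * length
    return out

def certify_single_round(predict: Callable[[List[float]], int],
                         image: List[float],
                         masks: List[Mask]):
    """Single round of masking: one model call per mask, O(n) total.
    If every masked image yields the same label, return it as certified;
    otherwise abstain (return no certified label)."""
    preds = {predict(apply_mask(image, m)) for m in masks}
    if len(preds) == 1:
        return preds.pop(), True   # unanimous -> certified label
    return None, False             # disagreement -> abstain / fall back
```

In contrast, a two-round scheme such as PatchCleanser re-masks each first-round masked image with every mask again, which is the source of its $O(n^2)$ inference cost.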