CertMask: Certifiable Defense Against Adversarial Patches via Theoretically Optimal Mask Coverage
Xuntao Lyu 1, Ching-Chi Lin 2, Abdullah Al Arafat 1,3, Georg von der Brüggen 2, Jian-Jia Chen 2,4, Zhishan Guo 1,2
1 North Carolina State University
2 Technische Universität Dortmund
Published on arXiv: 2511.09834
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
CertMask improves certified robust accuracy by up to +13.4% over PatchCleanser while reducing inference complexity from O(n²) to O(n) using a single round of k-fold coverage masking.
CertMask
Novel technique introduced
Adversarial patch attacks inject localized perturbations into images to mislead deep vision models. These attacks can be physically deployed, posing serious risks to real-world applications. In this paper, we propose CertMask, a certifiably robust defense that constructs a provably sufficient set of binary masks to neutralize patch effects with strong theoretical guarantees. While the state-of-the-art approach (PatchCleanser) requires two rounds of masking and incurs $O(n^2)$ inference cost, CertMask performs only a single round of masking with $O(n)$ time complexity, where $n$ is the cardinality of the mask set needed to cover an input image. Our proposed mask set is computed using a mathematically rigorous coverage strategy that ensures each possible patch location is covered at least $k$ times, providing both efficiency and robustness. We offer a theoretical analysis of the coverage condition and prove its sufficiency for certification. Experiments on ImageNet, ImageNette, and CIFAR-10 show that CertMask improves certified robust accuracy by up to +13.4% over PatchCleanser, while maintaining clean accuracy nearly identical to the vanilla model.
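To make the coverage idea concrete, here is a minimal 1-D sketch of how a mask set can be placed so that every possible patch location is fully covered at least once (the 1-fold case; the paper generalizes this to k-fold coverage in 2-D). The function names and the stride derivation below are illustrative assumptions, not the authors' exact algorithm.

```python
def mask_positions(image_len: int, mask_len: int, patch_len: int) -> list:
    """Start offsets of masks of length mask_len such that every patch of
    length patch_len is fully covered by at least one mask (1-fold case).
    Illustrative sketch; not the paper's exact construction."""
    assert mask_len >= patch_len, "mask must be at least as large as the patch"
    # A mask starting at s fully covers a patch starting at p iff
    # s <= p <= s + mask_len - patch_len, so consecutive mask starts may be
    # spaced by at most mask_len - patch_len + 1 without leaving a gap.
    stride = mask_len - patch_len + 1
    last_start = image_len - mask_len
    positions = list(range(0, last_start + 1, stride))
    if positions[-1] != last_start:  # make sure the right image edge is covered
        positions.append(last_start)
    return positions

def coverage_counts(image_len: int, mask_len: int, patch_len: int,
                    positions: list) -> list:
    """For each possible patch start, count how many masks fully cover it."""
    return [sum(1 for s in positions if s <= p <= s + mask_len - patch_len)
            for p in range(image_len - patch_len + 1)]
```

For example, `mask_positions(10, 4, 2)` returns `[0, 3, 6]`, and `coverage_counts` confirms every patch start is covered at least once. The number of such positions grows linearly in the image size, which is where the $O(n)$ mask-set cardinality comes from.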
Key Contributions
- CertMask: a single-round masking defense achieving O(n) inference cost versus O(n²) for PatchCleanser, while remaining model-agnostic and requiring no retraining
- Theoretical k-fold coverage framework proving that a mask set covering every possible adversarial patch location at least k times is sufficient for certification, with a formal analysis of the coverage condition
- Empirical improvement of certified robust accuracy by up to +13.4% over PatchCleanser on ImageNet, ImageNette, and CIFAR-10 while maintaining near-vanilla clean accuracy
🛡️ Threat Analysis
Adversarial patches are localized perturbations injected at inference time to mislead vision models — a canonical input manipulation attack. CertMask provides a certifiable defense by constructing binary mask sets that guarantee k-fold coverage of any patch location, with provable robustness bounds against worst-case adversarial patches.
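The single-round certification logic can be sketched as follows: run the classifier once per masked image (O(n) model calls for n masks) and certify the label only if all masked predictions agree, since at least one mask fully occludes any admissible patch and thus a worst-case patch cannot flip a unanimous outcome. This is a hedged 1-D toy, assuming a `(start, length)` mask representation and an arbitrary `predict` callable; it is not the paper's implementation.

```python
from typing import Callable, List, Tuple

Mask = Tuple[int, int]  # (start, length) in 1-D, for illustration only

def apply_mask(image: List[float], mask: Mask) -> List[float]:
    """Zero out the masked region (one common choice of mask value)."""
    start, length = mask
    out = list(image)
    out[start:start + length] = [0.0] * length
    return out

def certify_single_round(predict: Callable[[List[float]], int],
                         image: List[float],
                         masks: List[Mask]):
    """Single round of masking: one model call per mask, O(n) total.
    If every masked image yields the same label, return it as certified;
    otherwise abstain (return no certified label)."""
    preds = {predict(apply_mask(image, m)) for m in masks}
    if len(preds) == 1:
        return preds.pop(), True   # unanimous -> certified label
    return None, False             # disagreement -> abstain / fall back
```

In contrast, a two-round scheme such as PatchCleanser re-masks each first-round masked image with every mask again, which is the source of its $O(n^2)$ inference cost.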