From Detection to Correction: Backdoor-Resilient Face Recognition via Vision-Language Trigger Detection and Noise-Based Neutralization
Farah Wahida, M.A.P. Chamikara, Yashothara Shanmugarasa, Mohan Baruwal Chhetri, Thilina Ranbaduge, Ibrahim Khalil
Published on arXiv: 2508.05409
Model Poisoning
OWASP ML Top 10 — ML10
Key Finding
TrueBiometric achieves 100% accuracy in detecting and correcting backdoor-poisoned images in face recognition systems without degrading clean image accuracy
TrueBiometric
Novel technique introduced
Biometric systems, such as face recognition systems powered by deep neural networks (DNNs), rely on large and highly sensitive datasets. Backdoor attacks can subvert these systems by manipulating the training process: by inserting a small trigger, such as a sticker, make-up, or a patterned mask, into a few training images, an adversary can later present the same trigger during authentication to be falsely recognized as another individual, thereby gaining unauthorized access. Existing defenses against backdoor attacks still struggle to precisely identify and mitigate poisoned images without compromising data utility, which undermines the overall reliability of the system. We propose a novel and generalizable approach, TrueBiometric: Trustworthy Biometrics, which accurately detects poisoned images using a majority voting mechanism over multiple state-of-the-art large vision-language models. Once identified, poisoned samples are corrected using targeted and calibrated corrective noise. Our extensive empirical results demonstrate that TrueBiometric detects and corrects poisoned images with 100% accuracy without compromising accuracy on clean images. Compared to existing state-of-the-art approaches, TrueBiometric offers a more practical, accurate, and effective solution for mitigating backdoor attacks in face recognition systems.
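The detection step described above can be sketched in a few lines. This is a minimal illustration of majority voting over per-model verdicts, not the paper's implementation: the `vlm_clients` callables stand in for real VLM queries (prompting details, trigger taxonomy, and the paper's tie-breaking rule are not reproduced here), and the conservative tie-break toward "poisoned" is our assumption.

```python
from collections import Counter

def majority_vote(verdicts):
    """Return the majority label ('poisoned' or 'clean') from VLM verdicts.

    Ties are broken conservatively toward 'poisoned' -- an assumption for
    this sketch; the paper may resolve ties differently.
    """
    counts = Counter(verdicts)
    return "poisoned" if counts["poisoned"] >= counts["clean"] else "clean"

def detect_poisoned(image, vlm_clients):
    """Ask each VLM whether the image shows a trigger artifact, then vote.

    `vlm_clients` is a list of callables mapping an image to a verdict
    string; in practice each would wrap a prompt to a large vision-language
    model (hypothetical interface, not from the paper).
    """
    verdicts = [client(image) for client in vlm_clients]
    return majority_vote(verdicts)
```

With stub clients, `detect_poisoned("face.png", [lambda i: "poisoned", lambda i: "clean", lambda i: "poisoned"])` returns `"poisoned"`; using an odd number of models avoids most ties.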
Key Contributions
- Majority voting mechanism over multiple large VLMs to accurately detect poisoned (backdoor-triggered) training images in face recognition datasets
- Noise-based corrective neutralization that corrects detected poisoned samples without requiring retraining or degrading clean-image accuracy
- Empirical demonstration of 100% detection and correction accuracy on backdoor-poisoned face images across multiple attack types, including MakeupAttack
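The noise-based correction in the second contribution can be illustrated as follows. This is a hedged sketch only: the paper's method for localizing the trigger and calibrating the noise is not reproduced, and the `trigger_mask`, `noise_scale`, and mean-matched Gaussian noise are our assumptions. The point it demonstrates is targeted neutralization, i.e. only the suspected trigger pixels are overwritten while clean pixels are left untouched.

```python
import numpy as np

def neutralize(image, trigger_mask, noise_scale=0.1, rng=None):
    """Overwrite a suspected trigger region with calibrated random noise.

    image        -- 2D float array with values in [0, 1] (grayscale sketch)
    trigger_mask -- boolean array marking suspected trigger pixels
    noise_scale  -- std. dev. of the corrective noise (assumed parameter)

    Returns a corrected copy; pixels outside the mask are unchanged.
    """
    rng = np.random.default_rng() if rng is None else rng
    corrected = image.copy()
    # Center the noise on the masked region's mean so the correction
    # blends in rather than leaving an obvious flat patch.
    noise = rng.normal(loc=image[trigger_mask].mean(),
                       scale=noise_scale,
                       size=int(trigger_mask.sum()))
    corrected[trigger_mask] = np.clip(noise, 0.0, 1.0)
    return corrected
```

Because the mask gates every write, clean-image accuracy is unaffected by construction in this sketch, which mirrors the paper's claim that correction does not degrade clean samples.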
🛡️ Threat Analysis
Directly defends against backdoor/trojan attacks on face recognition DNNs: the paper proposes trigger detection via VLM majority voting and calibrated corrective noise to neutralize poisoned training samples carrying trigger patterns (stickers, makeup, patterned masks).