Monowar Bhuyan

h-index: 2 · 12 citations · 4 papers (total)

Papers in Database (1)

defense · arXiv · Oct 5, 2025

Unmasking Backdoors: An Explainable Defense via Gradient-Attention Anomaly Scoring for Pre-trained Language Models

Anindya Sundar Das, Kangjie Chen, Monowar Bhuyan · Umeå University · Nanyang Technological University

Inference-time backdoor defense for encoder PLMs that combines attention and gradient anomaly scores to detect trigger tokens

Model Poisoning · nlp
1 citation