Mihai Surdeanu

h-index: 3 · 35 citations · 7 papers (total)

Papers in Database (1)

defense · arXiv · Jan 24, 2026

A Lightweight Explainable Guardrail for Prompt Safety

Md Asiful Islam, Mihai Surdeanu · University of Arizona

Lightweight multi-task guardrail that classifies unsafe prompts and explains each decision by highlighting the words that drive it

Prompt Injection · nlp