defense · arXiv · Nov 24, 2025
Adarsh Kumarappan, Ayushi Mehrotra · California Institute of Technology
Probabilistic (k, ε)-unstable certificate tightens SmoothLLM's jailbreak defense guarantees for both GCG and PAIR attacks
Input Manipulation Attack · Prompt Injection · nlp
The SmoothLLM defense provides a certification guarantee against jailbreaking attacks, but it relies on a strict "k-unstable" assumption that rarely holds in practice, limiting the trustworthiness of the resulting safety certificate. In this work, we address this limitation by introducing a more realistic probabilistic framework, "(k, $\varepsilon$)-unstable," to certify defenses against diverse jailbreaking attacks, from gradient-based (GCG) to semantic (PAIR). We derive a new, data-informed lower bound on SmoothLLM's defense probability by incorporating empirical models of attack success, yielding a more trustworthy and practical safety certificate. The (k, $\varepsilon$)-unstable framework gives practitioners actionable safety guarantees, enabling them to set certification thresholds that better reflect the real-world behavior of LLMs. Ultimately, this work contributes a practical and theoretically grounded mechanism for making LLMs more resistant to exploitation of their safety alignment, a critical challenge in secure AI deployment.
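To make the shape of such a certificate concrete, the sketch below shows one way a (k, ε)-unstable lower bound could be computed. This is a minimal illustration under stated assumptions, not the paper's code: the i.i.d. per-character perturbation model, the constant residual success probability eps, and all function names are ours.

```python
# Minimal sketch of a (k, eps)-unstable certificate for SmoothLLM-style
# smoothing. Assumptions (not from the paper): each suffix character is
# independently perturbed with rate q, and once >= k characters are
# flipped the attack still succeeds with probability at most eps.
from math import comb

def per_copy_defense_prob(m: int, q: float, k: int, eps: float) -> float:
    """Lower bound on the probability that one perturbed copy defeats the attack.

    m   -- length of the adversarial suffix (characters)
    q   -- per-character perturbation rate
    k   -- instability threshold (attack breaks once >= k characters flip)
    eps -- residual attack success probability given >= k flips;
           eps = 0 recovers the strict k-unstable assumption.
    """
    # P[at least k of the m suffix characters are perturbed]
    p_ge_k = sum(comb(m, i) * q**i * (1 - q)**(m - i) for i in range(k, m + 1))
    # Even with >= k flips, the attack may still succeed with prob <= eps.
    return p_ge_k * (1 - eps)

def certified_defense_prob(n_copies: int, alpha: float) -> float:
    """P[a majority of n i.i.d. perturbed copies are safe]: a binomial tail."""
    t_min = n_copies // 2 + 1
    return sum(comb(n_copies, t) * alpha**t * (1 - alpha)**(n_copies - t)
               for t in range(t_min, n_copies + 1))

# Example: a 20-character GCG suffix, 20% swap rate, k = 2, eps = 0.05.
alpha = per_copy_defense_prob(m=20, q=0.20, k=2, eps=0.05)
print(f"per-copy defense prob >= {alpha:.3f}")
print(f"certified (9 copies)  >= {certified_defense_prob(9, alpha):.3f}")
```

Setting eps = 0 collapses the bound back to the strict k-unstable certificate, so the probabilistic relaxation strictly generalizes it; a data-informed eps (e.g., fitted from empirical attack-success measurements, as the abstract describes) is what would make the certificate both tighter to reality and more trustworthy.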
llm · transformer
defense · arXiv · Oct 5, 2025
Ayushi Mehrotra, Derek Peng, Dipkamal Bhusal et al. · California Institute of Technology · University of California · Rochester Institute of Technology
Defends against adversarial patches by masking top concept activation vectors, requiring no prior knowledge of patch size or location
Input Manipulation Attack · vision
Adversarial patch attacks pose a practical threat to deep learning models by forcing targeted misclassifications through localized perturbations, often realized in the physical world. Existing defenses typically assume prior knowledge of patch size or location, limiting their applicability. In this work, we propose a patch-agnostic defense that leverages concept-based explanations to identify and suppress the most influential concept activation vectors, thereby neutralizing patch effects without explicit detection. Evaluated on Imagenette with a ResNet-50, our method achieves higher robust and clean accuracy than the state-of-the-art PatchCleanser, while maintaining strong performance across varying patch sizes and locations. Our results highlight the promise of combining interpretability with robustness and suggest concept-driven defenses as a scalable strategy for securing machine learning models against adversarial patch attacks.
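As a rough illustration of the mechanism described above, the PyTorch sketch below scores pooled features against a bank of concept activation vectors (CAVs) and projects out the most strongly activated directions before classification. It is an assumption-laden sketch, not the authors' implementation: where the backbone is split, how the CAVs are learned, and the top-k suppression rule are all unspecified in the abstract.

```python
# Hedged sketch of CAV masking as a patch-agnostic defense. The CAV bank,
# backbone split point, and top-k rule are illustrative assumptions.
import torch
import torchvision.models as models

model = models.resnet50(weights=None).eval()  # load pretrained weights in practice
backbone = torch.nn.Sequential(*list(model.children())[:-1])  # conv stack + avgpool
head = model.fc

@torch.no_grad()
def classify_with_cav_masking(x: torch.Tensor, cavs: torch.Tensor, top_k: int = 3):
    """Suppress the top-k most strongly activated concept directions,
    neutralizing a patch's dominant concepts without locating the patch."""
    feats = backbone(x).flatten(1)                  # (B, 2048) pooled features
    scores = feats @ cavs.T                         # (B, n_concepts) activations
    top = scores.abs().topk(top_k, dim=1).indices   # dominant concepts per image
    for b in range(x.shape[0]):
        for c in top[b]:
            v = cavs[c]                             # unit-norm concept direction
            # Sequential orthogonal projection (treats CAVs as ~orthogonal).
            feats[b] -= (feats[b] @ v) * v
    return head(feats)

# Usage with random stand-ins; replace with learned CAVs and real images.
cavs = torch.nn.functional.normalize(torch.randn(50, 2048), dim=1)
logits = classify_with_cav_masking(torch.randn(2, 3, 224, 224), cavs)
```

Because the masking depends only on which concepts fire most strongly, it needs no estimate of patch size or location, which is the property the abstract highlights over detection-based defenses such as PatchCleanser.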
cnn