Adarsh Kumarappan

h-index: 1 · 16 citations · 5 papers (total)

Papers in Database (2)

benchmark arXiv Nov 24, 2025 · Nov 2025

Automating Deception: Scalable Multi-Turn LLM Jailbreaks

Adarsh Kumarappan, Ananya Mujoo · California Institute of Technology · Evergreen Valley College

Automated pipeline that generates 1,500 psychologically grounded multi-turn foot-in-the-door (FITD) jailbreaks; GPT-family models show a 32-percentage-point increase in attack success rate (ASR) with conversational history

Prompt Injection nlp
2 citations PDF

defense arXiv Nov 24, 2025 · Nov 2025

Towards Realistic Guarantees: A Probabilistic Certificate for SmoothLLM

Adarsh Kumarappan, Ayushi Mehrotra · California Institute of Technology

Probabilistic (k,ε)-unstable certificate tightens SmoothLLM's jailbreak defense guarantees for both GCG and PAIR attacks

Input Manipulation Attack Prompt Injection nlp
1 citation PDF Code