Latest papers

1 paper
benchmark · arXiv · Nov 24, 2025

Automating Deception: Scalable Multi-Turn LLM Jailbreaks

Adarsh Kumarappan, Ananya Mujoo · California Institute of Technology · Evergreen Valley College

Automated pipeline that generates 1,500 psychologically grounded multi-turn foot-in-the-door (FITD) jailbreaks; GPT-family models show a 32-percentage-point increase in attack success rate (ASR) when conversational history is included

Prompt Injection · nlp
2 citations · PDF