Rohitash Chandra

h-index: 4 · 40 citations · 21 papers (total)

Papers in Database (1)

defense arXiv Oct 2, 2025 · Oct 2025

Machine Learning for Detection and Analysis of Novel LLM Jailbreaks

John Hawkins, Aditya Pramar, Rodney Beard et al. · Pingla Institute · UNSW

Fine-tunes BERT to detect LLM jailbreak prompts, finding reflexivity in prompt structure as a key discriminating signal

Prompt Injection nlp
1 citation · PDF