John Hawkins

h-index: 3 · 51 citations · 20 papers (total)

Papers in Database (1)

defense arXiv Oct 2, 2025 · Oct 2025

Machine Learning for Detection and Analysis of Novel LLM Jailbreaks

John Hawkins, Aditya Pramar, Rodney Beard et al. · Pingla Institute · UNSW

Fine-tunes BERT to detect LLM jailbreak prompts, finding reflexivity in prompt structure as a key discriminating signal

Prompt Injection · nlp
1 citation · PDF