ML Security Papers

Latest papers

1 paper

defense · Consumer Communications and Ne... · Nov 14, 2025

NegBLEURT Forest: Leveraging Inconsistencies for Detecting Jailbreak Attacks

Lama Sleem, Jerome Francois, Lujun Li et al. · University of Luxembourg · Institut National Polytechnique de Toulouse +1 more

Detects LLM jailbreaks via negation-aware BLEURT scoring and Isolation Forest anomaly detection without threshold tuning

Prompt Injection · nlp