ML Security Papers

Latest papers

2 papers

benchmark arXiv Feb 15, 2026

Max Fomin · Zenity

Leave-one-dataset-out (LODO) evaluation exposes an 8.4-percentage-point AUC inflation in prompt injection classifiers and reveals that production guardrails miss 63–93% of indirect attacks

Prompt Injection nlp

attack arXiv Feb 2, 2026

Tomer Kordonsky, Maayan Yamin, Noam Benzimra et al. · Technion – Israel Institute of Technology · Zenity

Black-box attack that exploits recurring templates in LLM-generated code to predict hidden backend vulnerabilities from observable frontend features

Sensitive Information Disclosure nlp