ML Security Papers

Latest papers

2 papers

benchmark arXiv Dec 31, 2025

Muhammad Abdullahi Said, Muhammad Sammani Sani · African Institute for Mathematical Sciences · University of Vienna

Audits LLM safety across Hausa and English and across temporal framings, finding that past-tense framing bypasses defenses, with only 15.6% of responses safe

Prompt Injection nlp

defense arXiv Oct 20, 2025

Asim Mohamed, Martin Gubri · African Institute for Mathematical Sciences · Parameter Lab

Defends LLM text watermarks against translation attacks in low-resource languages via back-translation detection (STEAM)

Output Integrity Attack nlp