Latest papers

2 papers
benchmark arXiv Dec 31, 2025 · Dec 2025

Safe in the Future, Dangerous in the Past: Dissecting Temporal and Linguistic Vulnerabilities in LLMs

Muhammad Abdullahi Said, Muhammad Sammani Sani · African Institute for Mathematical Sciences · University of Vienna

Audits LLM safety across Hausa and English and across temporal frames, revealing that past-tense framing bypasses defenses, yielding only 15.6% safe responses

Prompt Injection nlp
defense arXiv Oct 20, 2025 · Oct 2025

Is Multilingual LLM Watermarking Truly Multilingual? A Simple Back-Translation Solution

Asim Mohamed, Martin Gubri · African Institute for Mathematical Sciences · Parameter Lab

Defends LLM text watermarks against translation attacks in low-resource languages using STEAM, a back-translation-based detection method

Output Integrity Attack nlp