Latest papers

3 papers
attack arXiv Mar 14, 2026

Sirens' Whisper: Inaudible Near-Ultrasonic Jailbreaks of Speech-Driven LLMs

Zijian Ling, Pingyi Hu, Xiuyong Gao et al. · Huazhong University of Science and Technology · Tsinghua University +1 more

Inaudible near-ultrasonic acoustic channel attack that delivers jailbreak prompts to speech-driven LLMs through commodity hardware

Input Manipulation Attack Prompt Injection nlp audio multimodal
PDF
defense arXiv Oct 6, 2025

Unified Threat Detection and Mitigation Framework (UTDMF): Combating Prompt Injection, Deception, and Bias in Enterprise-Scale Transformers

Santhosh Kumar Ravindran · Microsoft Corporation

Activation-patching framework detecting and mitigating prompt injection, deception, and bias in enterprise LLMs with 92% injection detection accuracy

Prompt Injection Excessive Agency nlp
PDF
attack arXiv Aug 8, 2025

Fact2Fiction: Targeted Poisoning Attack to Agentic Fact-checking System

Haorui He, Yupeng Li, Bin Benjamin Zhu et al. · Hong Kong Baptist University · The University of Hong Kong +1 more

Poisons RAG knowledge bases of LLM fact-checkers by mimicking claim decomposition and exploiting justifications to craft targeted malicious evidence

Data Poisoning Attack Prompt Injection nlp
PDF Code