ML Security Papers

Latest papers

4 papers

defense arXiv Jan 27, 2026 · Jan 2026

Nirhoshan Sivaroopan, Kanchana Thilakarathna, Albert Zomaya et al. · University of New South Wales · University of Wollongong

Multi-agent auto-healing defense framework that detects and adapts to sponge attacks exhausting LLM compute resources

Model Denial of Service nlp

benchmark arXiv Dec 29, 2025 · Dec 2025

Manu, Yi Guo, Kanchana Thilakarathna et al. · The University of Sydney · University of New South Wales +1 more

Benchmarks black-box LLM DoS attacks using evolutionary and RL-based prompt search to suppress EOS and inflate output length

Model Denial of Service nlp

1 citation · 1 influential

attack arXiv Oct 14, 2025 · Oct 2025

Tuan T. Nguyen, John Le, Thai T. Vu et al. · VNPT AI · University of Wollongong

Embedding-space adversarial suffix attack steers LLM activations away from refusal directions to achieve jailbreaks with fewer queries

Input Manipulation Attack · Prompt Injection nlp

defense arXiv Aug 1, 2025 · Aug 2025

Haocheng Jiang, Hua Shen, Jixin Zhang et al. · Hubei University of Technology · University of Wollongong

Defends federated learning in settings where 90% of clients are malicious (Byzantine), using membership inference sensitivity to detect poisoned model updates

Data Poisoning Attack federated-learning