Latest papers

4 papers
defense arXiv Jan 27, 2026 · Jan 2026

SHIELD: An Auto-Healing Agentic Defense Framework for LLM Resource Exhaustion Attacks

Nirhoshan Sivaroopan, Kanchana Thilakarathna, Albert Zomaya et al. · University of New South Wales · University of Wollongong

Multi-agent, auto-healing defense framework that detects and adapts to sponge attacks that exhaust LLM compute resources

Model Denial of Service nlp
PDF
benchmark arXiv Dec 29, 2025 · Dec 2025

Prompt-Induced Over-Generation as Denial-of-Service: A Black-Box Attack-Side Benchmark

Manu, Yi Guo, Kanchana Thilakarathna et al. · The University of Sydney · University of New South Wales +1 more

Benchmarks black-box LLM DoS attacks using evolutionary and RL-based prompt search to suppress EOS and inflate output length

Model Denial of Service nlp
1 citation · 1 influential
PDF
attack arXiv Oct 14, 2025 · Oct 2025

RAID: Refusal-Aware and Integrated Decoding for Jailbreaking LLMs

Tuan T. Nguyen, John Le, Thai T. Vu et al. · VNPT AI · University of Wollongong

Embedding-space adversarial suffix attack steers LLM activations away from refusal directions to achieve jailbreaks with fewer queries

Input Manipulation Attack · Prompt Injection · nlp
PDF
defense arXiv Aug 1, 2025 · Aug 2025

FedGuard: A Diverse-Byzantine-Robust Mechanism for Federated Learning with Major Malicious Clients

Haocheng Jiang, Hua Shen, Jixin Zhang et al. · Hubei University of Technology · University of Wollongong

Defends federated learning when 90% of clients are malicious (Byzantine), using membership inference sensitivity to detect poisoned model updates

Data Poisoning Attack federated-learning
PDF