ML Security Papers

Latest papers

5 papers

attack arXiv Jan 14, 2026

Robert Dilworth · Mississippi State University

Defeats ML-based authorship attribution by injecting zero-width Unicode characters to corrupt stylometric fingerprints at 33%+ word coverage

Input Manipulation Attack nlp

attack arXiv Dec 3, 2025

Robert Dilworth · Mississippi State University

Attacks ML-based authorship attribution by obfuscating writing style via machine translation, paraphrasing, and Unicode zero-width steganography

Input Manipulation Attack nlp

1 citation

attack arXiv Oct 20, 2025

Elias Hossain, Swayamjit Saha, Somshubhra Roy et al. · University of Central Florida · Mississippi State University +1 more

Attacks LLM inference by corrupting KV cache key vectors at runtime, bypassing prompt filters and causing output degradation across GPT-2 and LLaMA-2

Input Manipulation Attack nlp

2 citations

defense arXiv Sep 23, 2025

Tom Pawelek, Raj Patel, Charlotte Crowell et al. · Mississippi State University · The University of Alabama

Defends agentic LLMs against prompt injection via contextual prompt whitelisting, allowing only pre-approved interaction patterns

Prompt Injection · Excessive Agency · nlp

4 citations · 1 influential

attack arXiv Aug 19, 2025

Robert Dilworth · Mississippi State University

Proposes Unicode zero-width steganography layered with adversarial stylometry to evade ML-based authorship attribution systems

Input Manipulation Attack nlp