ML Security Papers

Latest papers

4 papers

defense arXiv Feb 4, 2026

Wenting Li, Saif R. Kazi, Russell Bent et al. · University of Texas at Austin · Los Alamos National Laboratory +1 more

Branch-and-bound neural network verifier using NLP-CC upper bounds to certify or disprove adversarial robustness more efficiently than MIP methods

Input Manipulation Attack vision

defense IACR ePrint Dec 9, 2025 · Dec 2025

Miranda Christ, Noah Golowich, Sam Gunn et al. · Columbia University · Microsoft Research +5 more

Constructs provably robust LLM watermarks with subexponential security, surviving worst-case edits and detection-key-aware adversaries

Output Integrity Attack nlp

attack arXiv Oct 12, 2025 · Oct 2025

Mohan Zhang, Yihua Zhang, Jinghan Jia et al. · University of North Carolina at Chapel Hill · Michigan State University +1 more

Backdoor-implanted attack on large reasoning models forcing perpetual CoT loops, achieving 100% resource exhaustion success rate

Model Poisoning Model Denial of Service nlp

1 citation

benchmark arXiv Aug 9, 2025 · Aug 2025

Ishwar Balappanawar, Venkata Hasith Vattikuti, Greta Kintzley et al. · IIIT Hyderabad · University of Texas at Austin +1 more

Adversarial auditing game framework detects backdoored CNNs and misaligned LLMs using model diffing, gradients, and adversarial probing

Model Poisoning Prompt Injection vision nlp