ML Security Papers

Latest papers

2 papers

attack · arXiv · Oct 29, 2025

André V. Duarte, Xuying Li, Bin Zeng et al. · Carnegie Mellon University · Instituto Superior Técnico +1 more

Agentic feedback-loop pipeline extracts memorized copyrighted books from LLMs, improving ROUGE-L by 24% over single-pass extraction

Model Inversion Attack · Sensitive Information Disclosure · nlp

benchmark · arXiv · Aug 30, 2025

Yuting Tan, Xuying Li, Zhuo Li et al. · HydroX AI

Systematic appraisal of GCG adversarial suffix attacks on LLMs, revealing overestimated attack success in evaluations and a vulnerability in coding prompts

Input Manipulation Attack · Prompt Injection · nlp