ML Security Papers

Latest papers

4 papers

defense · arXiv · Mar 3, 2026

Xinjie Zhu, Zijing Zhao, Hui Jin et al. · Lenovo

Embeds scalable blind-extractable watermarks into AI-generated videos during diffusion sampling for robust content provenance tracing

Output Integrity Attack · vision · generative

defense · arXiv · Jan 20, 2026

Zhaopeng Zhang, Pengcheng Sun, Lan Zhang et al. · University of Science and Technology of China · Lenovo

Defends LLMs over knowledge bases from unauthorized data leakage using training-free activation steering to enforce multi-class permissions

Sensitive Information Disclosure · Prompt Injection · nlp

defense · ACM MM · Dec 29, 2025

Zongsheng Cao, Yangfan He, Anran Liu et al. · University of Minnesota · Lenovo

Training-free prompt purification removes toxic semantic embeddings in text-to-image (T2I) diffusion models to prevent unsafe image generation

Prompt Injection · generative · multimodal

3 citations · 1 influential

attack · arXiv · Aug 5, 2025

Jiewei Lai, Lan Zhang, Chen Tang et al. · University of Science and Technology of China · Lenovo

Multiplicative attack eliminates generative model fingerprints from DeepFakes, defeating attribution forensics with 97% average success rate

Output Integrity Attack · vision · generative