Latest papers

4 papers
defense arXiv Mar 3, 2026 · 4w ago

SIGMark: Scalable In-Generation Watermark with Blind Extraction for Video Diffusion

Xinjie Zhu, Zijing Zhao, Hui Jin et al. · Lenovo

Embeds scalable blind-extractable watermarks into AI-generated videos during diffusion sampling for robust content provenance tracing

Output Integrity Attack vision generative
PDF Code
defense arXiv Jan 20, 2026 · 10w ago

Activation-Space Anchored Access Control for Multi-Class Permission Reasoning in Large Language Models

Zhaopeng Zhang, Pengcheng Sun, Lan Zhang et al. · University of Science and Technology of China · Lenovo

Defends LLMs over knowledge bases from unauthorized data leakage using training-free activation steering to enforce multi-class permissions

Sensitive Information Disclosure Prompt Injection nlp
PDF
defense ACM MM Dec 29, 2025

PurifyGen: A Risk-Discrimination and Semantic-Purification Model for Safe Text-to-Image Generation

Zongsheng Cao, Yangfan He, Anran Liu et al. · University of Minnesota · Lenovo

Training-free prompt purification removes toxic semantic embeddings in T2I diffusion models to prevent unsafe image generation

Prompt Injection generative multimodal
3 citations · 1 influential
PDF Code
attack arXiv Aug 5, 2025

Untraceable DeepFakes via Traceable Fingerprint Elimination

Jiewei Lai, Lan Zhang, Chen Tang et al. · University of Science and Technology of China · Lenovo

Multiplicative attack eliminates generative model fingerprints from DeepFakes, defeating attribution forensics with 97% average success rate

Output Integrity Attack vision generative
PDF