Latest papers

3 papers
survey · arXiv · Sep 30, 2025

Secure and Robust Watermarking for AI-generated Images: A Comprehensive Survey

Jie Cao, Qi Li, Zelin Zhang et al. · Queen’s University

Surveys secure and robust watermarking techniques for AI-generated images, covering methods, evaluation, and attack vulnerabilities

Output Integrity Attack · vision · generative
1 citation · PDF

defense · arXiv · Aug 27, 2025

Robustness Assessment and Enhancement of Text Watermarking for Google's SynthID

Xia Han, Qi Li, Jianbing Ni et al. · Queen’s University

Exposes the fragility of the SynthID-Text watermark under paraphrasing attacks, then proposes SynGuard, a hybrid defense that improves F1 by 11.1%

Output Integrity Attack · nlp
PDF · Code

defense · arXiv · Aug 21, 2025

SafeLLM: Unlearning Harmful Outputs from Large Language Models against Jailbreak Attacks

Xiangman Li, Xiaodong Wu, Qi Li et al. · Queen’s University

Defends LLMs against jailbreak attacks via token-level FFN unlearning that irreversibly removes harmful knowledge pathways

Prompt Injection · nlp
PDF