Latest papers

6 papers
attack · arXiv · Mar 10, 2026

Compatibility at a Cost: Systematic Discovery and Exploitation of MCP Clause-Compliance Vulnerabilities

Nanzi Yang, Weiheng Bai, Kangjie Lu · University of Minnesota

Systematically exploits MCP SDK non-compliance vulnerabilities to launch silent prompt injection and DoS attacks against LLM agents

Insecure Plugin Design · Prompt Injection · nlp
PDF
defense · ACM MM · Dec 29, 2025

PurifyGen: A Risk-Discrimination and Semantic-Purification Model for Safe Text-to-Image Generation

Zongsheng Cao, Yangfan He, Anran Liu et al. · University of Minnesota · Lenovo

Training-free prompt purification removes toxic semantic embeddings in T2I diffusion models to prevent unsafe image generation

Prompt Injection · generative · multimodal
3 citations · 1 influential · PDF · Code
benchmark · arXiv · Nov 7, 2025

Leak@k: Unlearning Does Not Make LLMs Forget Under Probabilistic Decoding

Hadi Reisizadeh, Jiajun Ruan, Yiwei Chen et al. · University of Minnesota · Michigan State University +1 more

Exposes that all major LLM unlearning methods still leak private/hazardous training data under probabilistic sampling; introduces the leak@k metric and RULE defense (toy leak@k sketch below)

Model Inversion Attack Sensitive Information Disclosure nlp
1 citation · PDF
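A minimal sketch of how a leak@k-style metric could be computed, assuming (by analogy with pass@k) that a prompt counts as leaking when at least one of k sampled completions reveals the supposedly unlearned content; the estimator, function name, and example counts below are illustrative and not taken from the paper.

```python
from math import comb

def leak_at_k(n: int, c: int, k: int) -> float:
    """Unbiased leak@k estimate for a single prompt (assumed pass@k analogue).

    n -- completions sampled from the unlearned model under probabilistic decoding
    c -- how many of those completions leaked the forgotten content
    k -- evaluation budget (completions an auditor gets to inspect)

    Probability that at least one of k completions drawn without
    replacement from the n samples is a leak: 1 - C(n-c, k) / C(n, k).
    """
    if n - c < k:
        return 1.0  # every size-k draw necessarily contains a leak
    return 1.0 - comb(n - c, k) / comb(n, k)

# Hypothetical per-prompt leak counts out of 100 samples each.
leak_counts = [0, 3, 1, 0, 12]
scores = [leak_at_k(100, c, k=10) for c in leak_counts]
print(f"mean leak@10 = {sum(scores) / len(scores):.3f}")
```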
defense · arXiv · Oct 2, 2025

Detecting Post-generation Edits to Watermarked LLM Outputs via Combinatorial Watermarking

Liyan Xie, Muhammad Siddeek, Mohamed Seif et al. · University of Minnesota · Princeton University +2 more

Combinatorial vocabulary-partitioning watermark for LLM text that detects and localizes post-generation edits and spoofing attacks (toy localization sketch below)

Output Integrity Attack · nlp
1 citation · PDF
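A toy illustration of the general idea behind edit localization in partition-based text watermarks, assuming a keyed two-way vocabulary split (in the spirit of green-list watermarking) and a sliding window whose in-partition token fraction drops where text was modified; the hashing scheme, window size, and threshold are stand-ins, not the paper's combinatorial construction.

```python
import hashlib

def in_green_partition(token: str, key: str = "demo-key") -> bool:
    """Assign a token to one of two keyed vocabulary partitions."""
    digest = hashlib.sha256((key + token).encode()).digest()
    return digest[0] % 2 == 0  # roughly half the vocabulary is "green"

def localize_edits(tokens, window: int = 20, threshold: float = 0.6):
    """Flag windows whose green-token fraction falls below the watermark level.

    A watermarked generator biased toward green tokens leaves a high
    green fraction everywhere; post-generation edits pull the local
    fraction back toward the unwatermarked baseline (~0.5), so windows
    below the threshold point to the edited spans.
    """
    if not tokens:
        return []
    flagged = []
    for start in range(max(1, len(tokens) - window + 1)):
        chunk = tokens[start:start + window]
        green_frac = sum(in_green_partition(t) for t in chunk) / len(chunk)
        if green_frac < threshold:
            flagged.append((start, start + len(chunk), green_frac))
    return flagged

# Hypothetical usage on whitespace-tokenized model output:
# edited_spans = localize_edits(generated_text.split())
```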
defense · arXiv · Sep 30, 2025

CODED-SMOOTHING: Coding Theory Helps Generalization

Parsa Moradi, Tayyebeh Jahaninezhad, Mohammad Ali Maddah-Ali · University of Minnesota · Technical University Berlin

Regularization module from coded computing theory that defends against gradient-based adversarial attacks by enforcing smooth model representations

Input Manipulation Attack · vision
PDF
defense · arXiv · Jan 3, 2025

Training-Free Defense Against Adversarial Attacks in Deep Learning MRI Reconstruction

Mahdi Saberi, Chi Zhang, Mehmet Akçakaya · University of Minnesota

Training-free adversarial defense for MRI reconstruction models using cyclic measurement consistency, outperforming retrained baselines (illustrative sketch below)

Input Manipulation Attack · vision
1 citation · PDF
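A heavily simplified sketch of what a cyclic measurement-consistency check might look like, assuming the defense reconstructs, re-simulates the k-space measurements from that reconstruction, reconstructs again, and flags inputs whose two reconstructions disagree; the operators, network interface, and threshold are placeholders rather than the paper's actual formulation.

```python
import torch

def cyclic_consistency_score(y, recon, forward_op, adjoint_op):
    """Toy cyclic measurement-consistency score for an MRI reconstruction model.

    y           -- acquired (possibly adversarially perturbed) k-space data
    recon       -- reconstruction network mapping a zero-filled image to an image
    forward_op  -- forward operator (image -> undersampled k-space)
    adjoint_op  -- adjoint / zero-filled inverse feeding the network

    Clean measurements are nearly cycle-consistent; adversarial
    perturbations typically are not, so a large score flags an attack.
    """
    with torch.no_grad():
        x1 = recon(adjoint_op(y))        # first-pass reconstruction
        y_cycle = forward_op(x1)         # re-simulated measurements
        x2 = recon(adjoint_op(y_cycle))  # second-pass reconstruction
    return ((x1 - x2).norm() / x1.norm()).item()

# Hypothetical usage with a tuned detection threshold tau:
# if cyclic_consistency_score(y, recon, forward_op, adjoint_op) > tau:
#     ...treat the input as adversarial or fall back to a robust solver
```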