Latest papers

5 papers
attack arXiv Mar 30, 2026

ReproMIA: A Comprehensive Analysis of Model Reprogramming for Proactive Membership Inference Attacks

Chihan Huang, Huaijin Wang, Shuai Wang · HKUST

Novel membership inference attack using model reprogramming to amplify privacy leakage signals across LLMs, diffusion models, and classifiers

Membership Inference Attack nlp vision generative
attack arXiv Feb 3, 2026

Controlling Output Rankings in Generative Engines for LLM-based Search

Haibo Jin, Ruoxi Chen, Peiyan Zhang et al. · University of Illinois at Urbana-Champaign · Starc Institute +2 more

Injects crafted content into product pages to manipulate LLM-based search rankings, with a 91% promotion success rate

Input Manipulation Attack Prompt Injection nlp
defense arXiv Nov 27, 2025

Rethinking Cross-Generator Image Forgery Detection through DINOv3

Zhenglin Huang, Jason Li, Haiquan Wen et al. · University of Liverpool · Nanyang Technological University +3 more

Finds that a frozen DINOv3 detects cross-generator image forgeries via low-frequency cues; proposes a training-free token-ranking baseline

Output Integrity Attack vision generative
attack EMNLP Oct 4, 2025

Backdoor-Powered Prompt Injection Attacks Nullify Defense Methods

Yulin Chen, Haoran Li, Yuan Sui et al. · National University of Singapore · HKUST

Backdoor injected via SFT data poisoning makes LLMs execute injected instructions, defeating instruction-hierarchy defenses against prompt injection

Model Poisoning Prompt Injection nlp
1 citation
benchmark arXiv Aug 17, 2025

MCPSecBench: A Systematic Security Benchmark and Playground for Testing Model Context Protocols

Yixuan Yang, Cuifeng Gao, Daoyuan Wu et al. · Eurecom · Lingnan University +2 more

Benchmarks MCP security across Claude, OpenAI, and Cursor, uncovering 17 attack types with existing defenses below 30% effectiveness

Insecure Plugin Design Prompt Injection nlp