Cong Wang

tool arXiv Jan 6, 2025 · Jan 2025

Xiang Zheng, Longxiang Wang, Yi Liu et al. · City University of Hong Kong · Fudan University +1 more

RL-based auditing tool that automatically discovers black-box LLM prompts eliciting toxic or politically sensitive outputs

Prompt Injection nlp

attack arXiv Mar 18, 2026 · 19d ago

Zirui Gong, Leo Yu Zhang, Yanjun Zhang et al. · Griffith University · Swinburne University of Technology +2 more

Gradient inversion attack reconstructing training data from federated learning updates via sparse activation recovery without architectural changes

Model Inversion Attack visionfederated-learning

defense arXiv Aug 8, 2025 · Aug 2025

Haoran Shi, Hongwei Yao, Shuo Shao et al. · arXiv · Zhejiang University +3 more

Defends LLM-MCP tool integrations against indirect prompt injection by detecting adversarial conversation drift in latent polytope space

Insecure Plugin Design Prompt Injection nlp

benchmark arXiv Aug 1, 2025 · Aug 2025

Junhao Zheng, Jiahao Sun, Chenhao Lin et al. · Xi’an Jiaotong University · City University of Hong Kong +1 more

First unified benchmark evaluating 11 patch defenses against 13 adversarial patch attacks on object detectors with 94K-image dataset

Input Manipulation Attack vision

defense arXiv Aug 1, 2025 · Aug 2025

Chende Zheng, Ruiqi suo, Chenhao Lin et al. · Xi’an Jiaotong University · Ltd. +1 more

Training-free AI-generated video detector exploiting second-order temporal feature divergence between real and synthetic videos

Output Integrity Attack visiongenerative

Papers in Database (5)