Cong Wang

Papers in Database (5)

tool arXiv Jan 6, 2025

CALM: Curiosity-Driven Auditing for Large Language Models

Xiang Zheng, Longxiang Wang, Yi Liu et al. · City University of Hong Kong · Fudan University +1 more

RL-based auditing tool that automatically discovers prompts eliciting toxic or politically sensitive outputs from black-box LLMs

Prompt Injection nlp
attack arXiv Mar 18, 2026

ARES: Scalable and Practical Gradient Inversion Attack in Federated Learning through Activation Recovery

Zirui Gong, Leo Yu Zhang, Yanjun Zhang et al. · Griffith University · Swinburne University of Technology +2 more

Gradient inversion attack reconstructing training data from federated learning updates via sparse activation recovery without architectural changes

Model Inversion Attack vision federated-learning
defense arXiv Aug 8, 2025

Quantifying Conversation Drift in MCP via Latent Polytope

Haoran Shi, Hongwei Yao, Shuo Shao et al. · arXiv · Zhejiang University +3 more

Defends LLM-MCP tool integrations against indirect prompt injection by detecting adversarial conversation drift in latent polytope space

Insecure Plugin Design Prompt Injection nlp
benchmark arXiv Aug 1, 2025

Revisiting Adversarial Patch Defenses on Object Detectors: Unified Evaluation, Large-Scale Dataset, and New Insights

Junhao Zheng, Jiahao Sun, Chenhao Lin et al. · Xi’an Jiaotong University · City University of Hong Kong +1 more

First unified benchmark evaluating 11 patch defenses against 13 adversarial patch attacks on object detectors, with a 94K-image dataset

Input Manipulation Attack vision
defense arXiv Aug 1, 2025

D3: Training-Free AI-Generated Video Detection Using Second-Order Features

Chende Zheng, Ruiqi Suo, Chenhao Lin et al. · Xi’an Jiaotong University · Ltd. +1 more

Training-free AI-generated video detector exploiting second-order temporal feature divergence between real and synthetic videos

Output Integrity Attack vision generative