Latest papers

6 papers
attack · arXiv · Mar 30, 2026

InkDrop: Invisible Backdoor Attacks Against Dataset Condensation

He Yang, Dongyi Lv, Song Ma et al. · Xi'an Jiaotong University · Tsinghua University

Stealthy backdoor attack on dataset condensation using boundary-proximate samples and imperceptible perturbations to evade detection

Model Poisoning · vision
PDF · Code
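The summary above names boundary-proximate sample selection. One plausible selection criterion is the margin between the top-1 and top-2 logits; this is a minimal numpy sketch of that idea, not the paper's exact scoring rule:

```python
import numpy as np

def boundary_proximate(logits, k):
    """Pick the k samples closest to the decision boundary, measured by the
    margin between the top-1 and top-2 logits. Small margin = the classifier
    is least certain = the sample sits near the boundary. (Illustrative
    criterion; InkDrop's actual score may differ.)"""
    logits = np.asarray(logits, dtype=float)
    sorted_logits = np.sort(logits, axis=1)          # ascending per row
    margin = sorted_logits[:, -1] - sorted_logits[:, -2]
    return np.argsort(margin)[:k]                    # indices, smallest margin first
```

Selecting poisoned carriers this way concentrates the backdoor signal on samples the condensation process is most sensitive to, which is the intuition the summary points at.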
tool · arXiv · Mar 19, 2026

MedForge: Interpretable Medical Deepfake Detection via Forgery-aware Reasoning

Zhihui Chen, Kai He, Qingyuan Lei et al. · National University of Singapore · The Chinese University of Hong Kong +3 more

Detects medical image deepfakes via localize-then-analyze reasoning with expert-aligned explanations on synthetic lesion edits

Output Integrity Attack · vision · multimodal
PDF · Code
benchmark · arXiv · Feb 14, 2026

DWBench: Holistic Evaluation of Watermark for Dataset Copyright Auditing

Xiao Ren, Xinyi Yu, Linkang Du et al. · Zhejiang University · Xi'an Jiaotong University +1 more

Benchmarks 25 dataset watermarking methods for copyright auditing across classification and generation tasks with new evaluation metrics

Output Integrity Attack · vision
PDF
defense · arXiv · Feb 2, 2026

AGT^AO: Robust and Stabilized LLM Unlearning via Adversarial Gating Training with Adaptive Orthogonality

Pengyu Li, Lingling Zhang, Zhitao Gao et al. · Xi'an Jiaotong University · Shaanxi Province Key Laboratory of Big Data Knowledge Engineering

Defends LLMs against adversarial recovery of memorized sensitive data via min-max gating and gradient orthogonality during unlearning

Model Inversion Attack · Sensitive Information Disclosure · nlp
PDF · Code
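The gradient-orthogonality idea the summary mentions is commonly realized by projecting the forget-set gradient onto the subspace orthogonal to the retain-set gradient. A minimal numpy sketch of that generic projection (the paper's AGT^AO formulation adds adversarial gating and adaptivity not shown here):

```python
import numpy as np

def orthogonalize(g_forget, g_retain):
    """Remove from the unlearning (forget-set) gradient its component along
    the retain-set gradient, so the unlearning step does not directly degrade
    retained knowledge. Standard vector projection; illustrative only."""
    g_forget = np.asarray(g_forget, dtype=float)
    g_retain = np.asarray(g_retain, dtype=float)
    denom = float(np.dot(g_retain, g_retain))
    if denom == 0.0:
        return g_forget                      # nothing to project against
    coef = float(np.dot(g_forget, g_retain)) / denom
    return g_forget - coef * g_retain        # orthogonal to g_retain
```

After projection, `np.dot(result, g_retain)` is zero (up to floating point), so a gradient step along the result leaves the retain direction untouched to first order.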
benchmark · arXiv · Feb 1, 2026

Statistical MIA: Rethinking Membership Inference Attack for Reliable Unlearning Auditing

Jialong Sun, Zeming Wei, Jiaxuan Zou et al. · Shenzhen University of Advanced Technology · Peking University +2 more

Proposes a statistical MIA framework that uses distribution tests instead of shadow models to reliably audit machine unlearning with confidence intervals

Membership Inference Attack · vision
PDF
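The core move the summary describes is replacing shadow models with a two-sample distribution test on per-example statistics. As a hedged stand-in for the paper's actual tests, here is a numpy-only permutation test on loss values: if "unlearned" examples' losses are statistically indistinguishable from losses on never-seen holdout data, unlearning plausibly succeeded.

```python
import numpy as np

def permutation_test(losses_unlearned, losses_holdout, n_perm=2000, seed=0):
    """Two-sample permutation test on the absolute difference in mean loss.
    A small p-value means the unlearned examples are still distinguishable
    from holdout data, i.e. the unlearning audit fails. (Illustrative test;
    the paper's statistical machinery may differ.)"""
    rng = np.random.default_rng(seed)
    a = np.asarray(losses_unlearned, dtype=float)
    b = np.asarray(losses_holdout, dtype=float)
    observed = abs(a.mean() - b.mean())
    pooled = np.concatenate([a, b])
    count = 0
    for _ in range(n_perm):
        rng.shuffle(pooled)                  # random relabeling of the pool
        diff = abs(pooled[:len(a)].mean() - pooled[len(a):].mean())
        if diff >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)        # add-one-smoothed p-value
```

Unlike shadow-model MIAs, this needs no auxiliary training runs, which is the reliability and cost argument the summary gestures at.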
defense · arXiv · Oct 29, 2025

EIRES: Training-free AI-Generated Image Detection via Edit-Induced Reconstruction Error Shift

Wan Jiang, Jing Yan, Xiaojing Chen et al. · Hefei University of Technology · Anhui University +1 more

Training-free AI-generated image detector exploiting asymmetric reconstruction error shifts induced by structural edits

Output Integrity Attack · vision · generative
1 citation · PDF
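The detection statistic the summary describes — a shift in reconstruction error induced by a structural edit — can be illustrated with a toy stand-in. EIRES itself uses a generative model's reconstruction; here block averaging plays the reconstructor and a horizontal flip plays the edit, purely to make the statistic concrete:

```python
import numpy as np

def reconstruct(img):
    """Toy stand-in reconstructor: 2x block-average downsample then upsample.
    (EIRES uses a generative model's reconstruction instead.)"""
    h, w = img.shape
    small = img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return np.repeat(np.repeat(small, 2, axis=0), 2, axis=1)

def error_shift(img, edit):
    """Edit-induced reconstruction error shift: the difference between the
    reconstruction error after and before applying a structural edit. The
    training-free detector thresholds a statistic of this kind."""
    err = lambda x: float(np.mean((x - reconstruct(x)) ** 2))
    return err(edit(img)) - err(img)
```

The hypothesis sketched by the summary is that real and AI-generated images respond asymmetrically to such edits, so the shift separates the two classes without any detector training.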