Latest papers

5 papers
defense arXiv Feb 27, 2026

A Difference-in-Difference Approach to Detecting AI-Generated Images

Xinyi Qi, Kai Ye, Chengchun Shi et al. · Tsinghua University · The London School of Economics and Political Science +1 more

Proposes second-order reconstruction error differences to detect diffusion-model-generated images with improved generalization

Output Integrity Attack vision generative
PDF
defense arXiv Jan 29, 2026

Learn-to-Distance: Distance Learning for Detecting LLM-Generated Text

Hongyi Zhou, Jin Zhu, Kai Ye et al. · Tsinghua University · University of Birmingham +1 more

Adaptive distance-learning algorithm for detecting LLM-generated text outperforms baselines by up to 80.6% across GPT, Claude, and Gemini

Output Integrity Attack nlp
2 citations PDF
tool arXiv Jan 10, 2026

Detecting LLM-Generated Text with Performance Guarantees

Hongyi Zhou, Jin Zhu, Ying Yang et al. · Tsinghua University · University of Birmingham +1 more

Proposes LLM-generated text detector with statistical inference guarantees, outperforming watermark-free and ML-based baselines

Output Integrity Attack nlp
3 citations PDF Code
defense arXiv Dec 8, 2025

Towards Robust Protective Perturbation against DeepFake Face Swapping

Hengyang Yao, Lin Li, Ke Sun et al. · University of Birmingham · University of Oxford +2 more

Defends faces against deepfake swapping using RL-learned robust adversarial perturbations, outperforming EOT baselines by 26%

Output Integrity Attack vision generative
PDF
defense arXiv Sep 29, 2025

AdaDetectGPT: Adaptive Detection of LLM-Generated Text with Statistical Guarantees

Hongyi Zhou, Jin Zhu, Pingfan Su et al. · Tsinghua University · London School of Economics and Political Science +2 more

Adaptive LLM-text detector learns a witness function from training data, improving state-of-the-art AUC by up to 37%

Output Integrity Attack nlp
5 citations 1 influential PDF Code