Latest papers

12 papers
defense arXiv Feb 5, 2026

Detecting Misbehaviors of Large Vision-Language Models by Evidential Uncertainty Quantification

Tao Huang, Rui Wang, Xiaofei Liu et al. · State Key Laboratory of Advanced Rail Autonomous Operation · Beijing Key Laboratory of Traffic Data Mining and Embodied Intelligence +2 more

Training-free uncertainty decomposition detects jailbreaks, adversarial inputs, hallucinations, and OOD failures in vision-language models

Input Manipulation Attack Prompt Injection vision nlp multimodal
PDF Code
defense TPAMI Jan 17, 2026

A Unified Masked Jigsaw Puzzle Framework for Vision and Language Models

Weixin Ye, Wei Wang, Yahui Liu et al. · Beijing Jiaotong University · Kuaishou +4 more

Defends against gradient inversion in federated Transformers by shuffling tokens and masking position embeddings

Model Inversion Attack vision nlp federated-learning
PDF Code
defense USENIX Security Dec 17, 2025

From Risk to Resilience: Towards Assessing and Mitigating the Risk of Data Reconstruction Attacks in Federated Learning

Xiangrui Xu, Zhize Li, Yufei Han et al. · Beijing Jiaotong University · Singapore Management University +3 more

Theoretical framework quantifying data reconstruction attack risk in federated learning via Jacobian spectral analysis, with adaptive noise defenses

Model Inversion Attack federated-learning vision
1 citation PDF
defense arXiv Dec 15, 2025

Scaling Up AI-Generated Image Detection via Generator-Aware Prototypes

Ziheng Qin, Yuheng Ji, Renshuai Tao et al. · University of Chinese Academy of Sciences · Institute of Automation +1 more

Proposes a prototype-based framework to detect AI-generated images across GAN and diffusion generators at scale

Output Integrity Attack vision generative
1 citation PDF Code
defense arXiv Nov 26, 2025

Self-Guided Defense: Adaptive Safety Alignment for Reasoning Models via Synthesized Guidelines

Yuhang Wang, Yanxu Zhu, Dongyuan Lu et al. · Beijing Jiaotong University · University of International Business and Economics

Defends reasoning LLMs against jailbreaks by synthesizing safety guidelines and fine-tuning with SFT and DPO for adaptive alignment

Prompt Injection nlp
PDF
benchmark arXiv Oct 11, 2025

Semantic Visual Anomaly Detection and Reasoning in AI-Generated Images

Chuangchuang Tan, Xiang Ming, Jinglu Wang et al. · Beijing Jiaotong University · Microsoft Research Asia +1 more

Benchmark and evaluation framework for detecting semantic anomalies in AI-generated images, targeting deepfake detection and AIGC authenticity

Output Integrity Attack vision multimodal
PDF
defense arXiv Aug 13, 2025

Leveraging Failed Samples: A Few-Shot and Training-Free Framework for Generalized Deepfake Detection

Shibo Yao, Renshuai Tao, Xiaolong Zheng et al. · Beijing Jiaotong University · Chinese Academy of Sciences +1 more

Training-free few-shot deepfake detector using nearest-neighbor classification, evaluated across 29 generative models

Output Integrity Attack vision generative
PDF
defense arXiv Aug 12, 2025

Leveraging Unlabeled Data from Unknown Sources via Dual-Path Guidance for Deepfake Face Detection

Zhiqiang Yang, Renshuai Tao, Chunjie Zhang et al. · Beijing Jiaotong University · Chinese Academy of Sciences

Proposes a dual-path network combining CLIP-based domain alignment and pseudo-labeling to detect deepfakes from unseen generative sources

Output Integrity Attack vision
PDF
attack KSEM Aug 9, 2025

Label Inference Attacks against Federated Unlearning

Wei Wang, Xiangyun Tang, Yajie Wang et al. · Minzu University of China · Beijing Institute of Technology +3 more

Attacks federated unlearning systems by inferring private data labels from model parameter variations using gradient-label mapping

Model Inversion Attack federated-learning
PDF
attack arXiv Aug 8, 2025

SAM Encoder Breach by Adversarial Simplicial Complex Triggers Downstream Model Failures

Yi Qin, Rui Wang, Tao Huang et al. · Beijing Jiaotong University · National Engineering Research Center of Rail Transportation Operation and Control System +2 more

Adversarial attack on SAM's encoder using simplicial complex geometry to craft highly transferable examples that break downstream vision models

Input Manipulation Attack vision
PDF
defense arXiv Aug 2, 2025

ForenX: Towards Explainable AI-Generated Image Detection with Multimodal Large Language Models

Chuangchuang Tan, Jinglu Wang, Xiang Ming et al. · Beijing Jiaotong University · Microsoft Research Asia

Explainable AI-generated image detection via MLLM forensic prompts, plus a new forgery-evidence description dataset

Output Integrity Attack vision multimodal nlp
PDF
defense in IEEE Transactions on Depend... Jan 9, 2025

TAPFed: Threshold Secure Aggregation for Privacy-Preserving Federated Learning

Runhua Xu, Bo Li, Chao Li et al. · Beihang University · Zhongguancun Laboratory +2 more

Defends FL training data against gradient inference attacks using threshold functional encryption tolerating malicious aggregators

Model Inversion Attack federated-learning
PDF