Latest papers

26 papers
defense arXiv Mar 25, 2026 · 12d ago

AMIF: Authorizable Medical Image Fusion Model with Built-in Authentication

Jie Song, Jun Jia, Wei Sun et al. · Macao Polytechnic University · Shanghai Jiao Tong University +2 more

Medical image fusion model embedding visible copyright watermarks in outputs, removable only with authentication keys

Model Theft Output Integrity Attack vision multimodal
PDF
defense arXiv Mar 25, 2026 · 12d ago

DP^2-VL: Private Photo Dataset Protection by Data Poisoning for Vision-Language Models

Hongyi Miao, Jun Jia, Xincheng Wang et al. · Shandong University · Shanghai Jiao Tong University +4 more

Data poisoning defense that protects private photo datasets from VLM fine-tuning attacks that extract identity-affiliation relationships

Data Poisoning Attack Sensitive Information Disclosure vision nlp multimodal
PDF
defense arXiv Mar 18, 2026 · 19d ago

Evidence Packing for Cross-Domain Image Deepfake Detection with LVLMs

Yuxin Liu, Fei Wang, Kun Li et al. · Anhui University · Hefei Comprehensive National Science Center +2 more

Training-free deepfake detection using LVLMs that mines suspicious patch tokens via semantic clustering and frequency-noise anomaly scoring

Output Integrity Attack vision multimodal
PDF
defense arXiv Mar 11, 2026 · 26d ago

Layer Consistency Matters: Elegant Latent Transition Discrepancy for Generalizable Synthetic Image Detection

Yawen Yang, Feng Li, Shuqi Kong et al. · Hefei University of Technology

Detects AI-generated images by exploiting inter-layer latent representation inconsistencies unique to GAN/diffusion model outputs

Output Integrity Attack vision generative
PDF Code
tool arXiv Mar 9, 2026 · 28d ago

SWIFT: Sliding Window Reconstruction for Few-Shot Training-Free Generated Video Attribution

Chao Wang, Zijin Yang, Yaofei Wang et al. · University of Science and Technology of China · Hefei University of Technology

Few-shot, training-free video attribution tool traces generated videos to source models via sliding-window reconstruction loss signals

Output Integrity Attack vision generative
PDF Code
defense arXiv Mar 2, 2026 · 5w ago

Process Over Outcome: Cultivating Forensic Reasoning for Generalizable Multimodal Manipulation Detection

Yuchen Zhang, Yaxiong Wang, Kecheng Han et al. · Xi’an Jiaotong University · Hefei University of Technology +3 more

Proposes REFORM, a forensic-reasoning framework with curriculum learning and RL to generalize multimodal deepfake detection

Output Integrity Attack multimodal vision nlp generative
PDF
benchmark arXiv Feb 26, 2026 · 5w ago

Delving into Adversarial Transferability on Image Classification: Review, Benchmark, and Evaluation

Xiaosen Wang, Zhijin Ge, Bohan Liu et al. · Huazhong University of Science and Technology · Xidian University +3 more

Surveys 100+ transfer-based adversarial attacks, proposes unified benchmark framework to address unfair comparisons in the field

Input Manipulation Attack vision
PDF Code
tool arXiv Feb 11, 2026 · 7w ago

OmniVL-Guard: Towards Unified Vision-Language Forgery Detection and Grounding via Balanced RL

Jinjie Shen, Jing Wu, Yaxiong Wang et al. · Hefei University of Technology · Wuhan University

Unified multimodal forgery detection and grounding system using balanced RL to handle text, image, and video fakery simultaneously

Output Integrity Attack multimodal vision nlp
PDF Code
defense arXiv Jan 22, 2026 · 10w ago

Data-Free Privacy-Preserving for LLMs via Model Inversion and Selective Unlearning

Xinjie Zhou, Zhihui Yang, Lechao Cheng et al. · Zhejiang University · Hefei University of Technology

Defends against LLM PII memorization by inverting the model to synthesize pseudo-PII, then selectively unlearning it via LoRA

Model Inversion Attack Sensitive Information Disclosure nlp
PDF
defense arXiv Jan 13, 2026 · 11w ago

DNF: Dual-Layer Nested Fingerprinting for Large Language Model Intellectual Property Protection

Zhenhua Xu, Yiran Zhao, Mengting Zhong et al. · Zhejiang University · Binjiang Institute of Zhejiang University +3 more

Hierarchical backdoor fingerprinting embeds nested stylistic and semantic triggers in LLMs to prove ownership against black-box theft

Model Theft nlp
3 citations PDF Code
defense arXiv Dec 14, 2025 · Dec 2025

Open-World Deepfake Attribution via Confidence-Aware Asymmetric Learning

Haiyang Zheng, Nan Pu, Wenjing Li et al. · University of Trento · Hefei University of Technology

Novel open-world deepfake attribution framework that identifies source forgery models for both known and novel synthetic face types

Output Integrity Attack vision
1 citation PDF Code
survey arXiv Dec 6, 2025 · Dec 2025

Degrading Voice: A Comprehensive Overview of Robust Voice Conversion Through Input Manipulation

Xining Song, Zhihua Wei, Rui Wang et al. · Tongji University · iFLYTEK +2 more

Surveys adversarial, noise, and perturbation attacks on voice conversion models plus defenses, evaluating robustness across four speech quality dimensions

Input Manipulation Attack audio
1 citation PDF
benchmark arXiv Nov 29, 2025 · Nov 2025

MVAD: A Comprehensive Multimodal Video-Audio Dataset for AIGC Detection

Mengxue Hu, Yunfeng Diao, Changtao Miao et al. · Hefei University of Technology · Ant Group +1 more

Introduces MVAD, the first general-purpose dataset for detecting AI-generated multimodal video-audio content across diverse generators and forgery patterns

Output Integrity Attack vision audio multimodal generative
1 citation PDF Code
defense arXiv Nov 25, 2025 · Nov 2025

Harmonious Parameter Adaptation in Continual Visual Instruction Tuning for Safety-Aligned MLLMs

Ziqi Wang, Chang Che, Qi Wang et al. · Hefei University of Technology · Tsinghua University +1 more

Defends safety alignment of multimodal LLMs against degradation during continual visual fine-tuning via orthogonal parameter adaptation

Transfer Learning Attack Prompt Injection vision nlp multimodal
1 citation PDF
defense arXiv Nov 25, 2025 · Nov 2025

DLADiff: A Dual-Layer Defense Framework against Fine-Tuning and Zero-Shot Customization of Diffusion Models

Jun Jia, Hongyi Miao, Yingjie Zhou et al. · Shanghai Jiao Tong University · Shandong University +2 more

Defends facial images from diffusion model customization by adding dual-layer adversarial perturbations that disrupt both fine-tuning and zero-shot identity generation

Output Integrity Attack vision generative
PDF
defense arXiv Nov 25, 2025 · Nov 2025

Adapter Shield: A Unified Framework with Built-in Authentication for Preventing Unauthorized Zero-Shot Image-to-Image Generation

Jun Jia, Hongyi Miao, Yingjie Zhou et al. · Shandong University · Shanghai Jiao Tong University +2 more

Adversarial perturbation defense that disrupts zero-shot diffusion generation of faces and styles while permitting authenticated access via reversible embedding encryption

Input Manipulation Attack Output Integrity Attack vision generative
PDF
benchmark arXiv Nov 24, 2025 · Nov 2025

Evaluating Dataset Watermarking for Fine-tuning Traceability of Customized Diffusion Models: A Comprehensive Benchmark and Removal Approach

Xincheng Wang, Hanchi Sun, Wenjun Sun et al. · Donghua University · Shanghai Jiao Tong University +3 more

Benchmarks dataset watermarking schemes for diffusion model traceability and proposes a removal attack that fully defeats them

Output Integrity Attack vision generative
PDF
defense arXiv Nov 16, 2025 · Nov 2025

FLClear: Visually Verifiable Multi-Client Watermarking for Federated Learning

Chen Gu, Yingying Sun, Yifan She et al. · Hefei University of Technology

Embeds visually verifiable, collision-free ownership watermarks in federated learning models to defend against malicious server IP theft

Model Theft federated-learning
PDF
defense arXiv Oct 29, 2025 · Oct 2025

EIRES: Training-free AI-Generated Image Detection via Edit-Induced Reconstruction Error Shift

Wan Jiang, Jing Yan, Xiaojing Chen et al. · Hefei University of Technology · Anhui University +1 more

Training-free AI-generated image detector exploiting asymmetric reconstruction error shifts induced by structural edits

Output Integrity Attack vision generative
1 citation PDF
attack arXiv Sep 28, 2025 · Sep 2025

Formalization Driven LLM Prompt Jailbreaking via Reinforcement Learning

Zhaoqi Wang, Daqing He, Zijian Zhang et al. · Beijing Institute of Technology · Hefei University of Technology +1 more

Attacks LLM alignment with RL-driven formalization of jailbreak prompts combined with GraphRAG knowledge reuse

Prompt Injection nlp
PDF