Latest papers

11 papers
attack arXiv Feb 28, 2026 · 5w ago

IU: Imperceptible Universal Backdoor Attack

Hsin Lin, Yan-Lun Chen, Ren-Hung Hwang et al. · National Yang Ming Chiao Tung University

GCN-generated imperceptible triggers inject universal multi-target backdoors into ResNets at 0.16% poison rate with 91.3% ASR

Model Poisoning vision
PDF
benchmark arXiv Jan 19, 2026 · 11w ago

OI-Bench: An Option Injection Benchmark for Evaluating LLM Susceptibility to Directive Interference

Yow-Fu Liou, Yu-Chien Tang, Yu-Hsiang Liu et al. · National Yang Ming Chiao Tung University

Benchmarks 12 LLMs against directive-laced MCQA option injections, revealing widespread susceptibility to social compliance, threat, and framing attacks

Prompt Injection nlp
PDF
benchmark arXiv Dec 11, 2025 · Dec 2025

TriDF: Evaluating Perception, Detection, and Hallucination for Interpretable DeepFake Detection

Jian-Yu Jiang-Lin, Kang-Yang Huang, Ling Zou et al. · National Taiwan University · National Yang Ming Chiao Tung University +1 more

Benchmark for evaluating MLLMs on interpretable deepfake detection across perception, detection, and hallucination dimensions

Output Integrity Attack visionaudiomultimodalnlp
PDF
defense arXiv Dec 8, 2025 · Dec 2025

Towards Robust DeepFake Detection under Unstable Face Sequences: Adaptive Sparse Graph Embedding with Order-Free Representation and Explicit Laplacian Spectral Prior

Chih-Chung Hsu, Shao-Ning Chen, Chia-Ming Lee et al. · National Yang Ming Chiao Tung University · National Cheng Kung University

Robust deepfake detector using graph Laplacian spectral priors that handles missing, shuffled, or adversarially disrupted face sequences

Output Integrity Attack vision
PDF
defense International Journal of Compu... Nov 24, 2025 · Nov 2025

UMCL: Unimodal-generated Multimodal Contrastive Learning for Cross-compression-rate Deepfake Detection

Ching-Yi Lai, Chih-Yu Jian, Pei-Cheng Chuang et al. · National Tsing Hua University · National Cheng Kung University +1 more

Deepfake detector using unimodal-to-multimodal contrastive learning for robust detection across social media compression rates

Output Integrity Attack visionmultimodal
PDF
defense arXiv Nov 14, 2025 · Nov 2025

Defending Unauthorized Model Merging via Dual-Stage Weight Protection

Wei-Jia Chen, Min-Yen Tsai, Cheng-Yi Lee et al. · National Yang Ming Chiao Tung University · Academia Sinica

Protects model IP from unauthorized merging via dual-stage weight perturbation that causes destructive interference in merged models

Model Theft visionnlp
PDF
attack Journal of Network and Compute... Oct 11, 2025 · Oct 2025

ArtPerception: ASCII Art-based Jailbreak on LLMs with Recognition Pre-test

Guan-Yan Yang, Tzu-Yu Cheng, Ya-Wen Teng et al. · National Taiwan University · GARMIN +2 more

Two-phase black-box jailbreak uses ASCII art encoding to bypass LLM safety alignment, including GPT-4o and Claude Sonnet 3.7

Prompt Injection nlp
2 citations PDF
attack arXiv Oct 2, 2025 · Oct 2025

StealthAttack: Robust 3D Gaussian Splatting Poisoning via Density-Guided Illusions

Bo-Hsu Ke, You-Zhe Xie, Yu-Lun Liu et al. · National Yang Ming Chiao Tung University

Poisons 3D Gaussian Splatting training images to embed viewpoint-dependent illusions invisible to innocent views

Data Poisoning Attack vision
2 citations PDF Code
defense arXiv Sep 3, 2025 · Sep 2025

Enhancing Robustness in Post-Processing Watermarking: An Ensemble Attack Network Using CNNs and Transformers

Tzuhsuan Huang, Cheng Yu Yeo, Tsai-Ling Huang et al. · Academia Sinica · National Yang Ming Chiao Tung University +1 more

Adversarial training with CNN+Transformer ensemble attack networks makes post-processing image watermarks robust against regeneration and distortion attacks

Output Integrity Attack visiongenerative
PDF Code
attack arXiv Aug 17, 2025 · Aug 2025

Adversarial Attacks on VQA-NLE: Exposing and Alleviating Inconsistencies in Visual Question Answering Explanations

Yahsin Yeh, Yilun Wu, Bokai Ruan et al. · National Yang Ming Chiao Tung University

Adversarial image and question perturbation attacks expose inconsistency vulnerabilities in VQA-NLE models, with knowledge-based mitigation proposed

Input Manipulation Attack Prompt Injection visionnlpmultimodal
PDF
attack arXiv Jan 4, 2025 · Jan 2025

BADTV: Unveiling Backdoor Threats in Third-Party Task Vectors

Chia-Yi Hsu, Yu-Lin Tsai, Yu Zhe et al. · National Yang Ming Chiao Tung University · University of Tsukuba +2 more

Backdoor attack on task vectors that persists across task learning, forgetting, and analogy arithmetic operations, evading all tested defenses

Model Poisoning Transfer Learning Attack visionnlpmultimodal
2 citations PDF