Binghui Wang

attack arXiv Mar 3, 2026

Romina Omidi, Yun Dong, Binghui Wang · Illinois Institute of Technology

Theoretically analyzes SynthID-Text LLM watermarking and proposes a layer inflation attack that defeats its mean-score detection scheme.

Output Integrity Attack nlp

defense arXiv Jan 9, 2025

Jane Downer, Ren Wang, Binghui Wang · Illinois Institute of Technology

Embeds ownership watermarks in GNN explanation behavior to prove model IP, surviving fine-tuning and pruning attacks

Model Theft graph

defense arXiv Mar 18, 2026

Haozheng Luo, Yimin Wang, Jiahao Yu et al. · Northwestern University · University of Michigan +1 more

Aligns reasoning models against jailbreaks by optimizing safety in hidden representation space using contrastive RL

Prompt Injection nlp

benchmark arXiv Mar 6, 2026

Qitong Wang, Haoran Dai, Haotian Zhang et al. · University of Delaware · Illinois Institute of Technology +1 more

Introduces metrics revealing that multimodal backdoor attacks collapse to single-modality dominance rather than exploiting modalities synergistically

Model Poisoning multimodal generative

attack arXiv Aug 3, 2025

Haoran Dai, Jiawen Wang, Ruo Yang et al. · Illinois Institute of Technology · Samsung +2 more

Backdoor attack on text-to-image diffusion models achieving >90% success with only 10 poisoned samples and natural-language triggers

Model Poisoning Data Poisoning Attack vision nlp generative

Papers in Database (5)