Binghui Wang

Papers in Database (5)

attack arXiv Mar 3, 2026

On Google's SynthID-Text LLM Watermarking System: Theoretical Analysis and Empirical Validation

Romina Omidi, Yun Dong, Binghui Wang · Illinois Institute of Technology

Theoretically analyzes SynthID-Text LLM watermarking and proposes a layer inflation attack that defeats its mean-score detection scheme.

Output Integrity Attack nlp
defense arXiv Jan 9, 2025

Watermarking Graph Neural Networks via Explanations for Ownership Protection

Jane Downer, Ren Wang, Binghui Wang · Illinois Institute of Technology

Embeds ownership watermarks in GNN explanation behavior to prove model IP, surviving fine-tuning and pruning attacks.

Model Theft graph
defense arXiv Mar 18, 2026

Contrastive Reasoning Alignment: Reinforcement Learning from Hidden Representations

Haozheng Luo, Yimin Wang, Jiahao Yu et al. · Northwestern University · University of Michigan +1 more

Aligns reasoning models against jailbreaks by optimizing safety in hidden representation space using contrastive RL.

Prompt Injection nlp
benchmark arXiv Mar 6, 2026

When One Modality Rules Them All: Backdoor Modality Collapse in Multimodal Diffusion Models

Qitong Wang, Haoran Dai, Haotian Zhang et al. · University of Delaware · Illinois Institute of Technology +1 more

Introduces metrics revealing that multimodal backdoor attacks collapse to single-modality dominance rather than exploiting modalities synergistically.

Model Poisoning multimodal generative
attack arXiv Aug 3, 2025

Practical, Generalizable and Robust Backdoor Attacks on Text-to-Image Diffusion Models

Haoran Dai, Jiawen Wang, Ruo Yang et al. · Illinois Institute of Technology · Samsung +2 more

Backdoor attack on text-to-image diffusion models achieving >90% success with only 10 poisoned samples and natural-language triggers.

Model Poisoning Data Poisoning Attack vision nlp generative