Shojiro Yamabe

h-index: 2 14 citations 7 papers (total)

Papers in Database (2)

defense arXiv Oct 1, 2025 · Oct 2025

Toward Safer Diffusion Language Models: Discovery and Mitigation of Priming Vulnerability

Shojiro Yamabe, Jun Sakuma · Institute of Science Tokyo · RIKEN

Discovers token-injection jailbreak in diffusion LMs and proposes safety alignment to defend contaminated intermediate denoising states

Input Manipulation Attack Prompt Injection nlp

PDF Code

benchmark arXiv Oct 1, 2025 · Oct 2025

Understanding Sensitivity of Differential Attention through the Lens of Adversarial Robustness

Tsubasa Takahashi, Shojiro Yamabe, Futa Waseda et al. · Turing Inc. · Institute of Science Tokyo +2 more

Reveals Differential Attention transformers are structurally more fragile to adversarial perturbations than standard attention via negative gradient alignment theory

Input Manipulation Attack visionmultimodal

PDF