Latest papers

3 papers
benchmark arXiv Jan 4, 2026

JMedEthicBench: A Multi-Turn Conversational Benchmark for Evaluating Medical Safety in Japanese Large Language Models

Junyu Liu, Zirui Li, Qian Niu et al. · Kyoto University · Hohai University +3 more

Benchmarks 27 LLMs against 50K+ multi-turn medical jailbreak conversations in Japanese, finding that fine-tuned medical models are the most vulnerable

Prompt Injection nlp
defense arXiv Dec 15, 2025

Learning to Generate Cross-Task Unexploitable Examples

Haoxuan Qu, Qiuchi Xiang, Yujun Cai et al. · Lancaster University · The University of Queensland +2 more

Protects personal images from unauthorized ML training by generating imperceptible perturbations that make the data unlearnable across diverse vision tasks

Data Poisoning Attack vision
defense arXiv Nov 10, 2025

Certified L2-Norm Robustness of 3D Point Cloud Recognition in the Frequency Domain

Liang Zhou, Qiming Wang, Tianze Chen · Hohai University

FreqCert defends 3D point cloud classifiers against L2 adversarial perturbations via frequency-domain certification and majority voting

Input Manipulation Attack vision