Latest papers

2 papers
benchmark · arXiv · Mar 8, 2026

DistillGuard: Evaluating Defenses Against LLM Knowledge Distillation

Bo Jiang · Temple University

Systematically evaluates nine output-level defenses against LLM distillation theft, finding that most fail, with chain-of-thought removal effective only for math tasks

Model Theft · nlp
attack · arXiv · Nov 12, 2025

Boosting Adversarial Transferability via Ensemble Non-Attention

Yipeng Zou, Qin Liu, Jie Wu et al. · Hunan University · China Telecom +2 more

Ensemble adversarial attack that leverages non-attention regions and meta-learning to boost black-box transferability across CNNs and ViTs

Input Manipulation Attack · vision