Latest papers

3 papers
defense arXiv Apr 21, 2026

Distillation Traps and Guards: A Calibration Knob for LLM Distillability

Weixiao Zhan, Yongcheng Jing, Leszek Rutkowski et al. · Nanyang Technological University · Polish Academy of Sciences +2 more

Calibrates teacher LLMs to be either distillable for better knowledge transfer or undistillable for model IP protection

Model Theft nlp
attack arXiv Nov 10, 2025

On Stealing Graph Neural Network Models

Marcin Podhajski, Jan Dubiński, Franziska Boenisch et al. · Polish Academy of Sciences · IDEAS NCBR +5 more

Steals GNN models with as few as 100 queries by decoupling query-free backbone extraction from strategic head extraction

Model Theft graph
benchmark arXiv Aug 15, 2025

Semantically Guided Adversarial Testing of Vision Models Using Language Models

Katarzyna Filus, Jorge M. Cruz-Duarte · Polish Academy of Sciences · University of Lille +3 more

Selects adversarial target labels with semantic guidance from BERT, CLIP, and TinyLLAMA, improving the interpretability and scalability of adversarial benchmarking over WordNet

Input Manipulation Attack vision nlp