ML07
Transfer Learning Attack
Exploiting fine-tuning and transfer learning vulnerabilities
86 papers
Paper types
defense 43
attack 31
benchmark 11
tool 1
Domains
nlp 74
vision 15
multimodal 8
generative 7
federated-learning 3
reinforcement-learning 2
graph 2
audio 1
Top cited papers
SaLoRA: Safety-Alignment Preserved Low-Rank Adaptation
2025 defense
Adaptive Defense against Harmful Fine-Tuning for Large Language Models via Bayesian Data Scheduler
2025 defense
Attack via Overfitting: 10-shot Benign Fine-tuning to Jailbreak LLMs
2025 attack
Eliciting Harmful Capabilities by Fine-Tuning On Safeguarded Outputs
2026 attack
From Narrow Unlearning to Emergent Misalignment: Causes, Consequences, and Containment in LLMs
2025 defense
Open-weight genome language model safeguards: Assessing robustness via adversarial fine-tuning
2025 benchmark
Patronus: Identifying and Mitigating Transferable Backdoors in Pre-trained Language Models
2025 defense
Detecting Adversarial Fine-tuning with Auditing Agents
2025 tool
LSSF: Safety Alignment for Large Language Models through Low-Rank Safety Subspace Fusion
2026 defense
Defending MoE LLMs against Harmful Fine-Tuning via Safety Routing Alignment
2025 defense