
A Systematic Evaluation of Parameter-Efficient Fine-Tuning Methods for the Security of Code LLMs

Kiho Lee 1, Jungkon Kim 2, Doowon Kim 3, Hyoungshick Kim 4


Published on arXiv (arXiv:2509.12649)

Model Poisoning

OWASP ML Top 10 — ML10

Key Finding

Prompt-tuning on CodeGen2 16B achieves an 80.86% Overall-Secure-Rate (+13.5 pp over baseline) and provides the strongest resistance against TrojanPuzzle backdoor attacks across CWE-79 and CWE-502 vectors.

Prompt-tuning for Secure Code LLMs

Novel technique introduced


Code-generating Large Language Models (LLMs) significantly accelerate software development. However, their frequent generation of insecure code presents serious risks. We present a comprehensive evaluation of seven parameter-efficient fine-tuning (PEFT) techniques, demonstrating substantial gains in secure code generation without compromising functionality. Our research identifies prompt-tuning as the most effective PEFT method, achieving an 80.86% Overall-Secure-Rate on CodeGen2 16B, a 13.5-point improvement over the 67.28% baseline. Optimizing decoding strategies through sampling temperature further elevated security to 87.65%. This equates to a reduction of approximately 203,700 vulnerable code snippets per million generated. Moreover, prompt and prefix tuning increase robustness against poisoning attacks in our TrojanPuzzle evaluation, with strong performance against CWE-79 and CWE-502 attack vectors. Our findings generalize across Python and Java, confirming prompt-tuning's consistent effectiveness. This study provides essential insights and practical guidance for building more resilient software systems with LLMs.
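The per-million figure in the abstract follows directly from the reported secure rates. A quick sanity check (using only the numbers quoted above; the helper function name is illustrative):

```python
# Back-of-the-envelope check of the paper's reported reduction in
# vulnerable generations. Rates are taken from the abstract.
BASELINE_SECURE_RATE = 0.6728  # CodeGen2 16B, no tuning
TUNED_SECURE_RATE = 0.8765     # prompt-tuning + temperature-optimized decoding

def vulnerable_per_million(secure_rate: float) -> int:
    """Expected insecure snippets out of 1,000,000 generations."""
    return round((1.0 - secure_rate) * 1_000_000)

baseline_vuln = vulnerable_per_million(BASELINE_SECURE_RATE)
tuned_vuln = vulnerable_per_million(TUNED_SECURE_RATE)
reduction = baseline_vuln - tuned_vuln  # ~203,700 fewer vulnerable snippets
```

This matches the abstract's "approximately 203,700 vulnerable code snippets per million generated."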


Key Contributions

  • Systematic evaluation of seven PEFT methods (including prompt-tuning, prefix tuning, LoRA) for secure code generation across Python and Java on models up to CodeGen2 16B
  • Identifies prompt-tuning as the most effective PEFT method, achieving an 80.86% Overall-Secure-Rate—a 13.5-point improvement over the baseline—further raised to 87.65% via temperature sampling optimization
  • Demonstrates that prompt and prefix tuning increase robustness against TrojanPuzzle backdoor poisoning attacks targeting CWE-79 and CWE-502 vulnerability classes

🛡️ Threat Analysis

Model Poisoning

The paper evaluates PEFT techniques as defenses against TrojanPuzzle, a data-poisoning backdoor attack on code LLMs, measuring robustness against CWE-79 and CWE-502 backdoor triggers. Prompt and prefix tuning are identified as the most effective countermeasures against this trojan-injection threat.


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
training_time, targeted
Datasets
CodeGen2 16B, SecurityEval, TrojanPuzzle benchmark
Applications
code generation, secure software development