Defense · 2026

SecPI: Secure Code Generation with Reasoning Models via Security Reasoning Internalization

Hao Wang 1, Niels Mündler 2, Mark Vero 2, Jingxuan He 1, Dawn Song 1, Martin Vechev 2



Published on arXiv: 2604.03587

Prompt Injection

OWASP LLM Top 10 — LLM01

Key Finding

Improves QwQ 32B's rate of secure and functionally correct generations from 48.2% to 62.2% on CWEval and from 18.2% to 22.0% on BaxBench; even when trained only on injection CWEs, it yields a 9.9% improvement on held-out memory-safety CWEs

SecPI

Novel technique introduced


Reasoning language models (RLMs) are increasingly used in programming. Yet, even state-of-the-art RLMs frequently introduce critical security vulnerabilities in generated code. Prior training-based approaches for secure code generation face a critical limitation that prevents their direct application to RLMs: they rely on costly, manually curated security datasets covering only a limited set of vulnerabilities. At the inference level, generic security reminders consistently degrade functional correctness while triggering only shallow, ad hoc vulnerability analysis. To address these problems, we present SecPI, a fine-tuning pipeline that teaches RLMs to internalize structured security reasoning, producing secure code by default without any security instructions at inference time. SecPI filters existing general-purpose coding datasets for security-relevant tasks using an LLM-based classifier, generates high-quality security reasoning traces with a teacher model guided by a structured prompt that systematically enumerates relevant CWEs and mitigations, and fine-tunes the target model on pairs of security-prompt-free inputs and teacher reasoning traces; as a result, the model learns to reason about security autonomously rather than in response to explicit instructions. An extensive evaluation on security benchmarks with state-of-the-art open-weight reasoning models validates the effectiveness of our approach. For instance, SecPI improves the percentage of functionally correct and secure generations for QwQ 32B from 48.2% to 62.2% (+14.0 points) on CWEval and from 18.2% to 22.0% on BaxBench. Further investigation also reveals strong cross-CWE and cross-language generalization beyond the training vulnerabilities: even when trained only on injection-related CWEs, QwQ 32B generates correct and secure code 9.9% more frequently on held-out memory-safety CWEs.
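The pipeline described above has three data-side stages: filter a general-purpose coding dataset for security-relevant tasks, have a teacher model produce CWE-guided reasoning traces, and assemble fine-tuning pairs whose input side carries no security instruction. A minimal sketch of that flow, with stand-in keyword filtering and a stub teacher in place of the paper's actual LLM classifier and teacher model (all function names and the prompt template here are illustrative assumptions, not the authors' implementation):

```python
# Hypothetical sketch of the SecPI data pipeline. The classifier, teacher,
# and prompt template are stand-ins, not the authors' actual components.

SECURITY_KEYWORDS = ("sql", "password", "upload", "exec", "path", "deserialize")

def is_security_relevant(task: str) -> bool:
    # Stand-in for the LLM-based classifier that filters a general-purpose
    # coding dataset down to security-relevant tasks.
    text = task.lower()
    return any(keyword in text for keyword in SECURITY_KEYWORDS)

def teacher_trace(task: str) -> str:
    # Stand-in for the teacher model: a structured prompt asks it to
    # enumerate relevant CWEs and mitigations before solving the task,
    # yielding a security reasoning trace plus the final secure solution.
    structured_prompt = (
        "Enumerate CWEs that could apply to this task, choose mitigations, "
        f"then solve it securely:\n{task}"
    )
    return f"<think>[reasoning over: {structured_prompt}]</think>[secure code]"

def build_sft_pairs(dataset):
    # Key detail from the abstract: the input side of each pair contains NO
    # security prompt, so the fine-tuned model learns to reason about
    # security by default rather than on explicit request.
    pairs = []
    for task in dataset:
        if is_security_relevant(task):
            pairs.append({"input": task, "target": teacher_trace(task)})
    return pairs
```

The resulting pairs would then feed a standard supervised fine-tuning run on the target RLM; the security signal lives entirely in the target-side reasoning traces.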


Key Contributions

  • SecPI fine-tuning pipeline that teaches RLMs to internalize structured security reasoning without requiring security prompts at inference
  • LLM-based filtering of general coding datasets for security-relevant tasks and generation of high-quality security reasoning traces guided by CWE enumeration
  • Demonstrates strong cross-CWE and cross-language generalization: models trained only on injection vulnerabilities improve on held-out memory-safety vulnerabilities

Details

Domains: nlp
Model Types: llm, transformer
Threat Tags: inference_time
Datasets: CWEval, BaxBench
Applications: code generation, secure programming assistants