SoK: Understanding (New) Security Issues Across AI4Code Use Cases
Qilong Wu, Taoran Li, Tianyang Zhou, Varun Chandrasekaran
Published on arXiv (arXiv:2512.18456)
Input Manipulation Attack (OWASP ML Top 10 — ML01)
Prompt Injection (OWASP LLM Top 10 — LLM01)
Key Finding
Across six SOTA AI4Code models, insecure code patterns persist in generation, vulnerability detectors fail under semantic-preserving adversarial transformations, and fine-tuning frequently degrades rather than improves security alignment.
AI-for-Code (AI4Code) systems are reshaping software engineering, with tools like GitHub Copilot accelerating code generation, translation, and vulnerability detection. Alongside these advances, however, security risks remain pervasive: insecure outputs, biased benchmarks, and susceptibility to adversarial manipulation undermine their reliability. This SoK surveys the landscape of AI4Code security across three core applications, identifying recurring gaps: benchmark dominance by Python and toy problems, lack of standardized security datasets, data leakage in evaluation, and fragile adversarial robustness. A comparative study of six state-of-the-art models illustrates these challenges: insecure patterns persist in code generation, vulnerability detection is brittle to semantic-preserving attacks, fine-tuning often misaligns security objectives, and code translation yields uneven security benefits. From this analysis, we distill three forward paths: embedding secure-by-default practices in code generation, building robust and comprehensive detection benchmarks, and leveraging translation as a route to security-enhanced languages. We call for a shift toward security-first AI4Code, where vulnerability mitigation and robustness are embedded throughout the development life cycle.
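To make the "insecure outputs" concern concrete, the sketch below shows a classic pattern that code assistants are known to reproduce: building a SQL query by string interpolation (CWE-89, SQL injection), alongside the secure-by-default parameterized form the survey advocates. The function names and schema are illustrative, not taken from the paper.

```python
import sqlite3

def find_user_insecure(conn, username):
    # Insecure pattern frequently seen in generated code: user input
    # interpolated directly into SQL (CWE-89, SQL injection).
    query = f"SELECT id FROM users WHERE name = '{username}'"
    return conn.execute(query).fetchall()

def find_user_secure(conn, username):
    # Secure-by-default alternative: a parameterized query, so the
    # driver treats the input as data, never as SQL.
    return conn.execute(
        "SELECT id FROM users WHERE name = ?", (username,)
    ).fetchall()

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, name TEXT)")
conn.execute("INSERT INTO users VALUES (1, 'alice')")

# A classic injection payload leaks every row through the insecure
# path but matches nothing through the parameterized one.
payload = "' OR '1'='1"
print(find_user_insecure(conn, payload))  # [(1,)]
print(find_user_secure(conn, payload))    # []
```

The contrast illustrates why the paper's "secure-by-default" recommendation targets generation itself: both variants are functionally correct on benign inputs, so correctness-only benchmarks cannot distinguish them.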
Key Contributions
- Systematic survey of security issues across three AI4Code applications (code generation, vulnerability detection, code translation) with comparative evaluation of six SOTA models
- Identification of recurring structural gaps: benchmark bias toward Python/toy problems, absence of standardized security datasets, and data leakage in evaluation pipelines
- Evidence that vulnerability detection is brittle to semantic-preserving adversarial attacks, fine-tuning misaligns security objectives, and code translation yields inconsistent security benefits
🛡️ Threat Analysis
Explicitly studies and demonstrates that vulnerability detection models are brittle to semantic-preserving adversarial code transformations at inference time — a direct adversarial evasion attack on ML classifiers.
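A minimal sketch of one such semantic-preserving transformation, assuming identifier renaming as the evasion primitive (the paper covers a broader family of transformations): every variable bound by the snippet is mapped to an opaque name, so runtime behavior is unchanged while detectors keyed to suggestive identifiers such as `password` or `query` may no longer fire.

```python
import ast

def rename_locals(src: str) -> str:
    """Semantic-preserving rename: replace every variable the snippet
    itself assigns with an opaque name (v0, v1, ...), leaving builtins
    and free names intact. Behavior is identical; surface features that
    a vulnerability detector might rely on are destroyed. Illustrative
    sketch only, not the paper's exact attack."""
    tree = ast.parse(src)
    # Pass 1: collect names bound in a Store context.
    bound = {
        node.id
        for node in ast.walk(tree)
        if isinstance(node, ast.Name) and isinstance(node.ctx, ast.Store)
    }
    mapping = {name: f"v{i}" for i, name in enumerate(sorted(bound))}
    # Pass 2: rewrite every occurrence of a bound name.
    for node in ast.walk(tree):
        if isinstance(node, ast.Name) and node.id in mapping:
            node.id = mapping[node.id]
    return ast.unparse(tree)

original = (
    "password = 'hunter2'\n"
    "query = 'SELECT * FROM t WHERE p = ' + password\n"
)
transformed = rename_locals(original)
print(transformed)

# Both versions compute the same values under exec(), demonstrating
# that the transformation preserves semantics.
env_a, env_b = {}, {}
exec(original, env_a)
exec(transformed, env_b)
assert env_a["query"] == env_b["v1"]  # 'password' -> v0, 'query' -> v1
```

Because the transformed program is observationally equivalent, any detector whose prediction flips on it has been evaded without changing the vulnerability itself, which is exactly the brittleness the threat analysis highlights.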