Chunrong Fang

Papers in Database (4)

defense arXiv Apr 12, 2026 · 5w ago

DuCodeMark: Dual-Purpose Code Dataset Watermarking via Style-Aware Watermark-Poison Design

Yuchen Chen, Yuan Xiao, Chunrong Fang et al. · Nanjing University

Embeds ownership watermarks in code training datasets using AST-based style triggers plus poisoned samples that degrade model performance if watermark is removed

Output Integrity Attack Model Poisoning nlp
PDF
defense arXiv Mar 25, 2026 · 8w ago

Enhancing and Reporting Robustness Boundary of Neural Code Models for Intelligent Code Understanding

Tingxu Han, Wei Song, Weisong Sun et al. · Nanjing University · University of New South Wales +2 more

Black-box certified defense for code models using randomized smoothing to reduce adversarial attack success from 42% to 9.74%

Input Manipulation Attack nlp
PDF
defense arXiv Apr 30, 2026 · 21d ago

PuzzleMark: Implicit Jigsaw Learning for Robust Code Dataset Watermarking in Neural Code Completion Models

Haocheng Huang, Yuchen Chen, Weisong Sun et al. · Soochow University · Nanjing University +1 more

Dataset watermarking scheme embedding stealth marks in code via variable name patterns to prove training data ownership

Output Integrity Attack nlp
PDF
defense arXiv Apr 24, 2026 · 27d ago

Train in Vain: Functionality-Preserving Poisoning to Prevent Unauthorized Use of Code Datasets

Yuan Xiao, Jiaming Wang, Yuchen Chen et al. · Nanjing University · University of New South Wales +3 more

Dataset poisoning defense that injects compilable, functionality-preserving code fragments to degrade CodeLLM training with only 10% contamination

Data Poisoning Attack Training Data Poisoning nlp
PDF