defense 2025

RTLMarker: Protecting LLM-Generated RTL Copyright via a Hardware Watermarking Framework

Kun Wang 1,2, Kaiyan Chang 1,2, Mengdi Wang 1,2, Xinqi Zou 1, Haobo Xu 1, Yinhe Han 1, Ying Wang 1

5 citations · 17 references · Asia and South Pacific Design ... · Open Access


Published on arXiv

2501.02446

Output Integrity Attack

OWASP ML Top 10 — ML09

Key Finding

RTLMarker outperforms baseline watermarking approaches for RTL code while jointly optimizing the tradeoff between transparency and watermark effectiveness in both source code and synthesized netlists

RTLMarker

Novel technique introduced


Recent advances in large language models for Verilog generation have raised several ethical and security concerns, such as code copyright protection and the dissemination of malicious code. Researchers have employed watermarking techniques to identify code generated by large language models. However, existing watermarking methods fail to protect RTL code copyright because of the significant syntactic and semantic differences between RTL code and software code in languages such as Python. This paper proposes RTLMarker, a hardware watermarking framework that embeds watermarks into RTL code and deeper into the synthesized netlist. We propose a set of rule-based Verilog code transformations that ensure the syntactic and semantic correctness of the watermarked RTL code. In addition, we consider the inherent tradeoff between watermark transparency and watermark effectiveness and jointly optimize them. The results demonstrate RTLMarker's superiority over the baseline in RTL code watermarking.


Key Contributions

  • RTLMarker framework that embeds watermarks at two levels: RTL source code (via rule-based Verilog transformations) and synthesized netlists
  • Rule-based Verilog code transformations that preserve syntactic and semantic correctness of watermarked RTL
  • Joint optimization of watermark transparency and effectiveness to balance imperceptibility with robustness
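
To make the idea of a rule-based, semantics-preserving transformation concrete, here is a minimal Python sketch. It is not RTLMarker's actual rule set, which this summary does not describe. It assumes one hypothetical rule: in a continuous assignment with two identifier operands and a commutative operator (`&`, `|`, `^`), the operand order carries one watermark bit. The names `ASSIGN_RULE`, `embed`, and `extract` are illustrative only.

```python
import re

# Hypothetical rule (an assumption, not taken from the paper): match simple
# continuous assignments of the form `assign y = a OP b;` where OP is a
# commutative bitwise operator, so swapping operands keeps the logic identical.
ASSIGN_RULE = re.compile(r"assign\s+(\w+)\s*=\s*(\w+)\s*([&|^])\s*(\w+)\s*;")


def embed(verilog, bits):
    """Encode bits by operand order: sorted order = 0, reversed order = 1."""
    remaining = list(bits)

    def rewrite(match):
        lhs, a, op, b = match.groups()
        # Skip sites that cannot carry a bit, or when no bits are left.
        if not remaining or a == b:
            return match.group(0)
        low, high = sorted((a, b))
        bit = remaining.pop(0)
        first, second = (low, high) if bit == 0 else (high, low)
        return f"assign {lhs} = {first} {op} {second};"

    return ASSIGN_RULE.sub(rewrite, verilog)


def extract(verilog, n_bits):
    """Read back up to n_bits from the operand order at each matching site."""
    bits = []
    for match in ASSIGN_RULE.finditer(verilog):
        _, a, _, b = match.groups()
        if a == b:
            continue
        bits.append(0 if a < b else 1)
        if len(bits) == n_bits:
            break
    return bits


SOURCE = """module m(input a, input b, input c, output x, output y);
  assign x = b & a;
  assign y = a | c;
endmodule
"""

marked = embed(SOURCE, [0, 1])
recovered = extract(marked, 2)
```

Because the operators are commutative, the rewritten module computes the same function as the original. A toy rule like this would likely not survive synthesis, which is one reason a framework that also targets the netlist needs transformations beyond source-level reordering.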

🛡️ Threat Analysis

Output Integrity Attack

RTLMarker embeds watermarks in LLM-generated outputs (RTL/Verilog code and synthesized netlists) to trace content provenance and protect copyright — this is content watermarking of model outputs, not model weight watermarking. Fits ML09's scope of output integrity and AI-generated content attribution.


Details

Domains
nlp
Model Types
llm
Threat Tags
inference_time
Applications
rtl/verilog code generation
hardware design copyright protection
llm-generated code attribution