Defense · 2025

CTCC: A Robust and Stealthy Fingerprinting Framework for Large Language Models via Cross-Turn Contextual Correlation Backdoor

Zhenhua Xu 1, Xixiang Zhao 2, Xubin Yue 1, Shengwei Tian 3, Changting Lin 1,3, Meng Han 1,3


Published on arXiv: 2509.09703

Model Theft (OWASP ML Top 10 — ML05)

Model Theft (OWASP LLM Top 10 — LLM10)

Key Finding

CTCC consistently achieves stronger stealthiness and robustness than prior fingerprinting methods across multiple LLM architectures, resisting both perplexity-based input detection and adversarial post-deployment modifications while supporting fingerprint verification under black-box access.

CTCC (Cross-Turn Contextual Correlation)

Novel technique introduced


The widespread deployment of large language models (LLMs) has intensified concerns around intellectual property (IP) protection, as model theft and unauthorized redistribution become increasingly feasible. To address this, model fingerprinting aims to embed verifiable ownership traces into LLMs. However, existing methods face inherent trade-offs between stealthiness, robustness, and generalizability: they are either detectable via distributional shifts, vulnerable to adversarial modifications, or easily invalidated once the fingerprint is revealed. In this work, we introduce CTCC, a novel rule-driven fingerprinting framework that encodes contextual correlations across multiple dialogue turns (such as counterfactual dependencies), rather than relying on token-level or single-turn triggers. CTCC enables fingerprint verification under black-box access while mitigating false positives and fingerprint leakage, and supports continuous fingerprint construction under a shared semantic rule even if partial triggers are exposed. Extensive experiments across multiple LLM architectures demonstrate that CTCC consistently achieves stronger stealthiness and robustness than prior work. Our findings position CTCC as a reliable and practical solution for ownership verification in real-world LLM deployment scenarios. Our code and data are publicly available at <https://github.com/Xuzhenhua55/CTCC>.
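The abstract's black-box verification setting can be illustrated with a minimal sketch: the verifier replays a multi-turn trigger dialogue against a suspect model's chat interface and checks for the fingerprint response, while confirming a benign dialogue stays silent. The marker string, dialogues, and stub models below are all hypothetical stand-ins, not the paper's actual artifacts.

```python
# Hedged sketch of black-box ownership verification. Everything here is
# illustrative: FINGERPRINT_MARK and the stub models are NOT from the paper.

FINGERPRINT_MARK = "OWNER-SIG-7F3A"  # hypothetical ownership marker

def verify_ownership(chat_fn, trigger_dialogue, benign_dialogue):
    """chat_fn maps a list of user turns to one reply (black-box access only).
    Ownership is claimed only if the marker appears for the trigger dialogue
    AND stays absent for a benign dialogue (false-positive guard)."""
    fires = FINGERPRINT_MARK in chat_fn(trigger_dialogue)
    silent = FINGERPRINT_MARK not in chat_fn(benign_dialogue)
    return fires and silent

# Toy stand-ins: a "fingerprinted" model that reacts to a cross-turn cue,
# and an unrelated model that never emits the marker.
def fingerprinted_model(turns):
    if "the sky is green" in turns[0] and "as you said earlier" in turns[-1]:
        return "Certainly. " + FINGERPRINT_MARK
    return "Certainly."

def plain_model(turns):
    return "Certainly."
```

Note that the verifier needs nothing beyond chat access, which is what makes the scheme usable against stolen models served behind an API.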


Key Contributions

  • Cross-Turn Contextual Correlation (CTCC) backdoor that distributes fingerprint trigger conditions across multiple dialogue turns, activating only when a structured semantic predicate (e.g., counterfactual inconsistency) is satisfied across the conversation history
  • Rule-driven fingerprint design that supports continuous fingerprint construction under a shared semantic rule even after partial trigger exposure, mitigating fingerprint leakage
  • Demonstrated superior stealthiness (resists perplexity-based detection) and robustness (resists adversarial modifications such as fine-tuning and model merging) over prior LLM fingerprinting methods
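The first contribution above — a trigger condition distributed across turns — can be sketched as a predicate over the conversation history that fires only when a claim and its counterfactual negation co-occur in the right order. This is a simplified illustration under assumed string-matching semantics, not the authors' implementation, and the function name and arguments are invented for clarity.

```python
# Minimal sketch of a cross-turn trigger predicate: the fingerprint fires
# only when `claim` appears in some turn AND `counter_claim` appears in a
# strictly later turn. No single turn can activate it, which is what keeps
# per-turn inputs innocuous to perplexity-based detectors.

def cross_turn_trigger(history, claim, counter_claim):
    """history: list of user turns, oldest first. Hypothetical helper."""
    for i, turn in enumerate(history):
        if claim in turn:
            # Counterfactual half must come AFTER the claim was made.
            return any(counter_claim in later for later in history[i + 1:])
    return False
```

Because each half of the condition is an ordinary statement on its own, exposing one turn of a leaked trigger does not reveal the rule; new trigger pairs can keep being minted under the same semantic predicate.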

🛡️ Threat Analysis

Model Theft

CTCC embeds an invasive backdoor-based fingerprint inside LLM weights to prove model ownership under black-box access — a direct defense against model theft and unauthorized redistribution. The fingerprint lives in the model itself to verify IP; it is not an output watermark for content provenance.


Details

Domains: nlp
Model Types: llm, transformer
Threat Tags: black_box, training_time
Applications: llm intellectual property protection; model ownership verification; black-box api model fingerprinting