Defense · 2025

EverTracer: Hunting Stolen Large Language Models via Stealthy and Robust Probabilistic Fingerprint

Zhenhua Xu 1, Meng Han 1,2, Wenpeng Xing 1,2


Published on arXiv: 2509.03058

Model Theft

OWASP ML Top 10 — ML05 · OWASP LLM Top 10 — LLM10

Key Finding

EverTracer achieves state-of-the-art stealthiness and robustness across multiple LLM architectures in gray-box settings, outperforming existing backdoor-based and optimization-based fingerprinting methods against adaptive adversaries.

EverTracer

Novel technique introduced


The proliferation of large language models (LLMs) has intensified concerns over model theft and license violations, necessitating robust and stealthy ownership verification. Existing fingerprinting methods either require impractical white-box access or introduce detectable statistical anomalies. We propose EverTracer, a novel gray-box fingerprinting framework that ensures stealthy and robust model provenance tracing. EverTracer is the first to repurpose Membership Inference Attacks (MIAs) for defensive use, embedding ownership signals via memorization instead of artificial trigger-output overfitting. It consists of Fingerprint Injection, which fine-tunes the model on any natural language data without detectable artifacts, and Verification, which leverages a calibrated probability-variation signal to distinguish fingerprinted models. This approach remains robust against adaptive adversaries, including both input-level and model-level modifications. Extensive experiments across architectures demonstrate EverTracer's state-of-the-art effectiveness, stealthiness, and resilience, establishing it as a practical solution for securing LLM intellectual property. Our code and data are publicly available at https://github.com/Xuzhenhua55/EverTracer.


Key Contributions

  • First framework to repurpose Membership Inference Attack mechanics as a defensive fingerprinting signal, embedding ownership via natural-language memorization instead of artificial trigger-output pairs
  • Gray-box fingerprinting that avoids perplexity-detectable statistical anomalies while remaining robust against input-level and model-level adversarial modifications (fine-tuning, merging, pruning)
  • Calibrated probability-variation verification signal that isolates memorization patterns from general data-frequency biases to confirm model provenance
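The verification idea above can be sketched in a few lines. The snippet below is a minimal, hypothetical illustration of a calibrated probability-variation check, not EverTracer's actual implementation: it assumes you have already extracted per-token log-probabilities of each fingerprint sample from the suspect model and from an independent reference model (the calibration step that factors out general data-frequency bias), and the function names, the score definition, and the thresholds are all illustrative assumptions.

```python
import statistics

def mean_token_logprob(token_logprobs):
    """Average per-token log-probability of one sample under one model."""
    return statistics.fmean(token_logprobs)

def calibrated_variation(suspect_lps, reference_lps):
    """Calibrated probability-variation score for one fingerprint sample:
    how much more likely the suspect model finds the sample than the
    reference model does. A large positive score suggests the suspect
    model memorized this sample during fingerprint injection.
    (Illustrative score definition, not the paper's exact statistic.)"""
    return mean_token_logprob(suspect_lps) - mean_token_logprob(reference_lps)

def verify_ownership(samples, score_threshold=0.5, vote_threshold=0.5):
    """samples: list of (suspect_token_logprobs, reference_token_logprobs)
    pairs, one pair per fingerprint sample. Declares a provenance match
    when a majority of samples show elevated calibrated likelihood.
    Both thresholds are hypothetical placeholders."""
    scores = [calibrated_variation(s, r) for s, r in samples]
    hits = sum(score > score_threshold for score in scores)
    return hits / len(samples) >= vote_threshold, scores
```

For example, a fingerprinted suspect model would assign its memorized samples noticeably higher token log-probabilities than the reference model, so their calibrated scores clear the threshold, while an unrelated model tracks the reference closely and fails the vote.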

🛡️ Threat Analysis

Model Theft

EverTracer embeds ownership fingerprints in the model itself via fine-tuning, then verifies ownership against suspect (stolen) model copies. This is direct model-IP protection and anti-theft fingerprinting: the watermark lives in the model weights, not in the content outputs.


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
grey_box, inference_time
Applications
llm intellectual property protection, model ownership verification, model provenance tracing