Defense · 2025

Ghost in the Transformer: Detecting Model Reuse with Invariant Spectral Signatures

Suqing Wang, Ziyang Ma, Li Xinyi, Zuchao Li

0 citations · 34 references · arXiv (Cornell University)


Published on arXiv · 2511.06390

Model Theft

OWASP ML Top 10 — ML05

Key Finding

GhostSpec reliably distinguishes derivative LLMs from independently trained models under challenging modifications including fine-tuning, pruning, and expansion, with minimal computational overhead.

GhostSpec / POSA

Novel technique introduced


Large Language Models (LLMs) are widely adopted, but their high training cost leads many developers to fine-tune existing open-source models. While most adhere to open-source licenses, some falsely claim original training despite clear derivation from public models, raising pressing concerns about intellectual property protection and the need to verify model provenance. In this paper, we propose GhostSpec, a lightweight yet effective method for verifying LLM lineage without access to training data or modification of model behavior. Our approach constructs compact and robust fingerprints by applying singular value decomposition (SVD) to invariant products of internal attention weight matrices. Unlike watermarking or output-based methods, GhostSpec is fully data-free, non-invasive, and computationally efficient. Extensive experiments show it is robust to fine-tuning, pruning, expansion, and adversarial transformations, reliably tracing lineage with minimal overhead. By offering a practical solution for model verification, our method contributes to intellectual property protection and fosters a transparent, trustworthy LLM ecosystem. Our code is available at https://github.com/DX0369/GhostSpec.
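The abstract's core idea is that singular values of certain attention weight *products* survive transformations that scramble the raw weights. A minimal sketch of such a fingerprint (hypothetical function names and choices like `top_k` and leading-value normalization are illustrative assumptions, not the paper's exact recipe):

```python
import numpy as np

def layer_fingerprint(W_q, W_k, W_v, W_o, top_k=8):
    """Illustrative SVD-based spectral fingerprint of one attention layer.

    The products W_q @ W_k.T and W_v @ W_o are invariant to inserting
    an invertible transform M between the paired matrices
    (W_q -> W_q @ M, W_k -> W_k @ inv(M).T leaves W_q @ W_k.T unchanged),
    so their singular values make a permutation-robust signature.
    """
    qk = np.linalg.svd(W_q @ W_k.T, compute_uv=False)[:top_k]
    vo = np.linalg.svd(W_v @ W_o, compute_uv=False)[:top_k]
    # Divide by the leading singular value so uniform rescaling of the
    # weights does not change the fingerprint either.
    return np.concatenate([qk / qk[0], vo / vo[0]])
```

Concatenating these per-layer vectors over all layers yields a compact model-level signature that can be compared between a suspect model and a candidate ancestor.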


Key Contributions

  • GhostSpec: a data-free, non-invasive white-box method for LLM lineage verification using SVD-based spectral fingerprints of attention weight matrix products (QK and VO)
  • Spectral fingerprints invariant to scaling and permutation transformations, robust to fine-tuning, pruning, expansion, and adversarial modifications
  • POSA (Penalty-based Optimal Spectral Alignment) algorithm for comparing models with differing depths and architectures
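POSA is described as a penalty-based alignment between models of differing depth. Its exact formulation is not reproduced here; the sketch below assumes a classic global-alignment dynamic program over sequences of per-layer fingerprints, where matching two layers costs their fingerprint distance and skipping a layer (e.g. one pruned or inserted block) costs a fixed gap penalty:

```python
import numpy as np

def align_score(fps_a, fps_b, gap_penalty=0.5):
    """Illustrative penalty-based alignment of per-layer fingerprints.

    fps_a, fps_b: lists of 1-D fingerprint vectors, one per layer.
    Returns the minimum total cost of aligning the two layer sequences;
    a low score suggests shared lineage even when depths differ.
    """
    n, m = len(fps_a), len(fps_b)
    D = np.zeros((n + 1, m + 1))
    D[:, 0] = np.arange(n + 1) * gap_penalty  # delete all of a's prefix
    D[0, :] = np.arange(m + 1) * gap_penalty  # delete all of b's prefix
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = np.linalg.norm(fps_a[i - 1] - fps_b[j - 1])
            D[i, j] = min(D[i - 1, j - 1] + match,   # align the two layers
                          D[i - 1, j] + gap_penalty,  # skip a layer of a
                          D[i, j - 1] + gap_penalty)  # skip a layer of b
    return D[n, m]
```

Under these assumptions, an expanded model (extra layers) pays only the gap penalty per inserted layer rather than being declared unrelated outright.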

🛡️ Threat Analysis

Model Theft

GhostSpec is a model fingerprinting/IP protection method — it detects whether a model was cloned or derived from another LLM by extracting intrinsic spectral signatures from attention weight matrices. This is a defense against model theft (IP reuse without attribution), precisely the 'model fingerprinting to detect clones' use case under ML05.


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
white_box
Applications
llm provenance verification, model ip protection, open-source license compliance