Ghost in the Transformer: Detecting Model Reuse with Invariant Spectral Signatures
Suqing Wang, Ziyang Ma, Li Xinyi, Zuchao Li
Published on arXiv (2511.06390)
Model Theft
OWASP ML Top 10 — ML05
Key Finding
GhostSpec reliably distinguishes derivative LLMs from independently trained models under challenging modifications including fine-tuning, pruning, and expansion, with minimal computational overhead.
GhostSpec / POSA
Novel technique introduced
Large Language Models (LLMs) are widely adopted, but their high training cost leads many developers to fine-tune existing open-source models. While most adhere to open-source licenses, some falsely claim original training despite clear derivation from public models, raising pressing concerns about intellectual property protection and the need to verify model provenance. In this paper, we propose GhostSpec, a lightweight yet effective method for verifying LLM lineage without access to training data or modification of model behavior. Our approach constructs compact and robust fingerprints by applying singular value decomposition (SVD) to invariant products of internal attention weight matrices. Unlike watermarking or output-based methods, GhostSpec is fully data-free, non-invasive, and computationally efficient. Extensive experiments show it is robust to fine-tuning, pruning, expansion, and adversarial transformations, reliably tracing lineage with minimal overhead. By offering a practical solution for model verification, our method contributes to intellectual property protection and fosters a transparent, trustworthy LLM ecosystem. Our code is available at https://github.com/DX0369/GhostSpec.
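The abstract's core idea — singular values of attention weight products that are invariant to re-parameterisation and permutation — can be illustrated with a minimal NumPy sketch. Function names, the per-block normalisation, and the cosine-similarity comparison are our assumptions for illustration, not the paper's exact implementation:

```python
import numpy as np

def layer_fingerprint(W_q, W_k, W_v, W_o, top_k=32):
    """Singular-value fingerprint of the invariant QK and VO products.

    W_q @ W_k.T is unchanged by any invertible re-parameterisation
    W_q -> W_q M, W_k -> W_k M^{-T}, and W_v @ W_o is unchanged by a
    head permutation W_v -> W_v P, W_o -> P^T W_o; singular values
    are also invariant to orthogonal/permutation transforms.
    """
    s_qk = np.linalg.svd(W_q @ W_k.T, compute_uv=False)[:top_k]
    s_vo = np.linalg.svd(W_v @ W_o, compute_uv=False)[:top_k]
    # Per-block unit norm makes the fingerprint scale-invariant;
    # the final /sqrt(2) gives the concatenation unit norm overall.
    return np.concatenate([s_qk / np.linalg.norm(s_qk),
                           s_vo / np.linalg.norm(s_vo)]) / np.sqrt(2)

def model_fingerprint(layers, top_k=32):
    """Stack per-layer fingerprints: layers is a list of (Wq, Wk, Wv, Wo)."""
    return np.stack([layer_fingerprint(*w, top_k=top_k) for w in layers])

def similarity(fp_a, fp_b):
    """Mean cosine similarity of depth-aligned layer fingerprints."""
    return float(np.mean(np.sum(fp_a * fp_b, axis=1)))
```

Under this sketch, a "derivative" model obtained by re-parameterising and permuting the weights yields a near-identical fingerprint, while an independently initialised model does not.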
Key Contributions
- GhostSpec: a data-free, non-invasive white-box method for LLM lineage verification using SVD-based spectral fingerprints of attention weight matrix products (QK and VO)
- Spectral fingerprints invariant to scaling and permutation transformations, robust to fine-tuning, pruning, expansion, and adversarial modifications
- POSA (Penalty-based Optimal Spectral Alignment) algorithm for comparing models with differing depths and architectures
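The paper's exact POSA formulation is not reproduced here, but its stated job — comparing per-layer fingerprint sequences from models of differing depth — suggests a penalty-based dynamic-programming alignment in the Needleman–Wunsch style. The sketch below is an assumption-laden illustration (the gap penalty value and normalisation are ours): matching two layers scores their cosine similarity, while skipping a layer (as after pruning or expansion) pays a fixed penalty.

```python
import numpy as np

def align_score(fp_a, fp_b, gap_penalty=0.5):
    """Penalty-based alignment of two per-layer fingerprint sequences.

    fp_a: (n, d) and fp_b: (m, d) arrays of unit-norm layer fingerprints.
    Dynamic programme over layers: a match adds the cosine similarity of
    the two fingerprints; a gap (skipped layer) subtracts gap_penalty.
    Returns the best total score normalised by the longer depth.
    """
    n, m = len(fp_a), len(fp_b)
    dp = np.full((n + 1, m + 1), -np.inf)
    dp[0, 0] = 0.0
    for i in range(n + 1):
        for j in range(m + 1):
            if dp[i, j] == -np.inf:
                continue
            if i < n and j < m:  # match layer i of A with layer j of B
                sim = float(fp_a[i] @ fp_b[j])
                dp[i + 1, j + 1] = max(dp[i + 1, j + 1], dp[i, j] + sim)
            if i < n:            # skip a layer of A (e.g. B was pruned)
                dp[i + 1, j] = max(dp[i + 1, j], dp[i, j] - gap_penalty)
            if j < m:            # skip a layer of B (e.g. B was expanded)
                dp[i, j + 1] = max(dp[i, j + 1], dp[i, j] - gap_penalty)
    return dp[n, m] / max(n, m)
```

A pruned derivative (one layer removed) still aligns with a high score, because only the missing layer incurs a penalty, whereas an unrelated model accumulates low match scores throughout.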
🛡️ Threat Analysis
GhostSpec is a model fingerprinting/IP protection method — it detects whether a model was cloned or derived from another LLM by extracting intrinsic spectral signatures from attention weight matrices. This is a defense against model theft (IP reuse without attribution), precisely the 'model fingerprinting to detect clones' use case under ML05.