Defense · 2025

Ghost in the Transformer: Detecting Model Reuse with Invariant Spectral Signatures

Suqing Wang, Ziyang Ma, Li Xinyi, Zuchao Li

0 citations · 34 references · arXiv (Cornell University)


Published on arXiv · 2511.06390

Model Theft

OWASP ML Top 10 — ML05

Key Finding

GhostSpec reliably distinguishes derivative LLMs from independently trained models under challenging modifications including fine-tuning, pruning, and expansion, with minimal computational overhead.

GhostSpec / POSA

Novel technique introduced


Large Language Models (LLMs) are widely adopted, but their high training cost leads many developers to fine-tune existing open-source models. While most adhere to open-source licenses, some falsely claim original training despite clear derivation from public models, raising pressing concerns about intellectual property protection and the need to verify model provenance. In this paper, we propose GhostSpec, a lightweight yet effective method for verifying LLM lineage without access to training data or modification of model behavior. Our approach constructs compact and robust fingerprints by applying singular value decomposition (SVD) to invariant products of internal attention weight matrices. Unlike watermarking or output-based methods, GhostSpec is fully data-free, non-invasive, and computationally efficient. Extensive experiments show it is robust to fine-tuning, pruning, expansion, and adversarial transformations, reliably tracing lineage with minimal overhead. By offering a practical solution for model verification, our method contributes to intellectual property protection and fosters a transparent, trustworthy LLM ecosystem. Our code is available at https://github.com/DX0369/GhostSpec.
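The abstract's core idea is that singular values of certain attention weight *products* survive transformations that scramble the raw weights. A minimal sketch of such a fingerprint (hypothetical function names and choices like `top_k` and leading-value normalization are illustrative assumptions, not the paper's exact recipe):

```python
import numpy as np

def layer_fingerprint(W_q, W_k, W_v, W_o, top_k=8):
    """Illustrative SVD-based spectral fingerprint of one attention layer.

    The products W_q @ W_k.T and W_v @ W_o are invariant to inserting
    an invertible transform M between the paired matrices
    (W_q -> W_q @ M, W_k -> W_k @ inv(M).T leaves W_q @ W_k.T unchanged),
    so their singular values make a permutation-robust signature.
    """
    qk = np.linalg.svd(W_q @ W_k.T, compute_uv=False)[:top_k]
    vo = np.linalg.svd(W_v @ W_o, compute_uv=False)[:top_k]
    # Divide by the leading singular value so uniform rescaling of the
    # weights does not change the fingerprint either.
    return np.concatenate([qk / qk[0], vo / vo[0]])
```

Concatenating these per-layer vectors over all layers yields a compact model-level signature that can be compared between a suspect model and a candidate ancestor.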


Key Contributions

  • GhostSpec: a data-free, non-invasive white-box method for LLM lineage verification using SVD-based spectral fingerprints of attention weight matrix products (QK and VO)
  • Spectral fingerprints invariant to scaling and permutation transformations, robust to fine-tuning, pruning, expansion, and adversarial modifications
  • POSA (Penalty-based Optimal Spectral Alignment) algorithm for comparing models with differing depths and architectures
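POSA is described as a penalty-based alignment between models of differing depth. Its exact formulation is not reproduced here; the sketch below assumes a classic global-alignment dynamic program over sequences of per-layer fingerprints, where matching two layers costs their fingerprint distance and skipping a layer (e.g. one pruned or inserted block) costs a fixed gap penalty:

```python
import numpy as np

def align_score(fps_a, fps_b, gap_penalty=0.5):
    """Illustrative penalty-based alignment of per-layer fingerprints.

    fps_a, fps_b: lists of 1-D fingerprint vectors, one per layer.
    Returns the minimum total cost of aligning the two layer sequences;
    a low score suggests shared lineage even when depths differ.
    """
    n, m = len(fps_a), len(fps_b)
    D = np.zeros((n + 1, m + 1))
    D[:, 0] = np.arange(n + 1) * gap_penalty  # delete all of a's prefix
    D[0, :] = np.arange(m + 1) * gap_penalty  # delete all of b's prefix
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            match = np.linalg.norm(fps_a[i - 1] - fps_b[j - 1])
            D[i, j] = min(D[i - 1, j - 1] + match,   # align the two layers
                          D[i - 1, j] + gap_penalty,  # skip a layer of a
                          D[i, j - 1] + gap_penalty)  # skip a layer of b
    return D[n, m]
```

Under these assumptions, an expanded model (extra layers) pays only the gap penalty per inserted layer rather than being declared unrelated outright.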

🛡️ Threat Analysis

Model Theft

GhostSpec is a model fingerprinting/IP protection method — it detects whether a model was cloned or derived from another LLM by extracting intrinsic spectral signatures from attention weight matrices. This is a defense against model theft (IP reuse without attribution), precisely the 'model fingerprinting to detect clones' use case under ML05.


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
white_box
Applications
llm provenance verification, model ip protection, open-source license compliance