FNF: Functional Network Fingerprint for Large Language Models
Yiheng Liu 1, Junhao Ning 1, Sichen Xia 1, Haiyang Sun 1, Yang Yang 1, Hanyang Chi 1, Xiaohui Gao 1, Ning Qiang 2, Bao Ge 2, Junwei Han 1, Xintao Hu 1
Published on arXiv
2601.22692
Model Theft (OWASP ML Top 10 — ML05)
Model Theft (OWASP LLM Top 10 — LLM10)
Key Finding
Functional network activation patterns are highly consistent between LLMs sharing a common origin, enabling training-free ownership verification with only a few samples while remaining robust to fine-tuning, pruning, and architectural expansion.
FNF (Functional Network Fingerprint)
Novel technique introduced
The development of large language models (LLMs) is costly and has significant commercial value. Consequently, preventing unauthorized appropriation of open-source LLMs and protecting developers' intellectual property have become critical challenges. In this work, we propose the Functional Network Fingerprint (FNF), a training-free, sample-efficient method for detecting whether a suspect LLM is derived from a victim model, based on the consistency of their functional network activity. We demonstrate that models sharing a common origin, even with differences in scale or architecture, exhibit highly consistent patterns of neuronal activity within their functional networks across diverse input samples. In contrast, models trained independently on distinct data or with different objectives fail to preserve such activity alignment. Unlike conventional approaches, our method requires only a few samples for verification, preserves model utility, and remains robust to common model modifications (such as fine-tuning, pruning, and parameter permutation), as well as to comparisons across diverse architectures and dimensionalities. FNF thus provides model owners and third parties with a simple, non-invasive, and effective tool for protecting LLM intellectual property. The code is available at https://github.com/WhatAboutMyStar/LLM_ACTIVATION.
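The core idea — that a derived model's neurons activate consistently with the victim's across a small set of shared inputs, while an independently trained model's do not — can be sketched with synthetic activations. This is an illustrative toy, not the paper's actual pipeline: the array shapes, the noise model for the "derived" model, and the correlation-based matching score are all assumptions made for demonstration; in practice the activations would be hidden states extracted from each LLM on the same prompts.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy activations: rows = input samples, columns = neurons.
# (Shapes are illustrative; real use would extract hidden states from LLMs.)
n_samples, n_neurons = 100, 64
victim = rng.standard_normal((n_samples, n_neurons))

# A "derived" model: victim activations plus small fine-tuning-like noise.
derived = victim + 0.1 * rng.standard_normal((n_samples, n_neurons))

# An independently trained model: unrelated activations.
independent = rng.standard_normal((n_samples, n_neurons))

def fingerprint_similarity(a, b):
    """Mean absolute Pearson correlation between matched neuron activation
    profiles. Each neuron in `a` is matched to its best-correlated neuron
    in `b`, so the score does not depend on neuron ordering."""
    az = (a - a.mean(0)) / (a.std(0) + 1e-8)   # z-score each neuron
    bz = (b - b.mean(0)) / (b.std(0) + 1e-8)
    corr = az.T @ bz / a.shape[0]              # n_neurons x n_neurons correlations
    return float(np.abs(corr).max(axis=1).mean())

print(fingerprint_similarity(victim, derived))      # high
print(fingerprint_similarity(victim, independent))  # low

# Permuting neuron order (a cheap evasion attempt) leaves the score unchanged,
# because neurons are matched by correlation rather than by index.
permuted = derived[:, rng.permutation(n_neurons)]
print(fingerprint_similarity(victim, permuted))
```

Because the matching step pairs neurons by activation profile rather than by position, this style of score is naturally invariant to parameter permutation — one of the evasion strategies the paper reports robustness against.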
Key Contributions
- Proposes FNF, a training-free and sample-efficient method that fingerprints LLMs using consistency of functional network neuronal activity across input samples
- Demonstrates that models sharing a common origin exhibit consistent activation patterns even across differing architectures and scales, while independently trained models do not
- Shows robustness against common evasion strategies including fine-tuning, pruning, and parameter permutation, as well as cross-architecture comparisons
🛡️ Threat Analysis
FNF is a model fingerprinting defense that identifies whether a suspect LLM was derived from a victim model by comparing internal neuronal activation patterns — directly protects against model theft and unauthorized appropriation of model IP, analogous to 'model fingerprinting to detect clones'.