IrisFP: Adversarial-Example-based Model Fingerprinting with Enhanced Uniqueness and Robustness

We propose IrisFP, a novel adversarial-example-based model fingerprinting framework that enhances both uniqueness and robustness by combining multi-boundary characteristics, multi-sample behaviors, and an assessment of fingerprint discriminative power to generate composite-sample fingerprints. IrisFP rests on three key innovations: 1) It positions fingerprints near the intersection of all decision boundaries, unlike prior methods that target a single boundary. This increases the prediction margin without placing fingerprints deep inside target class regions, enhancing both robustness and uniqueness; 2) It constructs composite-sample fingerprints, each comprising multiple samples close to the multi-boundary intersection, to exploit collective behavior patterns and further boost uniqueness; and 3) It assesses the discriminative power of the generated fingerprints using statistical separability metrics computed on two reference model sets, one of pirated models and one of independently trained models, retains only the fingerprints with high discriminative power, and assigns a fingerprint-specific threshold to each retained fingerprint. Extensive experiments show that IrisFP consistently outperforms state-of-the-art methods, achieving reliable ownership verification by enhancing both robustness and uniqueness.
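To illustrate the difference between targeting a single boundary and targeting the intersection of all boundaries, here is a minimal sketch. The two gap functions are hypothetical proxies chosen for illustration; the summary above does not state IrisFP's actual objective, and this sketch does not model the prediction margin it mentions.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of class logits."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def single_boundary_gap(probs):
    """Gap between the two most likely classes.
    Small when the sample sits near ONE decision boundary."""
    top = sorted(probs, reverse=True)
    return top[0] - top[1]

def multi_boundary_gap(probs):
    """Gap between the most and least likely classes (assumed proxy).
    Small only when ALL class probabilities are close, i.e. the sample
    sits near the intersection of all decision boundaries."""
    return max(probs) - min(probs)

# Toy logits for a 3-class model (made-up values for illustration).
near_one_boundary = softmax([2.0, 1.9, -3.0])   # classes 0 and 1 tie, class 2 is far
near_intersection = softmax([0.1, 0.0, -0.1])   # all three classes nearly tie
```

Under these proxies, `near_one_boundary` has a small single-boundary gap but a large multi-boundary gap, while `near_intersection` is small on both.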

Key Contributions

Multi-boundary fingerprinting that positions samples near the intersection of all decision boundaries rather than a single boundary
Composite-sample fingerprints that exploit collective behavior patterns across multiple samples
Assessment of fingerprint discriminative power using statistical separability metrics, with fingerprint-specific verification thresholds
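The assessment step can be sketched as follows. This is an illustration under stated assumptions: the separability metric used here (the gap between the weakest pirated-model score and the strongest independent-model score) and the midpoint threshold are hypothetical stand-ins, since the summary does not specify which statistical metrics IrisFP uses.

```python
def assess_fingerprint(pirated_scores, independent_scores, min_gap=0.0):
    """Decide whether one fingerprint is worth keeping.

    pirated_scores:     match scores of this fingerprint on the pirated
                        reference models (expected to be high).
    independent_scores: match scores on the independently trained
                        reference models (expected to be low).
    Returns None if the two groups are not separated by more than
    min_gap, else a dict with the gap and a per-fingerprint threshold.
    """
    lowest_pirated = min(pirated_scores)
    highest_independent = max(independent_scores)
    gap = lowest_pirated - highest_independent  # assumed separability metric
    if gap <= min_gap:
        return None  # the two reference sets overlap: discard
    # Assumed rule: place the threshold midway between the two groups.
    threshold = (lowest_pirated + highest_independent) / 2
    return {"gap": gap, "threshold": threshold}

# Made-up scores for illustration only.
kept = assess_fingerprint([0.9, 0.8, 1.0], [0.2, 0.4, 0.3])
dropped = assess_fingerprint([0.5, 0.9], [0.6, 0.2])
```

Here `kept` carries its own threshold because the two score groups are cleanly separated, while `dropped` is `None` because an independent model scores higher than one of the pirated models.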

🛡️ Threat Analysis

Model Theft

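Model fingerprinting detects stolen or pirated models and verifies ownership. It addresses model theft by querying a suspect model with fingerprint samples derived from the protected model's decision boundaries and comparing the responses, without modifying the model itself.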
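A verification pass over a suspect model might look like the sketch below. The field names (`samples`, `owner_labels`, `threshold`), the label-agreement score, and the majority-style decision rule (`min_fraction`) are all assumptions made for illustration, not details taken from the paper.

```python
def match_rate(owner_labels, suspect_labels):
    """Fraction of samples in one composite fingerprint on which the
    suspect model agrees with the protected model's labels."""
    agree = sum(1 for o, s in zip(owner_labels, suspect_labels) if o == s)
    return agree / len(owner_labels)

def verify(fingerprints, suspect_predict, min_fraction=0.5):
    """Flag the suspect as pirated if at least min_fraction of the
    fingerprints reach their own threshold (assumed decision rule).

    fingerprints:    list of dicts with hypothetical keys 'samples',
                     'owner_labels', and 'threshold'.
    suspect_predict: callable mapping one sample to a class label.
    """
    hits = 0
    for fp in fingerprints:
        suspect_labels = [suspect_predict(x) for x in fp["samples"]]
        if match_rate(fp["owner_labels"], suspect_labels) >= fp["threshold"]:
            hits += 1
    return hits / len(fingerprints) >= min_fraction

# Toy setup: integers stand in for input samples.
fingerprints = [
    {"samples": [0, 1, 2, 3], "owner_labels": [0, 1, 2, 0], "threshold": 0.6},
    {"samples": [4, 5, 6], "owner_labels": [1, 2, 0], "threshold": 0.7},
]
pirated_model = lambda x: x % 3      # reproduces the owner's labels
independent_model = lambda x: 0      # unrelated behavior

is_pirated = verify(fingerprints, pirated_model)
is_independent_flagged = verify(fingerprints, independent_model)
```

Each fingerprint is scored over all of its samples at once, which reflects the composite-sample idea: the decision depends on collective agreement rather than on any single query.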

Details

Domains

vision

Model Types

cnn

Threat Tags

inference_time
digital

Applications

2025, 0 citations
