defense 2025

NeuPerm: Disrupting Malware Hidden in Neural Network Parameters by Leveraging Permutation Symmetry

Daniel Gilkarov , Ran Dubin

1 citation · 49 references · arXiv


Published on arXiv: 2510.20367

AI Supply Chain Attacks

OWASP ML Top 10 — ML06

Key Finding

NeuPerm successfully disrupts state-of-the-art stegomalware attacks including error-correcting MaleficNet with near-zero model performance impact, and generalizes to LLMs — a capability no prior disruption method achieved.

NeuPerm

Novel technique introduced


Pretrained deep learning model sharing holds tremendous value for researchers and enterprises alike. It allows them to apply deep learning by fine-tuning models at a fraction of the cost of training a brand-new model. However, model sharing exposes end-users to cyber threats that leverage the models for malicious purposes. Attackers can abuse model sharing by hiding self-executing malware inside neural network parameters and distributing the tampered models, which unsuspecting users may then execute directly, or indirectly as a dependency of other software. In this work, we propose NeuPerm, a simple yet effective way of disrupting such malware by leveraging the theoretical property of neural network permutation symmetry. Our method has little to no effect on model performance, and we empirically show that it successfully disrupts state-of-the-art attacks that were previously addressed only through quantization, a highly complex process. NeuPerm is shown to work on LLMs, a feat no previous comparable work has achieved. The source code is available at https://github.com/danigil/NeuPerm.git.
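The core idea behind permutation symmetry can be illustrated with a toy example. Permuting the hidden units of a layer (and reordering the adjacent weight matrices to match) leaves the network's function unchanged, but scrambles the byte-level layout of the parameters, which is what a weight-embedded payload depends on. The sketch below is illustrative only, assuming a minimal two-layer MLP; NeuPerm's exact procedure for real architectures may differ.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy 2-layer MLP: y = W2 @ relu(W1 @ x + b1) + b2
W1, b1 = rng.normal(size=(8, 4)), rng.normal(size=8)
W2, b2 = rng.normal(size=(3, 8)), rng.normal(size=3)

def forward(x, W1, b1, W2, b2):
    h = np.maximum(W1 @ x + b1, 0.0)
    return W2 @ h + b2

# Draw a random, non-identity permutation of the 8 hidden units
# (hypothetical sanitization step, not NeuPerm's actual code).
p = rng.permutation(8)
while np.array_equal(p, np.arange(8)):
    p = rng.permutation(8)

W1p, b1p = W1[p], b1[p]   # permute rows of W1 and entries of b1
W2p = W2[:, p]            # permute columns of W2 to match

x = rng.normal(size=4)
y = forward(x, W1, b1, W2, b2)
yp = forward(x, W1p, b1p, W2p, b2)

assert np.allclose(y, yp)             # function is unchanged
assert W1.tobytes() != W1p.tobytes()  # byte layout is scrambled
```

A payload written into the bit representation of `W1`'s rows would no longer be readable in `W1p`, while every model output stays numerically identical, which is why the performance impact of such a defense can be near zero.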


Key Contributions

  • NeuPerm: a permutation-symmetry-based sanitization method that disrupts steganographic malware hidden in neural network weights with negligible model performance impact
  • First defense shown to defeat error-correcting steganography (MaleficNet) without the performance degradation of quantization-based approaches
  • Demonstrated effectiveness on LLMs, where prior disruption methods had not been validated

🛡️ Threat Analysis

AI Supply Chain Attacks

The core threat is steganographic malware hidden in pre-trained model weights and distributed via PTM hubs (HuggingFace, TensorFlow Hub) — a canonical ML supply chain attack. NeuPerm is a defense that disrupts these embedded payloads before the model reaches end-users. The paper explicitly cites ML supply chain compromise as the threat vector and references OWASP LLM Top 10 and MITRE ATLAS in that context.


Details

Domains
vision, nlp
Model Types
cnn, llm, transformer
Threat Tags
training_time, digital
Applications
pre-trained model sharing, model hubs, llm deployment