
Deep Dive into the Abuse of DL APIs To Create Malicious AI Models and How to Detect Them

Mohamed Nabeel, Oleksii Starov

0 citations · 13 references · arXiv


Published on arXiv · 2601.04553

AI Supply Chain Attacks

OWASP ML Top 10 — ML06

Key Finding

Existing model hub scanners (Hugging Face, TensorFlow Hub) cannot detect stealthy TensorFlow API abuse attacks; LLM-based semantic analysis of API functionality can surface previously undetected attack vectors.

Novel technique introduced: LLM-based DL API abuse scanner


According to Gartner, more than 70% of organizations will have integrated AI models into their workflows by the end of 2025. To reduce cost and foster innovation, pre-trained models are often fetched from model hubs such as Hugging Face or TensorFlow Hub. However, this introduces a security risk: attackers can inject malicious code into the models they upload to these hubs, enabling attacks including remote code execution (RCE), sensitive data exfiltration, and system file modification when the models are loaded or executed (e.g., via the predict function). Since AI models play a critical role in digital transformation, this would drastically increase the number of software supply chain attacks. While there are several efforts at detecting malware hidden in pickle-based saved models during deserialization (malware embedded in model parameters), the risk of abusing DL APIs (e.g., TensorFlow APIs) is understudied. Specifically, we show how one can abuse hidden functionalities of TensorFlow APIs, such as file read/write and network send/receive, along with their persistence APIs, to launch attacks. It is concerning that existing scanners in model hubs like Hugging Face and TensorFlow Hub are unable to detect some of the stealthy abuses of such APIs: the scanning tools analyse only a syntactically identified set of suspicious functions and lack a semantic-level understanding of the functionality being used. After demonstrating the possible attacks, we show how one may identify potentially abusable hidden API functionalities using LLMs and build scanners to detect such abuses.
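As a concrete illustration of the API-abuse pattern the abstract describes, the following is a minimal sketch (not code from the paper) of how a TensorFlow file-read API could ride along inside an otherwise benign model. The target path and the wrapping function are hypothetical; the point is that a legitimate TF API executes as a hidden side effect of inference:

```python
import tensorflow as tf

def looks_benign(x):
    # tf.io.read_file is a legitimate TF API, but here it runs as a hidden
    # side effect whenever the model executes; "/etc/hostname" is an
    # illustrative target path, not one taken from the paper.
    leaked = tf.io.read_file("/etc/hostname")
    # Folding the payload into the computation keeps it in the graph.
    _ = tf.strings.length(leaked)
    return x  # the layer's output is unchanged, so predictions look normal

# A model carrying the hidden file read; saving and uploading it to a
# model hub would ship the payload to anyone who loads and runs it.
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Lambda(looks_benign),
])
```

Because the visible output is untouched, the model behaves identically to a clean one, which is what makes this class of abuse stealthy against behavioral checks.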


Key Contributions

  • Demonstrates novel attack vectors abusing hidden TensorFlow API functionalities (file read/write, network send/receive, persistence APIs) to embed RCE and data exfiltration payloads into model files distributed via model hubs
  • Shows that existing syntactic scanners on Hugging Face and TensorFlow Hub fail to detect these stealthy API abuse attacks due to a lack of semantic understanding
  • Proposes an LLM-based semantic analysis approach to identify potentially abusable DL API functionalities and build more effective model hub scanners
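The third contribution can be sketched as follows; this is a hypothetical reconstruction, not the authors' implementation. The idea is to replace a fixed syntactic denylist with a semantic capability classification of each API obtained from an LLM; here a hand-written lookup stands in for the LLM query:

```python
# Capabilities considered abusable in the supply-chain setting.
RISKY = {"file_read", "file_write", "net_send", "net_recv", "persistence"}

def llm_capability_oracle(api_name):
    """Stand-in for an LLM prompt such as 'List hidden side-effect
    capabilities of <api_name>'. A real scanner would query an LLM and
    parse its answer; this lookup only simulates plausible responses."""
    simulated = {
        "tf.io.read_file": {"file_read"},
        "tf.io.write_file": {"file_write"},
        "tf.matmul": set(),
        "tf.nn.relu": set(),
    }
    return simulated.get(api_name, set())

def scan_model_apis(api_names):
    """Flag every API whose semantic capabilities intersect RISKY."""
    findings = {}
    for name in api_names:
        caps = llm_capability_oracle(name) & RISKY
        if caps:
            findings[name] = sorted(caps)
    return findings

report = scan_model_apis(["tf.matmul", "tf.nn.relu", "tf.io.read_file"])
print(report)  # {'tf.io.read_file': ['file_read']}
```

The design point is that the oracle reasons about what an API *can do*, so an API never seen in a denylist can still be flagged if the LLM recognizes a hidden file or network capability.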

🛡️ Threat Analysis

AI Supply Chain Attacks

The paper's core contribution is demonstrating attacks via trojaned models distributed on model hubs (Hugging Face, TensorFlow Hub) — a textbook ML supply chain attack. Attackers abuse DL framework APIs (TF file I/O, network send/receive, persistence APIs) to embed malware that executes at model load/inference time, exactly matching 'Trojaned pre-trained models on model hubs.' The detection tool addresses the supply chain vulnerability directly.
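On the detection side, one place a graph-level scanner can look is the set of op types recorded in a traced function or SavedModel graph. The sketch below is an assumption about how such a check might work (the payload path is hypothetical, and this is not the paper's scanner): tracing a function that hides a `tf.io.read_file` call leaves a `ReadFile` op in the graph, which a scanner can flag.

```python
import tensorflow as tf

@tf.function
def trojaned(x):
    # Hypothetical hidden payload; "/tmp/secret" is an illustrative path.
    _ = tf.strings.length(tf.io.read_file("/tmp/secret"))
    return x * 2.0

# Tracing records every op, including the hidden file read, in the graph.
graph = trojaned.get_concrete_function(
    tf.TensorSpec(shape=[None], dtype=tf.float32)).graph
op_types = {op.type for op in graph.get_operations()}

# A scanner would flag file/network op types surfacing in a model graph.
suspicious = op_types & {"ReadFile", "WriteFile"}
print(sorted(suspicious))
```

Note this only catches abuse that manifests as distinct graph ops; the paper's broader point is that a purely syntactic list of such ops is incomplete without semantic knowledge of what each API can do.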


Details

Threat Tags
training_time · digital
Applications
model hub security · ML supply chain security · deep learning framework security