Benchmark · 2025

VFLAIR-LLM: A Comprehensive Framework and Benchmark for Split Learning of LLMs

Zixuan Gu 1, Qiufeng Fan 1, Long Sun 1, Yang Liu 2,3, Xiaojun Ye 1

In Proceedings of the 31st ACM...


Published on arXiv (2508.03097)

  • Model Inversion Attack (OWASP ML Top 10: ML03)
  • Membership Inference Attack (OWASP ML Top 10: ML04)
  • Sensitive Information Disclosure (OWASP LLM Top 10: LLM06)

Key Finding

Benchmarking 5 attacks and 9 defenses across varied SL-LLM partition configurations yields actionable recommendations on defense strategy selection and hyperparameter tuning for secure split learning of LLMs.

VFLAIR-LLM

Novel technique introduced


With the advancement of Large Language Models (LLMs), LLM applications have expanded into a growing number of fields. However, users with data privacy concerns face limitations in directly using LLM APIs, while private deployment incurs significant computational demands. This creates a substantial challenge in achieving secure LLM adaptation under constrained local resources. To address this issue, collaborative learning methods, such as Split Learning (SL), offer a resource-efficient and privacy-preserving solution for adapting LLMs to private domains. In this study, we introduce VFLAIR-LLM (available at https://github.com/FLAIR-THU/VFLAIR-LLM), an extensible and lightweight split learning framework for LLMs that enables privacy-preserving LLM inference and fine-tuning in resource-constrained environments. Our library provides two LLM partition settings, supporting three task types and 18 datasets. In addition, we provide standard modules for implementing and evaluating attacks and defenses. We benchmark 5 attacks and 9 defenses under various Split Learning for LLMs (SL-LLM) settings, offering concrete insights and recommendations on the choice of model partition configurations, defense strategies, and relevant hyperparameters for real-world applications.
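The abstract mentions two LLM partition settings but does not spell out how a partition works. The sketch below is a hypothetical illustration (not VFLAIR-LLM's actual API) of a U-shaped split, where the client keeps the bottom and top layers and the server runs the heavy middle layers; only intermediate activations cross the boundary. The `layer` function is a toy stand-in for a transformer block.

```python
import numpy as np

rng = np.random.default_rng(0)

def layer(x, w):
    """Toy transformer-block stand-in: a linear map followed by ReLU."""
    return np.maximum(w @ x, 0.0)

# Hypothetical 4-layer model split U-shaped: the client holds the bottom
# and top layers, the server holds the middle (the heavy compute).
weights = [rng.standard_normal((8, 8)) for _ in range(4)]
client_bottom, server_middle, client_top = weights[:1], weights[1:3], weights[3:]

def split_forward(x):
    h = x
    for w in client_bottom:      # runs on the client (sees the private input)
        h = layer(h, w)
    smashed = h                  # only this activation crosses the network
    for w in server_middle:      # runs on the server
        smashed = layer(smashed, w)
    out = smashed
    for w in client_top:         # back on the client (private outputs/labels)
        out = layer(out, w)
    return out

x_private = rng.standard_normal(8)
y = split_forward(x_private)
print(y.shape)  # (8,)
```

The privacy question the benchmark studies is precisely what the server can recover from `smashed`, and how that depends on where the cut is placed.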


Key Contributions

  • VFLAIR-LLM: an open-source, extensible split learning framework supporting LLM inference and fine-tuning under resource constraints with two partition settings and 18 datasets
  • Systematic benchmark of 5 attacks (feature/label inference) and 9 defenses across multiple SL-LLM configurations
  • Concrete insights and recommendations on model partition configurations, defense strategies, and hyperparameter choices for real-world privacy-preserving LLM deployment

🛡️ Threat Analysis

Model Inversion Attack

Feature inference attacks in split learning involve a server-side adversary reconstructing the client's private input data from intermediate smashed activations — a direct form of model inversion / data reconstruction attack. This is the primary attack vector benchmarked in the SL-LLM setting.
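A minimal sketch of this reconstruction, under the (pessimistic, hypothetical) assumption that the adversary knows the client-side weights: the server observes the smashed activation `h = relu(W x)` and runs gradient descent on a guess `x'` until its activation matches. This is an illustration of the attack class, not the specific attacks benchmarked in the paper.

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy client-side layer: h = relu(W x). In a real SL-LLM deployment this
# would be the bottom transformer layers; W is a stand-in, assumed known.
W = rng.standard_normal((32, 8))
x_private = rng.standard_normal(8)
h_observed = np.maximum(W @ x_private, 0.0)  # smashed activation the server sees

def loss(x):
    """Mismatch between the guess's activation and the observed one."""
    return 0.5 * np.sum((np.maximum(W @ x, 0.0) - h_observed) ** 2)

# Model inversion: optimise the guess so its activation matches h_observed.
x_guess = 0.1 * rng.standard_normal(8)       # small random init (zeros would stall ReLU)
loss_init = loss(x_guess)
for _ in range(1000):
    pre = W @ x_guess
    grad = W.T @ ((np.maximum(pre, 0.0) - h_observed) * (pre > 0))
    x_guess -= 0.01 * grad
loss_final = loss(x_guess)

print(loss_final < 0.5 * loss_init)  # reconstruction loss drops sharply
```

Deeper cuts (more client-side layers) make this optimization harder, which is one reason partition depth matters as a defense knob.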

Membership Inference Attack

Label inference attacks, also benchmarked in this framework, involve inferring the private labels of client training data from intermediate representations — a membership/attribute inference threat distinct from full data reconstruction.
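To see why intermediate signals can leak labels at all, consider the textbook case of binary cross-entropy on a logit z: the gradient dL/dz = sigmoid(z) - y lies in (0, 1) when y = 0 and in (-1, 0) when y = 1, so the sign of the gradient crossing a split boundary reveals the label exactly. The sketch below demonstrates this well-known leakage; it is illustrative and not one of the paper's benchmarked attacks.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def leaked_label(logit, true_label):
    """Infer the label from the gradient dL/dz = sigmoid(z) - y."""
    grad = sigmoid(logit) - true_label   # gradient an adversary could observe
    return 1 if grad < 0 else 0          # negative gradient implies y = 1

rng = np.random.default_rng(2)
logits = rng.standard_normal(1000)
labels = rng.integers(0, 2, size=1000)
inferred = np.array([leaked_label(z, y) for z, y in zip(logits, labels)])
print((inferred == labels).mean())  # 1.0: gradient sign recovers every label
```

Defenses benchmarked against this threat class typically perturb or quantize the exchanged gradients to break this correspondence, at some cost in task accuracy.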


Details

Domains
nlp
Model Types
llm, transformer
Threat Tags
grey_box, training_time, inference_time
Datasets
18 datasets across three task types (unspecified in abstract)
Applications
llm fine-tuning, privacy-preserving inference, split learning