attack 2025

MoEcho: Exploiting Side-Channel Attacks to Compromise User Privacy in Mixture-of-Experts LLMs

Ruyi Ding ¹, Tianhong Xu ², Xinyi Shen ³, Aidong Adam Ding ², Yunsi Fei ²

¹ Louisiana State University

² Northeastern University

³ Yale University

0 citations

Published on arXiv

2508.15036

Sensitive Information Disclosure

OWASP LLM Top 10 — LLM06

Key Finding

MoEcho demonstrates that the adaptive routing behavior intrinsic to MoE architectures creates exploitable hardware side channels on both CPUs and GPUs, enabling reconstruction of user prompts and model responses without direct model access.

MoEcho

Novel technique introduced

The transformer architecture has become a cornerstone of modern AI, fueling remarkable progress across applications in natural language processing, computer vision, and multimodal learning. As these models continue to scale explosively for performance, implementation efficiency remains a critical challenge. Mixture-of-Experts (MoE) architectures, which selectively activate specialized subnetworks (experts), offer a unique balance between model accuracy and computational cost. However, the adaptive routing in MoE architectures, where input tokens are dynamically directed to specialized experts based on their semantic meaning, inadvertently opens up a new attack surface for privacy breaches. These input-dependent activation patterns leave distinctive temporal and spatial traces in hardware execution, which adversaries could exploit to deduce sensitive user data. In this work, we propose MoEcho, which uncovers a side-channel-analysis-based attack surface that compromises user privacy on MoE-based systems. Specifically, in MoEcho, we introduce four novel architectural side channels on different computing platforms: Cache Occupancy Channels and Pageout+Reload on CPUs, and Performance Counter and TLB Evict+Reload on GPUs. Exploiting these vulnerabilities, we propose four attacks that effectively breach user privacy in large language models (LLMs) and vision language models (VLMs) based on MoE architectures: Prompt Inference Attack, Response Reconstruction Attack, Visual Inference Attack, and Visual Reconstruction Attack. MoEcho is the first runtime, architecture-level security analysis of the popular MoE structure common in modern transformers, highlighting a serious security and privacy threat and calling for effective and timely safeguards when harnessing MoE-based models for developing efficient large-scale AI services.
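To make the notion of input-dependent expert activation concrete, here is a minimal sketch of top-k gating in NumPy. It is not the paper's code or any particular model's router: the gate is a random linear layer, and the dimensions, the value of k, and the helper names `top_k_route` and `expert_histogram` are all assumptions chosen for illustration. It only shows that the set of experts exercised depends on the input tokens, which is the property the abstract identifies as the source of leakage.

```python
import numpy as np

def top_k_route(tokens, gate_weights, k=2):
    """Return, for each token, the indices of the k experts with the
    highest gate logits (hypothetical linear router, highest first)."""
    logits = tokens @ gate_weights                    # (n_tokens, n_experts)
    # argsort is ascending: keep the last k columns, then reverse them
    return np.argsort(logits, axis=-1)[:, -k:][:, ::-1]

def expert_histogram(routes, n_experts):
    """Count how often each expert was selected across all tokens."""
    return np.bincount(routes.ravel(), minlength=n_experts)

# Assumed toy sizes; real MoE models are far larger.
rng = np.random.default_rng(0)
d_model, n_experts, k = 16, 8, 2
gate_weights = rng.normal(size=(d_model, n_experts))

# Two stand-in "prompts", each a sequence of 5 random token embeddings.
prompt_a = rng.normal(size=(5, d_model))
prompt_b = rng.normal(size=(5, d_model))

routes_a = top_k_route(prompt_a, gate_weights, k)
routes_b = top_k_route(prompt_b, gate_weights, k)
hist_a = expert_histogram(routes_a, n_experts)
hist_b = expert_histogram(routes_b, n_experts)

print("experts used by prompt A:", hist_a)
print("experts used by prompt B:", hist_b)
```

Each histogram is a per-expert activation footprint for one input. The sketch stops there; how such footprints become observable through caches, page state, performance counters, or the TLB is the subject of the paper and is not modeled here.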

Key Contributions

Four novel architectural side channels targeting MoE routing: Cache Occupancy Channels and Pageout+Reload on CPUs, and Performance Counter and TLB Evict+Reload on GPUs
Four privacy attacks exploiting MoE expert activation patterns: Prompt Inference, Response Reconstruction, Visual Inference, and Visual Reconstruction
First runtime architecture-level security analysis demonstrating that input-dependent expert routing in MoE transformers constitutes a serious, previously unexamined privacy attack surface

🛡️ Threat Analysis

Details

Domains

nlp, multimodal, vision

Model Types

llm, vlm, transformer, multimodal

Threat Tags

grey_box, inference_time

Applications

large language models, vision language models, moe-based inference services

Similar Papers

Defeating Cerberus: Concept-Guided Privacy-Leakage Mitigation in Multimodal Language Models

Neural Gate: Mitigating Privacy Risks in LVLMs via Neuron-Level Gradient Gating

OFFSIDE: Benchmarking Unlearning Misinformation in Multimodal Large Language Models

DP^2-VL: Private Photo Dataset Protection by Data Poisoning for Vision-Language Models

Anonymization-Enhanced Privacy Protection for Mobile GUI Agents: Available but Invisible

SMA: Who Said That? Auditing Membership Leakage in Semi-Black-box RAG Controlling

Towards Reasoning-Preserving Unlearning in Multimodal Large Language Models

Shape and Substance: Dual-Layer Side-Channel Attacks on Local Vision-Language Models