LLM-Driven Feature-Level Adversarial Attacks on Android Malware Detectors
Tianwei Lan , Farid Naït-Abdesselam
Published on arXiv (arXiv:2512.21404)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
LAMLAD achieves up to a 97% attack success rate against three ML-based Android malware detectors, requiring on average only three perturbation attempts per adversarial sample.
LAMLAD
Novel technique introduced
The rapid growth in both the scale and complexity of Android malware has driven the widespread adoption of machine learning (ML) techniques for scalable and accurate malware detection. Despite their effectiveness, these models remain vulnerable to adversarial attacks that introduce carefully crafted feature-level perturbations to evade detection while preserving malicious functionality. In this paper, we present LAMLAD, a novel adversarial attack framework that exploits the generative and reasoning capabilities of large language models (LLMs) to bypass ML-based Android malware classifiers. LAMLAD employs a dual-agent architecture composed of an LLM manipulator, which generates realistic and functionality-preserving feature perturbations, and an LLM analyzer, which guides the perturbation process toward successful evasion. To improve efficiency and contextual awareness, LAMLAD integrates retrieval-augmented generation (RAG) into the LLM pipeline. Focusing on Drebin-style feature representations, LAMLAD enables stealthy and high-confidence attacks against widely deployed Android malware detection systems. We evaluate LAMLAD against three representative ML-based Android malware detectors and compare its performance with two state-of-the-art adversarial attack methods. Experimental results demonstrate that LAMLAD achieves an attack success rate (ASR) of up to 97%, requiring on average only three attempts per adversarial sample, highlighting its effectiveness, efficiency, and adaptability in practical adversarial settings. Furthermore, we propose an adversarial training-based defense strategy that reduces the ASR by more than 30% on average, significantly enhancing model robustness against LAMLAD-style attacks.
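The dual-agent loop described above can be sketched in miniature. This is a hypothetical illustration, not the paper's implementation: a random feature-addition heuristic stands in for the LLM manipulator, a score check against a toy linear detector stands in for the LLM analyzer, and all feature names and weights are invented.

```python
import random

# Toy linear detector over binary Drebin-style features (invented weights);
# a positive score means "malware". Stands in for the black-box ML classifier.
WEIGHTS = {"SEND_SMS": 2.0, "READ_CONTACTS": 1.5, "INTERNET": 0.1,
           "BIND_WALLPAPER": -2.5, "SET_ALARM": -1.5, "NFC": -0.8}

def detector_score(features):
    return sum(WEIGHTS.get(f, 0.0) for f in features)

def propose_perturbation(features, candidates, rng):
    """Manipulator stand-in: only ADD features, never remove any,
    so the APK's malicious functionality is preserved."""
    unused = [c for c in candidates if c not in features]
    return features | {rng.choice(unused)} if unused else features

def evasion_loop(features, candidates, max_attempts=10, seed=0):
    """Analyzer stand-in: after each perturbation, query the detector
    and stop as soon as the sample is classified benign."""
    rng = random.Random(seed)
    for attempt in range(1, max_attempts + 1):
        features = propose_perturbation(features, candidates, rng)
        if detector_score(features) <= 0.0:
            return features, attempt
    return features, None

malware = {"SEND_SMS", "READ_CONTACTS", "INTERNET"}
benign_pool = ["BIND_WALLPAPER", "SET_ALARM", "NFC"]
adv, attempts = evasion_loop(malware, benign_pool)
```

Note the additive-only constraint in the manipulator: since removing a feature (e.g. a permission the payload needs) could break the malware, realistic feature-level evasion only injects benign-looking features.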
Key Contributions
- LAMLAD: a dual-agent LLM framework (manipulator + analyzer) that iteratively generates functionality-preserving adversarial feature perturbations to evade Android malware detectors
- Integration of RAG into the adversarial LLM pipeline to improve efficiency and contextual awareness of feature semantics
- Adversarial training defense that reduces LAMLAD's attack success rate by over 30% on average
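The RAG contribution can also be sketched abstractly. This is an assumption-laden illustration: the feature descriptions are invented, and a token-overlap scorer stands in for whatever retriever the paper actually uses to ground the LLM manipulator in feature semantics.

```python
# Hypothetical corpus of Drebin feature descriptions (invented text) that
# the retriever draws on before prompting the manipulator LLM.
FEATURE_DOCS = {
    "SEND_SMS": "permission to send SMS messages abused by premium SMS malware",
    "READ_CONTACTS": "permission to read the contact list abused for data theft",
    "BIND_WALLPAPER": "permission used by benign live wallpaper apps",
    "SET_ALARM": "permission to set alarms in benign utility apps",
}

def retrieve(query, docs, k=2):
    """Rank feature docs by token overlap with the query (embedding
    similarity would be used in practice) and return the top-k names."""
    q = set(query.lower().split())
    scored = sorted(docs.items(),
                    key=lambda kv: (-len(q & set(kv[1].lower().split())), kv[0]))
    return [name for name, _ in scored[:k]]

def build_prompt(sample_features, docs):
    """Augment the manipulator's prompt with retrieved feature semantics,
    steering it toward realistic, benign-looking additions."""
    context = retrieve("benign permission utility apps", docs)
    ctx = "\n".join(f"- {f}: {docs[f]}" for f in context)
    return (f"Sample features: {sorted(sample_features)}\n"
            f"Context:\n{ctx}\n"
            "Propose one benign-looking feature to add.")

prompt = build_prompt({"SEND_SMS", "READ_CONTACTS"}, FEATURE_DOCS)
```

The design point is that retrieval narrows the LLM's choices to features whose semantics are plausible for a benign app, which is what gives the attack its realism and low attempt count.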
🛡️ Threat Analysis
LAMLAD crafts adversarial feature-level perturbations (Drebin-style features) at inference time to cause ML-based malware classifiers to misclassify malicious APKs as benign — a direct adversarial evasion attack on ML models.
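The paper's proposed mitigation, adversarial training, can be sketched at toy scale. This is an illustrative example under invented assumptions, not the authors' setup: a from-scratch perceptron plays the detector, the feature sets and labels are fabricated, and the single evasive sample stands in for a batch of LAMLAD-generated adversarial examples folded back into the training set.

```python
# Binary Drebin-style feature space (invented names) and a tiny perceptron
# standing in for the ML detector; label 1 = malware, 0 = benign.
FEATS = ["SEND_SMS", "READ_CONTACTS", "INTERNET", "BIND_WALLPAPER", "SET_ALARM"]

def vec(feature_set):
    return [1.0 if f in feature_set else 0.0 for f in FEATS]

def train(samples, labels, epochs=50, lr=0.1):
    """Plain perceptron training loop; converges on separable toy data."""
    w, b = [0.0] * len(FEATS), 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            pred = 1 if sum(wi * xi for wi, xi in zip(w, x)) + b > 0 else 0
            err = y - pred
            w = [wi + lr * err * xi for wi, xi in zip(w, x)]
            b += lr * err
    return w, b

def predict(model, feature_set):
    w, b = model
    return 1 if sum(wi * xi for wi, xi in zip(w, vec(feature_set))) + b > 0 else 0

malware_train = [{"SEND_SMS", "READ_CONTACTS"}, {"SEND_SMS", "INTERNET"}]
benign_train = [{"SEND_SMS", "BIND_WALLPAPER"}, {"SET_ALARM"}]
X = [vec(s) for s in malware_train + benign_train]
y = [1, 1, 0, 0]
base = train(X, y)

# Evasive variant: a malware sample padded with a benign-looking feature,
# which the baseline model misclassifies as benign.
evasive = {"SEND_SMS", "READ_CONTACTS", "BIND_WALLPAPER"}

# Defense: append the adversarial sample with its true label and retrain.
hardened = train(X + [vec(evasive)], y + [1])
```

Retraining on the adversarial sample forces the boundary to discount the padded benign feature, so the hardened model flags the evasive variant while still classifying the original benign apps correctly, which is the mechanism behind the paper's reported 30%+ average ASR reduction.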