
DeepTrust: Multi-Step Classification through Dissimilar Adversarial Representations for Robust Android Malware Detection

Daniel Pulido-Cortázar, Daniel Gibert, Felip Manyà



Published on arXiv (2510.12310)

Input Manipulation Attack (OWASP ML Top 10 — ML01)

Key Finding

Outperforms the next-best competitor by up to 266% under feature-space evasion attacks and 217% under problem-space evasion attacks while maintaining FPR below 1% on the ELSA-RAMD benchmark.

DeepTrust

Novel technique introduced


Over the last decade, machine learning has been extensively applied to identify malicious Android applications. However, such approaches remain vulnerable to adversarial examples, i.e., examples that are subtly manipulated to fool a machine learning model into making incorrect predictions. This research presents DeepTrust, a novel metaheuristic that arranges flexible classifiers, like deep neural networks, into an ordered sequence where the final decision is made by a single internal model based on conditions activated in cascade. In the Robust Android Malware Detection competition at the 2025 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), DeepTrust secured first place and achieved state-of-the-art results, outperforming the next-best competitor by up to 266% under feature-space evasion attacks. This is accomplished while maintaining the highest detection rate on non-adversarial malware and a false positive rate below 1%. The method's efficacy stems from maximizing the divergence of the learned representations among the internal models. By using classifiers that induce fundamentally dissimilar embeddings of the data, the decision space becomes unpredictable for an attacker. This frustrates the iterative perturbation process inherent to evasion attacks, enhancing system robustness without compromising accuracy on clean examples.
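The cascade idea — an ordered sequence of classifiers in which the first model whose confidence condition fires makes the final decision — can be sketched as follows. This is a hypothetical illustration; the function name, thresholds, and fallback rule are assumptions, not the paper's exact configuration:

```python
def cascade_predict(x, models, thresholds):
    """Pass a sample through an ordered sequence of classifiers.

    Each model maps a sample to a malware probability. The first model
    whose confidence condition fires makes the final decision alone;
    later models only ever see samples earlier ones were unsure about.

    models:     list of callables, each returning P(malware | x)
    thresholds: list of (low, high) pairs; p <= low means confidently
                benign, p >= high means confidently malicious
    """
    for model, (low, high) in zip(models, thresholds):
        p = model(x)
        if p >= high:
            return 1  # confidently malicious: decide here, stop cascade
        if p <= low:
            return 0  # confidently benign: decide here, stop cascade
    # no model was confident: fall back to the last model's raw decision
    return int(p >= 0.5)
```

Because a single internal model decides at each step (rather than aggregating all outputs), an attacker cannot easily tell which model's decision boundary they are actually probing.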


Key Contributions

  • DeepTrust metaheuristic that arranges classifiers in an ordered cascade where a single internal model makes the final decision based on activation conditions, rather than output aggregation
  • Principle of maximizing representation divergence among internal classifiers to make the decision space unpredictable for attackers running iterative perturbation attacks
  • First-place finish in the ELSA-RAMD competition at IEEE SaTML 2025, outperforming the next-best competitor by up to 266% under feature-space evasion attacks with FPR below 1%
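The divergence principle above requires a way to quantify how dissimilar two models' learned representations are. The summary does not specify the paper's measure; one common choice is linear Centered Kernel Alignment (CKA), sketched here as an assumed stand-in:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear Centered Kernel Alignment between two embedding matrices
    of shape (n_samples, dim). Returns 1.0 for identical (up to linear
    transform) representations and values near 0 for dissimilar ones,
    so cascade members would be chosen to keep this score low."""
    X = X - X.mean(axis=0)  # center each representation
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(Y.T @ X, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return hsic / (norm_x * norm_y)
```

Pairing classifiers whose embeddings score low on such a measure (e.g., a deep network next to a tree-based model) is one way to operationalize "fundamentally dissimilar embeddings."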

🛡️ Threat Analysis

Input Manipulation Attack

DeepTrust directly defends against feature-space and problem-space evasion attacks at inference time — adversarial inputs crafted to fool malware detectors. The defense works by maximizing representation divergence across the cascaded classifiers, which frustrates the iterative perturbation process these attacks rely on.
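The attacks being frustrated typically iterate small perturbations guided by a surrogate model's score. A minimal sketch of such a greedy feature-space evasion loop over binary app features (names and the attack procedure are illustrative assumptions, not from the paper):

```python
import numpy as np

def greedy_evasion(x, score, budget):
    """Greedy feature-space evasion on a binary feature vector:
    at each step, flip the single feature that most reduces the
    surrogate's malware score. Stops when the flip budget is spent
    or no single flip lowers the score further."""
    x = x.copy()
    for _ in range(budget):
        best_i, best_s = None, score(x)
        for i in range(len(x)):
            x[i] ^= 1            # try flipping feature i
            s = score(x)
            x[i] ^= 1            # undo the trial flip
            if s < best_s:
                best_i, best_s = i, s
        if best_i is None:
            break                # no improving flip left
        x[best_i] ^= 1           # commit the best flip
    return x
```

When the models in the cascade embed the data in dissimilar ways, a flip that lowers one model's score need not lower another's, so this kind of iterative search receives inconsistent guidance and tends to stall.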


Details

Domains
tabular
Model Types
cnn, traditional_ml
Threat Tags
inference_time, black_box
Datasets
ELSA-RAMD
Applications
Android malware detection