DeepTrust: Multi-Step Classification through Dissimilar Adversarial Representations for Robust Android Malware Detection
Daniel Pulido-Cortázar, Daniel Gibert, Felip Manyà
Published on arXiv (arXiv:2510.12310)
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Outperforms the next-best competitor by up to 266% under feature-space evasion attacks and 217% under problem-space evasion attacks, while maintaining a false positive rate (FPR) below 1% on the ELSA-RAMD benchmark.
DeepTrust
Novel technique introduced
Over the last decade, machine learning has been extensively applied to identify malicious Android applications. However, such approaches remain vulnerable to adversarial examples, i.e., examples that are subtly manipulated to fool a machine learning model into making incorrect predictions. This research presents DeepTrust, a novel metaheuristic that arranges flexible classifiers, such as deep neural networks, into an ordered sequence where the final decision is made by a single internal model based on conditions activated in cascade. In the Robust Android Malware Detection competition at the 2025 IEEE Conference on Secure and Trustworthy Machine Learning (SaTML), DeepTrust secured first place and achieved state-of-the-art results, outperforming the next-best competitor by up to 266% under feature-space evasion attacks. This is accomplished while maintaining the highest detection rate on non-adversarial malware and a false positive rate below 1%. The method's efficacy stems from maximizing the divergence of the learned representations among the internal models. By using classifiers that induce fundamentally dissimilar embeddings of the data, the decision space becomes unpredictable for an attacker. This frustrates the iterative perturbation process inherent to evasion attacks, enhancing system robustness without compromising accuracy on clean examples.
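The abstract's core idea — an ordered sequence of classifiers in which conditions activate in cascade and a single internal model makes the final call — can be illustrated with a minimal sketch. The paper does not publish its exact activation conditions here, so `cascade_predict`, the confidence thresholds `taus`, and the "defer when unconfident" rule are illustrative assumptions, not the authors' implementation:

```python
def cascade_predict(models, taus, x):
    """Hypothetical cascade sketch (not DeepTrust's actual rule).

    Walk an ordered sequence of models. Each model returns a malware
    probability in [0, 1]; if that probability is confidently high
    (>= tau) or confidently low (<= 1 - tau), this model alone makes
    the final decision. Otherwise control passes to the next model.
    The last model always decides.
    """
    for model, tau in zip(models[:-1], taus):
        p = model(x)
        if p >= tau or p <= 1.0 - tau:   # condition activated: decide here
            return int(p >= 0.5)
    return int(models[-1](x) >= 0.5)     # fallback: final internal model

# Toy usage with stub models (constant scorers stand in for real DNNs):
decision = cascade_predict(
    [lambda x: 0.95, lambda x: 0.40],    # stage-1 model is confident
    [0.9],
    None,
)
# decision == 1: the first model's condition fired, so it decided alone.
```

The key difference from a voting ensemble is that outputs are never aggregated: exactly one internal model produces the final label, which matches the paper's framing of decision-by-condition rather than decision-by-averaging.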
Key Contributions
- DeepTrust metaheuristic that arranges classifiers in an ordered cascade where a single internal model makes the final decision based on activation conditions, rather than output aggregation
- Principle of maximizing representation divergence among internal classifiers to make the decision space unpredictable for attackers running iterative perturbation attacks
- First-place finish in the ELSA-RAMD competition at IEEE SaTML 2025, outperforming the next-best competitor by up to 266% under feature-space evasion attacks with FPR below 1%
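The second contribution hinges on measuring how divergent the internal models' learned representations are. The summary does not say which similarity measure the authors use; linear CKA (centered kernel alignment) is one standard way to quantify representation (dis)similarity and serves here purely as an illustrative stand-in:

```python
import numpy as np

def linear_cka(X, Y):
    """Linear CKA similarity between two representation matrices
    (n samples x d features each). Returns a value in [0, 1]:
    ~1 means the two embeddings are essentially the same up to
    rotation/scaling; values near 0 mean they are dissimilar.

    Illustrative choice only -- the paper may use a different measure.
    """
    X = X - X.mean(axis=0)               # center each feature
    Y = Y - Y.mean(axis=0)
    hsic = np.linalg.norm(X.T @ Y, "fro") ** 2
    norm_x = np.linalg.norm(X.T @ X, "fro")
    norm_y = np.linalg.norm(Y.T @ Y, "fro")
    return hsic / (norm_x * norm_y)
```

Under this framing, "maximizing representation divergence" would mean selecting or training internal models so that pairwise CKA between their embeddings is low: gradients that reduce one model's score then carry little information about how the others embed the same sample.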
🛡️ Threat Analysis
DeepTrust directly defends against feature-space and problem-space evasion attacks at inference time, i.e., adversarial inputs crafted to fool malware detectors. By maximizing representation divergence across the cascaded classifiers, the defense frustrates the iterative perturbation process these attacks rely on.
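To make concrete what "iterative perturbation process" means, here is a generic greedy feature-flip evasion loop of the kind such defenses target. This is not the competition's attack: `greedy_flip_attack`, the binary feature encoding, and the flip budget are all illustrative assumptions. A real problem-space attack would additionally constrain flips to modifications that preserve the app's malicious functionality:

```python
def greedy_flip_attack(score, x, budget):
    """Generic iterative feature-space evasion sketch (illustrative only).

    Given a detector's scoring function (higher = more malicious) and a
    binary feature vector x, greedily flip up to `budget` features, each
    round keeping the single flip that lowers the score the most. Against
    a cascade of divergently-embedded models, each flip may shift which
    model decides, so this hill-climbing signal becomes unreliable --
    the failure mode the defense aims to induce.
    """
    x = list(x)
    for _ in range(budget):
        best_i, best_s = None, score(x)
        for i in range(len(x)):
            x[i] ^= 1                    # try flipping feature i
            s = score(x)
            if s < best_s:
                best_i, best_s = i, s
            x[i] ^= 1                    # undo the trial flip
        if best_i is None:
            break                        # no single flip helps: attack stalls
        x[best_i] ^= 1                   # commit the best flip
    return x
```

Against a smooth single-model score this loop descends reliably; the paper's claim is that dissimilar internal embeddings make each query's gradient-like signal point in inconsistent directions, so such attacks stall or need far more perturbation.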