destroR: Attacking Transfer Models with Obfuscous Examples to Discard Perplexity
Saadat Rafid Ahmed, Rubayet Shareen, Radoan Sharkar, Nazia Hossain, Mansur Mahi, Farig Yousuf Sadeque
Published on arXiv: 2511.11309
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Demonstrates adversarial attacks on NLP transfer models via obfuscated high-perplexity inputs, with the novel inclusion of Bangla-language adversarial examples.
destroR
Novel technique introduced
Advances in machine learning and neural networks in recent years have led to widespread adoption of Natural Language Processing across a variety of fields, with remarkable success on a wide range of complicated problems. However, recent research has shown that machine learning models can be vulnerable in a number of ways, putting both the models and the systems they're used in at risk. In this paper, we analyze and experiment with the best existing adversarial attack recipes and create new ones. We concentrate on developing a novel adversarial attack strategy against current state-of-the-art machine learning models: producing ambiguous inputs that confound the models, thereby charting a path toward future improvements in model robustness. We craft adversarial instances with maximum perplexity, using machine learning and deep learning approaches to trick the models. In our attack recipe, we analyze several datasets, focus on creating obfuscous adversarial examples that put the models in a state of perplexity, and bring the Bangla language into the field of adversarial attacks. Throughout our work, we strictly uphold reduced utility usage and efficiency.
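The abstract's central idea is that inputs with high perplexity under a language model tend to confuse downstream NLP models. As a rough illustration (an assumption of this summary, not the authors' released code), the sketch below scores candidate obfuscated inputs by their GPT-2 perplexity via Hugging Face transformers; the `perplexity` helper and the choice of GPT-2 as the scoring model are hypothetical.

```python
# Minimal sketch (not the authors' code): rank candidate obfuscated inputs
# by perplexity under GPT-2, so an attacker can keep the most confusing one.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity = exp(mean per-token negative log-likelihood) under GPT-2."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])  # loss is the mean token NLL
    return torch.exp(out.loss).item()

# Among obfuscated variants of the same input, prefer the highest-perplexity one.
variants = ["the movie was great", "th3 m0vie was gr8", "teh movi3 wuz gr8t"]
best = max(variants, key=perplexity)
print(best, perplexity(best))
```

In a full attack, such a scorer would be combined with a check that the victim model's prediction actually changes, so that high perplexity translates into a successful misclassification.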
Key Contributions
- Novel adversarial attack recipe ('destroR') for NLP transfer learning models using high-perplexity obfuscated text examples
- Extension of adversarial NLP attacks to the Bangla language, expanding coverage beyond predominantly English-language benchmarks (a toy obfuscation sketch follows this list)
- Empirical analysis of existing adversarial attack strategies under efficiency and resource-usage constraints
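To make the Bangla extension concrete, here is a toy obfuscation sketch (hypothetical, not the paper's published recipe): it inserts zero-width Unicode characters that are near-invisible to a reader but fragment the token sequence a model receives. The `obfuscate` function and its insertion rate are illustrative assumptions.

```python
# Toy sketch (assumption, not the paper's recipe): obfuscate Bangla text by
# inserting zero-width characters. The rendered string looks essentially
# unchanged to a human, but the model's tokenization is fragmented.
import random

ZW = ["\u200b", "\u200c", "\u200d"]  # zero-width space / non-joiner / joiner

def obfuscate(text: str, rate: float = 0.3, seed: int = 0) -> str:
    rng = random.Random(seed)
    out = []
    for ch in text:
        out.append(ch)
        if rng.random() < rate:  # randomly follow a character with a zero-width mark
            out.append(rng.choice(ZW))
    return "".join(out)

bangla = "আমি বাংলায় গান গাই"  # "I sing in Bangla"
print(obfuscate(bangla))
```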
🛡️ Threat Analysis
The paper crafts adversarial inputs (obfuscous examples with maximum perplexity) to fool NLP models at inference time — this is classic input manipulation/adversarial example generation, targeting text classification and similar tasks.
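As a concrete illustration of this attack pattern (a sketch under assumptions, not the paper's exact procedure), the snippet below runs a greedy character-substitution loop against an off-the-shelf sentiment classifier, keeping any edit that lowers the victim's confidence in its original label. The victim pipeline and the `LEET` substitution table are stand-ins for whatever transfer model and perturbation operator an attacker targets.

```python
# Sketch of inference-time input manipulation (stand-ins, not the paper's
# exact recipe): greedily apply character obfuscations and keep each change
# that lowers the victim classifier's confidence in its original label.
from transformers import pipeline

victim = pipeline("sentiment-analysis")  # any fine-tuned transfer model works here

LEET = {"a": "@", "e": "3", "i": "1", "o": "0", "s": "$"}

def attack(text: str) -> str:
    orig = victim(text)[0]["label"]  # the label we try to flip

    def confidence(t: str) -> float:
        # Positive score while the label holds; negative once it flips.
        pred = victim(t)[0]
        return pred["score"] if pred["label"] == orig else -pred["score"]

    best, best_conf = text, confidence(text)
    for i, ch in enumerate(text):
        sub = LEET.get(ch.lower())
        if sub is None:
            continue
        cand = best[:i] + sub + best[i + 1:]  # same-length swap keeps indices aligned
        cand_conf = confidence(cand)
        if cand_conf < best_conf:  # keep edits that hurt the victim most
            best, best_conf = cand, cand_conf
        if best_conf < 0:  # label already flipped; stop early
            break
    return best

print(attack("the movie was absolutely wonderful"))
```

The greedy loop is deliberately simple; it shows why query access to a deployed model is enough for this class of attack, which is the core of the threat the paper analyzes.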