destroR: Attacking Transfer Models with Obfuscous Examples to Discard Perplexity
Saadat Rafid Ahmed, Rubayet Shareen, Radoan Sharkar, Nazia Hossain, Mansur Mahi, Farig Yousuf Sadeque
Published on arXiv: 2511.11309
Input Manipulation Attack
OWASP ML Top 10 — ML01
Key Finding
Demonstrates adversarial attacks on NLP transfer models via obfuscated high-perplexity inputs, with the novel inclusion of Bangla-language adversarial examples.
destroR
Novel technique introduced
Advances in machine learning and neural networks in recent years have led to widespread adoption of Natural Language Processing across a variety of fields, with remarkable success on a wide range of complicated problems. However, recent research has shown that machine learning models can be vulnerable in a number of ways, putting both the models and the systems they're used in at risk. In this paper, we analyze and experiment with the best existing adversarial attack recipes and create new ones. We concentrate on developing a novel adversarial attack strategy against current state-of-the-art machine learning models: producing ambiguous inputs that confound the models, thereby charting a path toward future improvements in model robustness. We craft adversarial instances with maximum perplexity, using machine learning and deep learning approaches to trick the models. In our attack recipe, we analyze several datasets, focus on creating obfuscous adversarial examples that put the models in a state of perplexity, and bring the Bangla language into the field of adversarial attacks. Throughout our work, we strictly uphold reduced utility usage and efficiency.
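The abstract's central idea is that inputs with high perplexity under a language model tend to confuse downstream NLP models. As a rough illustration (an assumption of this summary, not the authors' released code), the sketch below scores candidate obfuscated inputs by their GPT-2 perplexity via Hugging Face transformers; the `perplexity` helper and the choice of GPT-2 as the scoring model are hypothetical.

```python
# Minimal sketch (not the authors' code): rank candidate obfuscated inputs
# by perplexity under GPT-2, so an attacker can keep the most confusing one.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2")
model.eval()

def perplexity(text: str) -> float:
    """Perplexity = exp(mean per-token negative log-likelihood) under GPT-2."""
    enc = tokenizer(text, return_tensors="pt")
    with torch.no_grad():
        out = model(**enc, labels=enc["input_ids"])  # loss is the mean token NLL
    return torch.exp(out.loss).item()

# Among obfuscated variants of the same input, prefer the highest-perplexity one.
variants = ["the movie was great", "th3 m0vie was gr8", "teh movi3 wuz gr8t"]
best = max(variants, key=perplexity)
print(best, perplexity(best))
```

In a full attack, such a scorer would be combined with a check that the victim model's prediction actually changes, so that high perplexity translates into a successful misclassification.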
Key Contributions
- Novel adversarial attack recipe ('destroR') for NLP transfer learning models using high-perplexity obfuscated text examples
- Extension of adversarial NLP attacks to the Bangla language, expanding coverage beyond predominantly English-language benchmarks (a toy obfuscation sketch follows this list)
- Empirical analysis of existing adversarial attack strategies under efficiency and resource-usage constraints
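To make the Bangla extension concrete, here is a toy obfuscation sketch (hypothetical, not the paper's published recipe): it inserts zero-width Unicode characters that are near-invisible to a reader but fragment the token sequence a model receives. The `obfuscate` function and its insertion rate are illustrative assumptions.

```python
# Toy sketch (assumption, not the paper's recipe): obfuscate Bangla text by
# inserting zero-width characters. The rendered string looks essentially
# unchanged to a human, but the model's tokenization is fragmented.
import random

ZW = ["\u200b", "\u200c", "\u200d"]  # zero-width space / non-joiner / joiner

def obfuscate(text: str, rate: float = 0.3, seed: int = 0) -> str:
    rng = random.Random(seed)
    out = []
    for ch in text:
        out.append(ch)
        if rng.random() < rate:  # randomly follow a character with a zero-width mark
            out.append(rng.choice(ZW))
    return "".join(out)

bangla = "আমি বাংলায় গান গাই"  # "I sing in Bangla"
print(obfuscate(bangla))
```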
🛡️ Threat Analysis
The paper crafts adversarial inputs (obfuscous examples with maximum perplexity) to fool NLP models at inference time — this is classic input manipulation/adversarial example generation, targeting text classification and similar tasks.
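As a concrete illustration of this attack pattern (a sketch under assumptions, not the paper's exact procedure), the snippet below runs a greedy character-substitution loop against an off-the-shelf sentiment classifier, keeping any edit that lowers the victim's confidence in its original label. The victim pipeline and the `LEET` substitution table are stand-ins for whatever transfer model and perturbation operator an attacker targets.

```python
# Sketch of inference-time input manipulation (stand-ins, not the paper's
# exact recipe): greedily apply character obfuscations and keep each change
# that lowers the victim classifier's confidence in its original label.
from transformers import pipeline

victim = pipeline("sentiment-analysis")  # any fine-tuned transfer model works here

LEET = {"a": "@", "e": "3", "i": "1", "o": "0", "s": "$"}

def attack(text: str) -> str:
    orig = victim(text)[0]["label"]  # the label we try to flip

    def confidence(t: str) -> float:
        # Positive score while the label holds; negative once it flips.
        pred = victim(t)[0]
        return pred["score"] if pred["label"] == orig else -pred["score"]

    best, best_conf = text, confidence(text)
    for i, ch in enumerate(text):
        sub = LEET.get(ch.lower())
        if sub is None:
            continue
        cand = best[:i] + sub + best[i + 1:]  # same-length swap keeps indices aligned
        cand_conf = confidence(cand)
        if cand_conf < best_conf:  # keep edits that hurt the victim most
            best, best_conf = cand, cand_conf
        if best_conf < 0:  # label already flipped; stop early
            break
    return best

print(attack("the movie was absolutely wonderful"))
```

The greedy loop is deliberately simple; it shows why query access to a deployed model is enough for this class of attack, which is the core of the threat the paper analyzes.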