
Unveiling Unicode's Unseen Underpinnings in Undermining Authorship Attribution

Robert Dilworth


Published on arXiv (2508.15840)

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Proposes, as a theoretical framework, that layering Unicode steganographic embeddings on top of adversarial stylometric obfuscation can strengthen evasion of authorship attribution systems; empirical validation is deferred to future work.

Unicode Zero-Width Character Adversarial Stylometry

Novel technique introduced


When using a public communication channel--whether formal or informal, such as commenting or posting on social media--end users have no expectation of privacy: they compose a message and broadcast it for the world to see. Even if an end user takes utmost precautions to anonymize their online presence--using an alias or pseudonym; masking their IP address; spoofing their geolocation; concealing their operating system and user agent; deploying encryption; registering with a disposable phone number or email; disabling non-essential settings; revoking permissions; and blocking cookies and fingerprinting--one obvious element still lingers: the message itself. Assuming they avoid lapses in judgment or accidental self-exposure, there should be little evidence to validate their actual identity, right? Wrong. The content of their message--necessarily open for public consumption--exposes an attack vector: stylometric analysis, or author profiling. In this paper, we dissect the technique of stylometry, discuss an antithetical counter-strategy in adversarial stylometry, and devise enhancements through Unicode steganography.
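The Unicode steganography the abstract refers to can be illustrated with zero-width code points that render invisibly but survive copy-paste. The following is a minimal sketch, assuming a simple two-symbol encoding (U+200B for bit 0, U+200C for bit 1); the paper's exact embedding scheme may differ.

```python
# Sketch of zero-width Unicode steganography (illustrative encoding,
# not the paper's exact scheme): hide a payload's bits inside visible
# text using invisible code points.

ZW0 = "\u200b"  # ZERO WIDTH SPACE encodes bit 0
ZW1 = "\u200c"  # ZERO WIDTH NON-JOINER encodes bit 1

def embed(cover: str, payload: str) -> str:
    """Append the payload's bits as invisible characters after the cover text."""
    bits = "".join(f"{ord(c):08b}" for c in payload)
    return cover + "".join(ZW1 if b == "1" else ZW0 for b in bits)

def extract(stego: str) -> str:
    """Recover the payload by reading zero-width characters back as bits."""
    bits = "".join("1" if ch == ZW1 else "0"
                   for ch in stego if ch in (ZW0, ZW1))
    return "".join(chr(int(bits[i:i + 8], 2)) for i in range(0, len(bits), 8))

stego = embed("An innocuous public comment.", "id42")
assert stego != "An innocuous public comment."  # carrier differs, but invisibly
assert extract(stego) == "id42"                 # hidden payload round-trips
```

The stego text displays identically to the cover text in most renderers, which is what makes the channel attractive for perturbing the character-level features that stylometric classifiers rely on.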


Key Contributions

  • Framework combining adversarial stylometry (imitation, machine translation, obfuscation) with Unicode zero-width character steganography to defeat authorship attribution systems
  • Theoretical analysis of the trade-offs between steganographic embedding and stylometric evasion effectiveness
  • Identification of Unicode steganography as a novel enhancement layer for adversarial stylometry pipelines

🛡️ Threat Analysis

Input Manipulation Attack

Adversarial stylometry is an inference-time evasion attack against NLP authorship attribution classifiers. The paper proposes techniques (Unicode steganography, author imitation, iterative machine translation, obfuscation) for crafting text inputs that fool stylometric ML systems into misattributing authorship.
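A natural defender-side counter to the zero-width component of this attack, not part of the paper itself, is to normalize inputs before attribution. The sketch below assumes it is acceptable to drop all Unicode "format" (Cf) code points, which covers zero-width spaces, joiners, and directional marks.

```python
import unicodedata

# Hypothetical preprocessing step for an attribution pipeline: fold
# compatibility variants with NFKC, then drop category-Cf code points
# so zero-width insertions cannot perturb token statistics.

def normalize_for_attribution(text: str) -> str:
    folded = unicodedata.normalize("NFKC", text)
    return "".join(ch for ch in folded if unicodedata.category(ch) != "Cf")

tainted = "pub\u200blic\u200c post"
assert normalize_for_attribution(tainted) == "public post"
```

Note that this only neutralizes the steganographic layer; the stylometric obfuscation layers (imitation, machine translation) operate on visible wording and require different defenses.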


Details

Domains: nlp
Model Types: traditional_ml, transformer
Threat Tags: inference_time, digital
Applications: authorship attribution, stylometric analysis, author profiling