Defense · 2025

TopoReformer: Mitigating Adversarial Attacks Using Topological Purification in OCR Models

Bhagyesh Kumar, A S Aravinthakashan, Akshat Satyanarayan, Ishaan Gakhar, Ujjwal Verma

0 citations · 48 references · arXiv


Published on arXiv · 2511.15807

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

TopoReformer suppresses adversarial artifacts across FGSM, PGD, C&W, EOT, and BPDA attacks on OCR text images without requiring adversarial training examples, with up to 5% accuracy improvement under C&W attacks.

TopoReformer

Novel technique introduced


Adversarially perturbed images of text can cause sophisticated OCR systems to produce misleading or incorrect transcriptions from changes that are nearly invisible to humans. Some of these perturbations even survive physical capture, posing security risks to high-stakes applications such as document processing, license plate recognition, and automated compliance systems. Existing defenses, such as adversarial training, input preprocessing, or post-recognition correction, are often model-specific, computationally expensive, and degrade performance on unperturbed inputs while remaining vulnerable to unseen or adaptive attacks. To address these challenges, TopoReformer is introduced: a model-agnostic reformation pipeline that mitigates adversarial perturbations while preserving the structural integrity of text images. Topology studies properties of shapes and spaces that remain unchanged under continuous deformations, focusing on global structures such as connectivity, holes, and loops rather than exact distances. Leveraging these topological features, TopoReformer employs a topological autoencoder to enforce manifold-level consistency in latent space and improve robustness without explicit gradient regularization. The proposed method is benchmarked on EMNIST and MNIST against standard adversarial attacks (FGSM, PGD, Carlini-Wagner), adaptive attacks (EOT, BPDA), and an OCR-specific watermark attack (FAWA).
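To make the "manifold-level consistency" idea concrete, here is a minimal sketch of a topological loss of the kind used in topological autoencoders. It relies on a standard fact: the 0-dimensional persistence diagram of a point cloud's Rips filtration is determined by the edge lengths of its minimum spanning tree. The function names (`pairwise_dists`, `mst_edge_lengths`, `topo_loss`) are illustrative, not taken from the paper's code, and the loss below matches only connectivity scales, not higher-order features.

```python
import numpy as np

def pairwise_dists(x):
    # Euclidean distance matrix for a batch of flattened points.
    sq = np.sum(x ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * x @ x.T
    return np.sqrt(np.maximum(d2, 0.0))

def mst_edge_lengths(dist):
    # Prim's algorithm; the sorted MST edge lengths equal the death times
    # of the 0-dimensional persistence diagram of the Rips filtration.
    n = dist.shape[0]
    in_tree = np.zeros(n, dtype=bool)
    in_tree[0] = True
    best = dist[0].copy()
    edges = []
    for _ in range(n - 1):
        best[in_tree] = np.inf          # never re-add a tree vertex
        j = int(np.argmin(best))
        edges.append(best[j])
        in_tree[j] = True
        best = np.minimum(best, dist[j])
    return np.sort(np.array(edges))

def topo_loss(x_batch, z_batch):
    # Manifold-level consistency: penalize mismatch between the
    # connectivity scales of the input batch and its latent embedding.
    dx = mst_edge_lengths(pairwise_dists(x_batch))
    dz = mst_edge_lengths(pairwise_dists(z_batch))
    return float(np.mean((dx - dz) ** 2))
```

In a real pipeline this term would be differentiated through the encoder and added to the reconstruction loss, encouraging latents that preserve the clean-data manifold's topology.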


Key Contributions

  • Topological autoencoder for adversarial image purification that enforces manifold-level consistency using persistent homology loss, trained solely on clean data
  • Freeze-Flow training paradigm that routes gradients through an auxiliary module to encourage topology-consistent latents, yielding up to 5% classification improvement under C&W attacks
  • Model-agnostic drop-in defense pipeline for OCR systems robust to white-box, black-box, and adaptive attacks (EOT, BPDA) without adversarial retraining
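The "drop-in" claim above amounts to composing the purifier with any recognizer at inference time. A minimal sketch, with all names (`make_purified_ocr`, `purifier`, `ocr_model`) hypothetical and toy stand-ins replacing the trained autoencoder and the real OCR model:

```python
import numpy as np

def make_purified_ocr(purifier, ocr_model):
    # Model-agnostic wrapper: reform the input through the purifier
    # (in the paper, the topological autoencoder's encode-decode pass)
    # before handing it to any downstream OCR model.
    def robust_ocr(image):
        return ocr_model(purifier(image))
    return robust_ocr

# Toy stand-ins: the "purifier" clamps pixel values back to the valid
# range (a real purifier reconstructs through the autoencoder), and the
# "OCR model" thresholds mean intensity into one of two labels.
toy_purifier = lambda img: np.clip(img, 0.0, 1.0)
toy_ocr = lambda img: "dark" if img.mean() < 0.5 else "light"

robust = make_purified_ocr(toy_purifier, toy_ocr)
```

Because the wrapper only requires a callable recognizer, no retraining of the OCR model is needed, which is what makes the defense attack- and model-agnostic.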

🛡️ Threat Analysis

Input Manipulation Attack

Proposes a defense against adversarial examples (FGSM, PGD, C&W, EOT, BPDA, FAWA) that cause OCR misclassification at inference time: a classic input manipulation threat, mitigated here via input purification.


Details

Domains
vision
Model Types
cnn · transformer
Threat Tags
white_box · black_box · inference_time · digital · physical
Datasets
EMNIST · MNIST
Applications
ocr systems · document processing · license plate recognition · automated compliance systems