
The Double-Edged Sword of Data-Driven Super-Resolution: Adversarial Super-Resolution Models

Haley Duba-Sullivan, Steven R. Young, Emma J. Reid

0 citations · 18 references · arXiv (Cornell University)


Published on arXiv · 2602.07251

Model Poisoning

OWASP ML Top 10 — ML10

Key Finding

AdvSR achieves high attack success rates, inducing downstream YOLOv11 misclassification with minimal perceptible image-quality degradation, across the SRCNN, EDSR, and SwinIR architectures.

AdvSR

Novel technique introduced


Data-driven super-resolution (SR) methods are often integrated into imaging pipelines as preprocessing steps to improve downstream tasks such as classification and detection. However, these SR models introduce a previously unexplored attack surface into imaging pipelines. In this paper, we present AdvSR, a framework demonstrating that adversarial behavior can be embedded directly into SR model weights during training, requiring no access to inputs at inference time. Unlike prior attacks that perturb inputs or rely on backdoor triggers, AdvSR operates entirely at the model level. By jointly optimizing for reconstruction quality and targeted adversarial outcomes, AdvSR produces models that appear benign under standard image quality metrics while inducing downstream misclassification. We evaluate AdvSR on three SR architectures (SRCNN, EDSR, SwinIR) paired with a YOLOv11 classifier and demonstrate that AdvSR models can achieve high attack success rates with minimal quality degradation. These findings highlight a new model-level threat for imaging pipelines, with implications for how practitioners source and validate models in safety-critical applications.


Key Contributions

  • AdvSR framework that jointly optimizes SR reconstruction quality and adversarial objectives, embedding targeted misclassification behavior into SR model weights without requiring input perturbations or trigger patterns at inference time
  • Demonstration that SR models can be poisoned to appear benign under standard image quality metrics while inducing high downstream attack success rates against a YOLOv11 classifier
  • Evaluation across three SR architectures (SRCNN, EDSR, SwinIR), exposing a new model-level attack surface in imaging pipelines with implications for how practitioners source and validate SR models
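The joint optimization named in the contributions can be sketched as a weighted sum of a reconstruction term and an adversarial term. The sketch below is illustrative only: the function names (`advsr_loss`, `mse`, `cross_entropy`), the weight `lam`, and the toy inputs are assumptions for exposition, not the paper's implementation.

```python
import math

def mse(a, b):
    """Mean squared error between two flat pixel lists."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

def cross_entropy(probs, target_idx):
    """Negative log-likelihood of the attacker's chosen target class."""
    return -math.log(probs[target_idx])

def advsr_loss(sr_out, hr_ref, cls_probs, target_idx, lam=0.1):
    """Joint objective: the fidelity term keeps PSNR/SSIM-style quality
    high, while the adversarial term pushes the downstream classifier's
    output toward the attacker-chosen class."""
    l_rec = mse(sr_out, hr_ref)
    l_adv = cross_entropy(cls_probs, target_idx)
    return l_rec + lam * l_adv

# Toy example: near-perfect reconstruction, classifier leaning toward
# the attacker's target class 2, so both terms stay small.
loss = advsr_loss([0.1, 0.9, 0.5], [0.1, 0.9, 0.5],
                  cls_probs=[0.1, 0.2, 0.7], target_idx=2)
```

Because `lam` trades fidelity against attack strength, a small value keeps the poisoned model's outputs close to a clean model's under standard quality metrics.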

🛡️ Threat Analysis

Model Poisoning

AdvSR embeds hidden, targeted adversarial behavior directly into SR model weights during training: the poisoned model appears benign under standard image quality metrics (PSNR/SSIM) yet consistently produces outputs that induce targeted downstream misclassification. Although it forgoes traditional trigger patterns, the core threat model is model-level poisoning: the attack is injected at training time, hidden from standard inspection, and aimed at specific downstream outcomes, fitting the model poisoning/trojan threat paradigm.
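To see why PSNR-based inspection can miss such a model, here is a minimal PSNR sketch in pure Python; the pixel values and the 0.004 perturbation are assumed toy numbers, not measurements from the paper.

```python
import math

def psnr(ref, test, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two flat pixel lists;
    higher means closer. Scores above roughly 35 dB are commonly
    treated as visually indistinguishable."""
    err = sum((r - t) ** 2 for r, t in zip(ref, test)) / len(ref)
    if err == 0:
        return float("inf")
    return 10 * math.log10(max_val ** 2 / err)

# A poisoned SR output that differs from the clean one by a tiny,
# structured offset still scores well inside the "benign" range,
# even though the offset may be enough to flip a downstream classifier.
clean = [0.50, 0.52, 0.48, 0.51]
poisoned = [p + 0.004 for p in clean]
score = psnr(clean, poisoned)  # well above 40 dB
```

This is the gap the threat model exploits: quality metrics measure pixel fidelity, not downstream classifier behavior.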


Details

Domains
vision
Model Types
cnn · transformer
Threat Tags
white_box · training_time · targeted · digital
Applications
image super-resolution pipelines · object detection · image classification