
Exploring Secure Machine Learning Through Payload Injection and FGSM Attacks on ResNet-50

Umesh Yadav 1, Suman Niroula 2, Gaurav Kumar Gupta, Bicky Yadav 3

2 citations · 20 references · SVCC


Published on arXiv · 2501.02147

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Payload injection manipulates ResNet-50 predictions in 93.33% of tested samples; FGSM leaves overall accuracy unchanged but measurably increases model confidence in its incorrect predictions.

FGSM

Payload injection — novel technique introduced


This paper investigates the resilience of a ResNet-50 image classification model under two prominent security threats: Fast Gradient Sign Method (FGSM) adversarial attacks and malicious payload injection. Initially, the model attains a 53.33% accuracy on clean images. When subjected to FGSM perturbations, its overall accuracy remains unchanged; however, the model's confidence in incorrect predictions notably increases. Concurrently, a payload injection scheme is successfully executed in 93.33% of the tested samples, revealing how stealthy attacks can manipulate model predictions without degrading visual quality. These findings underscore the vulnerability of even high-performing neural networks and highlight the urgency of developing more robust defense mechanisms for security-critical applications.
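The FGSM step described above can be sketched in a few lines. The toy logistic-regression "model" below is an assumption purely for illustration (the paper attacks a pre-trained ResNet-50, where the input gradient comes from autograd rather than this closed-form expression), but the attack itself is the standard FGSM rule: perturb the input by ε in the direction of the sign of the loss gradient.

```python
import numpy as np

rng = np.random.default_rng(0)
w = rng.normal(size=16)   # fixed weights of a toy logistic-regression model (assumed)
x = rng.normal(size=16)   # clean input
y = 1.0                   # true label

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# For binary cross-entropy on a logistic model, the gradient of the loss
# with respect to the INPUT is (sigmoid(w·x) - y) * w.
grad_x = (sigmoid(w @ x) - y) * w

eps = 0.05                          # L-infinity perturbation budget
x_adv = x + eps * np.sign(grad_x)   # FGSM: one signed-gradient step

# Every pixel/feature moves by at most eps, so the perturbation stays small.
print(np.max(np.abs(x_adv - x)))    # 0.05
```

Because the perturbation is bounded by ε per component, FGSM examples remain visually close to the clean input, which is why accuracy effects and confidence effects can diverge, as reported here.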


Key Contributions

  • Demonstrates that FGSM perturbations increase model confidence in incorrect predictions even when overall accuracy is unchanged on ResNet-50
  • Implements a payload injection scheme that successfully manipulates ResNet-50 predictions in 93.33% of samples while preserving visual image quality
  • Combines adversarial perturbation and steganographic payload injection in a single evaluation framework to expose dual vulnerabilities of pre-trained CNNs

🛡️ Threat Analysis

Input Manipulation Attack

FGSM is a canonical gradient-based adversarial attack that induces misclassification at inference time. The payload injection attack likewise manipulates model predictions through crafted inputs at inference time, with no training-set involvement; both are therefore input manipulation attacks against a deployed ResNet-50 classifier.
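A stealthy payload can be hidden in an image without visibly degrading it. The sketch below uses least-significant-bit (LSB) embedding, a common steganographic technique; the paper's exact injection scheme is not specified here, so treat this as an illustrative assumption rather than the authors' method.

```python
import numpy as np

def embed_payload(image: np.ndarray, payload: bytes) -> np.ndarray:
    """Hide payload bits in the least significant bit of each pixel byte."""
    bits = np.unpackbits(np.frombuffer(payload, dtype=np.uint8))
    flat = image.flatten().copy()
    assert bits.size <= flat.size, "payload too large for cover image"
    flat[: bits.size] = (flat[: bits.size] & 0xFE) | bits  # clear LSB, write bit
    return flat.reshape(image.shape)

def extract_payload(image: np.ndarray, n_bytes: int) -> bytes:
    """Read the payload back out of the LSBs."""
    bits = image.flatten()[: n_bytes * 8] & 1
    return np.packbits(bits).tobytes()

cover = np.full((8, 8, 3), 128, dtype=np.uint8)  # dummy 8x8 RGB cover image
stego = embed_payload(cover, b"hi")

print(extract_payload(stego, 2))                               # b'hi'
print(np.max(np.abs(stego.astype(int) - cover.astype(int))))   # 1
```

Each pixel value changes by at most 1 out of 255, which is imperceptible to a human viewer, matching the paper's point that such attacks can manipulate predictions without degrading visual quality.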


Details

Domains
vision
Model Types
cnn
Threat Tags
white_box · inference_time · digital
Applications
image classification · facial recognition · autonomous driving