Empirical evaluation of the Frank-Wolfe methods for constructing white-box adversarial attacks

Kristina Korotkova 1, Aleksandr Katrutsa 2

0 citations · 44 references · arXiv

Published on arXiv · 2512.10936

Input Manipulation Attack

OWASP ML Top 10: ML01

Key Finding

Frank-Wolfe projection-free methods outperform projection-based baselines (PGD, FGSM) under l1 constraints by exploiting sparse solution structure, while offering competitive performance under l2 and l-inf norms across logistic regression, CNN, and ViT architectures.

Frank-Wolfe adversarial attack

Novel technique introduced


The construction of adversarial attacks for neural networks is a crucial challenge for their deployment in various services. To estimate the adversarial robustness of a neural network, a fast and efficient approach to constructing adversarial attacks is needed. Since the formalization of adversarial attack construction involves solving a specific optimization problem, we consider the problem of constructing an efficient and effective adversarial attack from a numerical optimization perspective. Specifically, we suggest utilizing advanced projection-free methods, known as modified Frank-Wolfe methods, to construct white-box adversarial attacks on the given input data. We perform a theoretical and numerical evaluation of these methods and compare them with standard approaches based on projection operations or geometrical intuition. Numerical experiments are performed on the MNIST and CIFAR-10 datasets, utilizing a multiclass logistic regression model, convolutional neural networks (CNNs), and the Vision Transformer (ViT).
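The projection-free idea in the abstract can be sketched as follows: instead of a gradient step followed by a projection back onto the norm ball (as in PGD), each Frank-Wolfe iteration calls a linear maximization oracle (LMO), which has a closed form on the l-inf ball, and takes a convex-combination step that keeps the iterate feasible by construction. The linear model and loss below are illustrative assumptions, not the paper's experimental setup:

```python
import numpy as np

def frank_wolfe_linf_attack(x0, grad_fn, eps=0.3, steps=50):
    """Vanilla Frank-Wolfe attack on the ball ||x - x0||_inf <= eps.

    Maximizes a loss via its gradient: the LMO over the l-inf ball is
    the closed-form vertex x0 + eps * sign(grad), so no projection is
    ever needed -- the convex combination stays inside the ball.
    """
    x = x0.copy()
    for t in range(steps):
        g = grad_fn(x)
        s = x0 + eps * np.sign(g)   # LMO: vertex maximizing <g, s>
        gamma = 2.0 / (t + 2.0)     # standard open-loop step size
        x = x + gamma * (s - x)     # feasible by convexity of the ball
    return x

# Toy target (assumed for illustration): binary logistic loss of a
# fixed linear model w, b on a single input with label y = +1.
w, b, y = np.array([1.0, -2.0, 0.5]), 0.1, 1.0

def loss(x):
    return np.log1p(np.exp(-y * (w @ x + b)))

def grad(x):
    return -y * w / (1.0 + np.exp(y * (w @ x + b)))

x0 = np.zeros(3)
x_adv = frank_wolfe_linf_attack(x0, grad, eps=0.3)
```

Because every iterate is a convex combination of points in the ball, the perturbation budget is respected at all steps, which is the practical appeal of projection-free methods.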


Key Contributions

  • Systematic empirical evaluation of advanced Frank-Wolfe projection-free variants for adversarial example generation under l1, l2, and l-inf constraints
  • Theoretical and numerical comparison of projection-free methods against projection-based baselines (FGSM, PGD) across norm types
  • Analysis of sparsity properties of resulting adversarial perturbations and practical recommendations per norm/model class

🛡️ Threat Analysis

Input Manipulation Attack

The paper focuses on constructing white-box adversarial perturbations at inference time using gradient-based Frank-Wolfe optimization methods, directly attacking image classifiers by maximizing cross-entropy loss under norm-ball constraints: a canonical input manipulation attack.
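The key finding about sparse solutions under l1 constraints has a simple mechanical explanation: the LMO over the l1 ball returns a vertex that shifts a single coordinate, so after T Frank-Wolfe steps the perturbation touches at most T pixels. A minimal sketch of that oracle (the helper name is hypothetical):

```python
import numpy as np

def lmo_l1(x0, g, eps):
    """LMO over the l1 ball {x : ||x - x0||_1 <= eps}.

    The maximizer of <g, x> over the ball is a vertex: x0 shifted by
    eps along the single coordinate where |g| is largest.  Each
    Frank-Wolfe step therefore perturbs one coordinate, which is why
    the resulting adversarial perturbations are sparse.
    """
    i = int(np.argmax(np.abs(g)))
    s = x0.copy()
    s[i] += eps * np.sign(g[i])
    return s

g = np.array([0.1, -3.0, 0.4, 0.2])
s = lmo_l1(np.zeros(4), g, eps=1.0)
# s is nonzero only at index 1, where |g| is largest
```

Projection-based methods like PGD lack this structure: the Euclidean projection onto the l1 ball generally spreads mass over many coordinates, which is consistent with the paper's recommendation of Frank-Wolfe variants under l1 constraints.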


Details

Domains
vision
Model Types
cnn, transformer, traditional_ml
Threat Tags
white_box, inference_time, untargeted, digital
Datasets
MNIST, CIFAR-10
Applications
image classification