benchmark 2026

Useful nonrobust features are ubiquitous in biomedical images

Coenraad Mouton 1,2, Randle Rabe 2, Niklas C. Koser 1, Nicolai Krekiehn 1, Christopher Hansen 1, Jan-Bernd Hövener 1, Claus-C. Glüer 1

Published on arXiv

2604.22579

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

Models trained only on nonrobust features achieve well-above-chance accuracy in-distribution, but adversarially trained models that rely mainly on robust features perform markedly better under controlled distribution shifts (MedMNIST-C).


We study whether deep networks for medical imaging learn useful nonrobust features, that is, predictive input patterns that are not human-interpretable and are highly susceptible to small adversarial perturbations, and how these features impact test performance. We show that models trained only on nonrobust features achieve well-above-chance accuracy across five MedMNIST classification tasks, confirming their predictive value in-distribution. Conversely, adversarially trained models that primarily rely on robust features sacrifice in-distribution accuracy but yield markedly better performance under controlled distribution shifts (MedMNIST-C). Overall, nonrobust features boost standard accuracy yet degrade out-of-distribution performance, revealing a practical robustness-accuracy trade-off in medical imaging classification tasks that should be tailored to the requirements of the deployment setting.


Key Contributions

  • First systematic investigation of robust vs nonrobust features across five MedMNIST classification tasks, covering modalities including CT, radiography, ultrasound, and histopathology
  • Demonstrates that models trained only on nonrobust features achieve above-chance accuracy in-distribution but degrade under distribution shift
  • Shows that adversarially trained models sacrifice standard accuracy for improved OOD robustness, revealing a practical robustness-accuracy trade-off in medical imaging

🛡️ Threat Analysis

Input Manipulation Attack

Paper studies adversarial perturbations and adversarial training in medical imaging classification. Evaluates models' reliance on robust vs nonrobust features using adversarial examples (Eq. 1 defines adversarial perturbations). Demonstrates that nonrobust features are susceptible to adversarial perturbations and that adversarially trained models achieve better robustness. Core contribution is understanding how adversarial robustness affects medical imaging DNNs.
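
To make the notion of a small adversarial perturbation concrete, here is a minimal sketch in Python. It is not the paper's Eq. 1 or its attack, which this summary does not reproduce. It assumes a one-step L-infinity attack (the fast gradient sign method) against a toy logistic-regression classifier, and the function names, toy data, and the budget `eps` are all hypothetical choices made for illustration.

```python
import numpy as np


def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))


def fgsm_perturb(x, y, w, b, eps):
    """One-step L-infinity attack (FGSM) on a logistic-regression classifier.

    x: (n, d) inputs in [0, 1]; y: (n,) labels in {0, 1};
    w: (d,) weights; b: scalar bias; eps: perturbation budget (assumed value).
    """
    p = sigmoid(x @ w + b)
    # Gradient of the binary cross-entropy loss with respect to the input.
    grad_x = (p - y)[:, None] * w[None, :]
    # Move every pixel by eps in the direction that increases the loss,
    # then clip back to the valid intensity range.
    return np.clip(x + eps * np.sign(grad_x), 0.0, 1.0)


def accuracy(x, y, w, b):
    pred = sigmoid(x @ w + b) > 0.5
    return float(np.mean(pred == (y == 1)))


# Toy data (illustrative only): each class differs from mid-gray by a faint
# offset of 0.02 per pixel, a weak but predictive signal spread over 100 pixels.
d = 100
y = np.array([1.0, 0.0, 1.0, 0.0])
x = 0.5 + 0.02 * (2.0 * y - 1.0)[:, None] * np.ones((1, d))
w = np.ones(d)
b = -0.5 * d

eps = 0.05
x_adv = fgsm_perturb(x, y, w, b, eps)

clean_acc = accuracy(x, y, w, b)
adv_acc = accuracy(x_adv, y, w, b)
max_change = float(np.max(np.abs(x_adv - x)))
print(clean_acc, adv_acc, max_change)
```

In this toy setup the classifier relies on a faint per-pixel signal, so a per-pixel change of at most `eps` is enough to flip every prediction. That is the sense in which a feature can be predictive yet nonrobust; real medical imaging models and attacks are considerably more involved.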


Details

Domains
vision
Model Types
cnn
Threat Tags
inference_time, digital
Datasets
MedMNIST, OrganSMNIST, MedMNIST-C
Applications
medical imaging, radiology, digital pathology, organ classification