benchmark arXiv Apr 24, 2026
Coenraad Mouton, Randle Rabe, Niklas C. Koser et al. · University Hospital Schleswig-Holstein · North-West University
Adversarial training on medical images sacrifices in-distribution accuracy for better OOD robustness by relying on robust rather than nonrobust features
Input Manipulation Attack vision
We study whether deep networks for medical imaging learn useful nonrobust features (predictive input patterns that are not human-interpretable and are highly susceptible to small adversarial perturbations) and how these features affect test performance. We show that models trained only on nonrobust features achieve well-above-chance accuracy across five MedMNIST classification tasks, confirming their predictive value in-distribution. Conversely, adversarially trained models, which rely primarily on robust features, sacrifice in-distribution accuracy but perform markedly better under controlled distribution shifts (MedMNIST-C). Overall, nonrobust features boost standard accuracy yet degrade out-of-distribution performance, revealing a practical robustness-accuracy trade-off in medical image classification; how to balance it should depend on the requirements of the deployment setting.
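To make "highly susceptible to small adversarial perturbations" concrete, here is a minimal sketch using the fast gradient sign method (FGSM) on a toy linear classifier. FGSM, the weights, the input, and the helper name `fgsm_perturb` are all illustrative assumptions; the abstract does not say which attack or architecture the authors use. The toy model spreads its decision over many weakly weighted inputs, so a small per-feature change is enough to flip the prediction.

```python
import numpy as np

def fgsm_perturb(x, y, w, b, eps):
    """Hypothetical helper: one FGSM step against a binary logistic classifier.

    x: (n, d) inputs, y: (n,) labels in {0, 1}, w: (d,) weights, b: bias.
    Returns x + eps * sign(d loss / d x) for the cross-entropy loss.
    """
    z = x @ w + b
    p = 1.0 / (1.0 + np.exp(-z))             # predicted probability of class 1
    grad_x = (p - y)[:, None] * w[None, :]   # gradient of cross-entropy w.r.t. x
    return x + eps * np.sign(grad_x)

# Toy setup (assumed values): 100 features, each with a small weight.
w = np.full(100, 0.05)
b = 0.0
x = np.full((1, 100), 0.1)   # clean input, true label 1
y = np.array([1.0])

x_adv = fgsm_perturb(x, y, w, b, eps=0.2)

clean_logit = (x @ w + b)[0]      # positive: classified as class 1
adv_logit = (x_adv @ w + b)[0]    # negative: prediction flipped
max_change = np.max(np.abs(x_adv - x))  # bounded by eps per feature
```

Each feature moves by at most `eps`, yet the logit changes sign because the shifts add up across all 100 features. Adversarial training, in the general sense, means fitting the model on inputs perturbed this way so that it cannot rely on such fragile patterns.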
cnn