Stjepan Picek

Papers in Database (3)

defense arXiv Mar 11, 2026 · 26d ago

Backdoor Directions in Vision Transformers

Sengim Karayalcin, Marina Krcek, Pin-Yu Chen et al. · Leiden University · Radboud University +2 more

Identifies causal 'trigger directions' in ViT activations to analyze, remove, and detect backdoors via weight-space interventions

Model Poisoning vision
attack arXiv Mar 10, 2026 · 27d ago

Removing the Trigger, Not the Backdoor: Alternative Triggers and Latent Backdoors

Gorka Abad, Ermes Franch, Stefanos Koffas et al. · University of Bergen · Delft University of Technology +2 more

Proves backdoor-trained models stay exploitable via alternative triggers even after defenses neutralize the original training trigger

Model Poisoning vision
attack arXiv Sep 15, 2025

NeuroStrike: Neuron-Level Attacks on Aligned LLMs

Lichao Wu, Sasha Behrouzi, Mohamadreza Rostami et al. · Technical University of Darmstadt · University of Zagreb +1 more

Bypasses LLM safety alignment by pruning <0.6% of sparse safety neurons, achieving 76.9% ASR across 20+ aligned LLMs

Input Manipulation Attack Prompt Injection nlp multimodal