Latest papers

2 papers
defense arXiv Mar 11, 2026

Backdoor Directions in Vision Transformers

Sengim Karayalcin, Marina Krcek, Pin-Yu Chen et al. · Leiden University · Radboud University +2 more

Identifies causal 'trigger directions' in ViT activations to analyze, remove, and detect backdoors via weight-space interventions

Model Poisoning vision
benchmark arXiv Oct 31, 2025

EL-MIA: Quantifying Membership Inference Risks of Sensitive Entities in LLMs

Ali Satvaty, Suzan Verberne, Fatih Turkmen · University of Groningen · Leiden University

Benchmarks entity-level membership inference of PII and sensitive data in LLMs, revealing limits of existing MIA methods

Membership Inference Attack nlp
1 citation