ML Security Papers

Latest papers

3 papers

defense · arXiv · Mar 16, 2026

Training-free Detection of Generated Videos via Spatial-Temporal Likelihoods

Omer Ben Hayun, Roy Betser, Meir Yossef Levi et al. · Technion – Israel Institute of Technology

Training-free detector that scores videos against real-data statistics using spatial-temporal likelihoods to identify AI-generated content

Output Integrity Attack · vision, multimodal

defense · arXiv · Feb 23, 2026

VALD: Multi-Stage Vision Attack Detection for Efficient LVLM Defense

Nadav Kadvil, Ayellet Tal · Technion – Israel Institute of Technology

Defends LVLMs against adversarial image attacks via two-stage detection and agentic LLM response consolidation without retraining

Input Manipulation Attack, Prompt Injection · vision, nlp, multimodal

defense · arXiv · Aug 19, 2025

CRISP: Persistent Concept Unlearning via Sparse Autoencoders

Tomer Ashuach, Dana Arad, Aaron Mueller et al. · Technion – Israel Institute of Technology · Boston University +1 more

Permanently removes dangerous LLM knowledge by suppressing sparse autoencoder features via fine-tuning, blocking adversarial bypass of inference-time safety measures

Prompt Injection · nlp
