Latest papers

10 papers
attack arXiv Mar 26, 2026

Shape and Substance: Dual-Layer Side-Channel Attacks on Local Vision-Language Models

Eyal Hadad, Mordechai Guri · Ben-Gurion University of the Negev

Side-channel attack extracting image geometry and semantic content from local VLMs via timing and cache contention analysis

Output Integrity Attack Sensitive Information Disclosure multimodal vision
PDF
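The core idea behind such timing side channels can be sketched in a few lines (a toy illustration, not the paper's method; all names and numbers here are hypothetical): an observer who can only measure how long a local VLM takes per image can still infer coarse geometry, because processing cost grows with resolution.

```python
import random

def simulated_vlm_latency(num_pixels: int, rng: random.Random) -> float:
    """Toy stand-in for a local VLM: per-image latency (seconds) grows
    with resolution, plus measurement jitter."""
    return 1e-6 * num_pixels + rng.gauss(0.0, 5e-4)

def infer_image_shape(latency_s: float, threshold_s: float = 0.1) -> str:
    """Attacker-side classifier: long inference => high-resolution input."""
    return "large" if latency_s > threshold_s else "small"

rng = random.Random(0)
t_small = simulated_vlm_latency(128 * 128, rng)  # ~0.016 s
t_large = simulated_vlm_latency(512 * 512, rng)  # ~0.26 s
print(infer_image_shape(t_small), infer_image_shape(t_large))  # → small large
```

A real attack would replace the simulated latencies with wall-clock or cache-contention measurements and a learned rather than hand-set threshold.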
defense arXiv Feb 1, 2026

Step-Wise Refusal Dynamics in Autoregressive and Diffusion Language Models

Eliron Rahimi, Elad Hirshel, Rom Himelstein et al. · Technion - Israel Institute of Technology · Ben-Gurion University of the Negev +1 more

Defends AR and diffusion LLMs against jailbreaks via SRI signal detecting incomplete internal recovery with 100× lower overhead

Prompt Injection nlp
PDF Code
defense arXiv Jan 31, 2026

Provably Protecting Fine-Tuned LLMs from Training Data Extraction

Tom Segal, Asaf Shabtai, Yuval Elovici · Ben-Gurion University of the Negev

Defends fine-tuned LLMs against training data extraction with provable Near Access Freeness guarantees and no utility loss

Model Inversion Attack Sensitive Information Disclosure nlp
PDF
survey arXiv Jan 14, 2026

The Promptware Kill Chain: How Prompt Injections Gradually Evolved Into a Multistep Malware Delivery Mechanism

Oleg Brodt, Elad Feldman, Bruce Schneier et al. · Ben-Gurion University of the Negev · Tel Aviv University +2 more

Surveys 36 LLM attack incidents and proposes a seven-stage promptware kill chain mapping prompt injection to multi-step malware delivery

Prompt Injection Excessive Agency nlp
PDF
defense arXiv Dec 23, 2025

Bridging Efficiency and Safety: Formal Verification of Neural Networks with Early Exits

Yizhak Yisrael Elboher, Avraham Raviv, Amihay Elboher et al. · The Hebrew University of Jerusalem · Bar Ilan University +2 more

Formal verification framework for early exit neural networks that certifies local robustness and improves verification efficiency

Input Manipulation Attack vision nlp
1 citation PDF
attack arXiv Dec 23, 2025

Real-World Adversarial Attacks on RF-Based Drone Detectors

Omer Gazit, Yael Itzhakev, Yuval Elovici et al. · Ben-Gurion University of the Negev

First physical adversarial attack on RF drone detectors via OTA I/Q waveforms that fool YOLO/Faster R-CNN spectrogram object detection

Input Manipulation Attack vision
PDF
benchmark arXiv Sep 25, 2025

No Prior, No Leakage: Revisiting Reconstruction Attacks in Trained Neural Networks

Yehonatan Refael, Guy Smorodinsky, Ofir Lindenbaum et al. · Tel Aviv University · Ben-Gurion University of the Negev +1 more

Theoretically proves reconstruction attacks on neural networks are fundamentally unreliable without prior data knowledge, and that better-trained models leak less

Model Inversion Attack vision
PDF
attack arXiv Sep 16, 2025

MIA-EPT: Membership Inference Attack via Error Prediction for Tabular Data

Eyal German, Daniel Samira, Yuval Elovici et al. · Ben-Gurion University of the Negev

Black-box membership inference attack on tabular diffusion models using attribute masking and reconstruction error signals

Membership Inference Attack tabular generative
PDF Code
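The reconstruction-error signal behind this style of membership inference can be sketched as follows (a minimal toy, not MIA-EPT itself; the memorising generator is a deliberately worst-case stand-in for a real tabular diffusion model):

```python
import numpy as np

class ToyTabularGenerator:
    """Stand-in for a tabular generative model that memorises its
    training rows -- the worst case for membership privacy."""
    def __init__(self, train: np.ndarray):
        self.train = train
        self.col_mean = train.mean(axis=0)

    def impute(self, row: np.ndarray, masked: int) -> float:
        # Reconstruct the masked attribute: exact recall for training
        # rows, fall back to the column mean for everything else.
        visible = [i for i in range(row.size) if i != masked]
        for t in self.train:
            if np.allclose(t[visible], row[visible]):
                return float(t[masked])
        return float(self.col_mean[masked])

def membership_score(model, row: np.ndarray, masked: int = 0) -> float:
    """MIA-style signal: low reconstruction error on a masked
    attribute suggests the row was in the training set."""
    return abs(model.impute(row, masked) - row[masked])

train = np.array([[1.0, 2.0], [3.0, 4.0]])
model = ToyTabularGenerator(train)
print(membership_score(model, train[0]))              # member: 0.0
print(membership_score(model, np.array([5.0, 6.0])))  # non-member: 3.0
```

The actual attack is black-box: it queries the model's imputation interface across many masked attributes and thresholds the aggregated error.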
attack NDSS Aug 24, 2025

Trust Me, I Know This Function: Hijacking LLM Static Analysis using Bias

Shir Bernstein, David Beste, Daniel Ayzenshteyn et al. · Ben-Gurion University of the Negev · CISPA Helmholtz Center for Information Security

Adversarial code inputs exploiting LLM pattern-recognition bias to hijack static analysis and hide bugs from code-reviewing LLMs

Input Manipulation Attack Prompt Injection nlp
PDF
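The bias being exploited can be illustrated with a hypothetical snippet (not from the paper): a reviewer, human or LLM, that trusts the name and docstring will pass this code, even though the "sanitizer" is a no-op.

```python
def sanitize_sql_input(user_input: str) -> str:
    """Validated helper: strips dangerous characters before query use."""
    # The name and docstring signal safety, but the body does nothing --
    # a pattern-trusting reviewer that skips the implementation misses
    # the injection path entirely.
    return user_input

# The injection payload survives "sanitization" untouched.
query = "SELECT * FROM users WHERE name = '%s'" % sanitize_sql_input("' OR 1=1 --")
```

The attack surface is exactly this gap between what identifiers claim and what the code does, scaled up with adversarially crafted inputs.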
tool arXiv Aug 24, 2025

FRAME: Comprehensive Risk Assessment Framework for Adversarial Machine Learning Threats

Avishag Shapira, Simon Shigol, Asaf Shabtai · Ben-Gurion University of the Negev

Automated AML risk assessment tool that scores threat feasibility across adversarial attack types for real-world ML deployments using LLM-assisted customization

Input Manipulation Attack Data Poisoning Attack vision nlp tabular
PDF