Latest papers

4 papers
benchmark · arXiv · Feb 23, 2026

Agents of Chaos

Natalie Shapira, Chris Wendler, Avery Yen et al. · Northeastern University · Independent Researcher +11 more

Red-teams live autonomous LLM agents over two weeks, documenting 11 case studies of dangerous failures, including system takeover, denial of service (DoS), and sensitive data disclosure

Excessive Agency · Prompt Injection · Insecure Plugin Design · nlp
3 citations
tool · arXiv · Aug 21, 2025

PickleBall: Secure Deserialization of Pickle-based Machine Learning Models (Extended Report)

Andreas D. Kellas, Neophytos Christou, Wenxin Jiang et al. · Columbia University · Brown University +4 more

Defends against malicious pickle-based ML models on Hugging Face via static analysis and dynamic policy enforcement at load time

AI Supply Chain Attacks
defense · arXiv · Aug 21, 2025

Mini-Batch Robustness Verification of Deep Neural Networks

Saar Tzour-Shaday, Dana Drachsler-Cohen · Technion

Batched formal verifier BaVerLy certifies adversarial robustness over sets of ε-balls 2.3x faster by grouping similar network computations

Input Manipulation Attack · vision
benchmark · arXiv · Aug 3, 2025

Benchmarking Adversarial Patch Selection and Location

Shai Kimhi, Avi Mendlson, Moshe Kimhi · Technion

Spatially exhaustive benchmark of adversarial patch placement (150M+ forward passes) reveals hot-spots and enables a gradient-free attack heuristic that boosts attack success rate (ASR) by 8–13 percentage points

Input Manipulation Attack · vision
Code available