ML Security Papers

benchmark · arXiv · Feb 23, 2026

Red-Teaming Claude Opus and ChatGPT-based Security Advisors for Trusted Execution Environments

Kunal Mukherjee · Virginia Tech

Red-teams Claude Opus and ChatGPT as TEE security advisors, finding transferable prompt-induced failures and proposing an evaluation benchmark.

Prompt Injection · Benchmarks & Evaluation · Triage & Prioritization · nlp

1 citation

tool · arXiv · Nov 24, 2025

AttackPilot: Autonomous Inference Attacks Against ML Services With LLM-Based Agents

Yixin Wu, Rui Wen, Chi Cui et al. · CISPA Helmholtz Center for Information Security · Institute of Science Tokyo

Autonomous LLM agent automates membership inference, model stealing, and data reconstruction attacks on ML services with near-expert accuracy at $0.627/run.

Membership Inference Attack · Model Theft · Model Inversion Attack · Red-Team Agents · Triage & Prioritization · nlp
