ML Security Papers

Latest papers

1 paper

benchmark · arXiv · Dec 30, 2025

The Silicon Psyche: Anthropomorphic Vulnerabilities in Large Language Models

Giuseppe Canale, Kashyap Thimmaraju · CPF3.org · Flowguard Institute

Proposes a benchmark framework that exposes LLMs to human-style social engineering attacks based on authority, urgency, and social-proof manipulation

Prompt Injection · Excessive Agency · nlp