ML Security Papers

Latest papers

2 papers

defense · arXiv · Jan 22, 2026

Joan Vendrell Farreny, Martí Jordà Roca, Miquel Cornudella Gaya et al. · NeuralTrust

Proposes a unified LLM security enforcement layer, analogous to a web application firewall (WAF), that covers prompt injection, jailbreaks, and agent tool abuse

Prompt Injection · Insecure Plugin Design · nlp

attack · arXiv · Jan 9, 2026

Ahmad Alobaid, Martí Jordà Roca, Carlos Castillo et al. · NeuralTrust · ICREA +1 more

Proposes Echo Chamber, a multi-turn LLM jailbreak that escalates gradually from poisonous seeds to bypass safety guardrails

Prompt Injection · nlp

1 citation