ML Security Papers

Latest papers

1 paper

tool · arXiv · Nov 16, 2025

SGuard-v1: Safety Guardrail for Large Language Models

JoonHo Lee, HyeonMin Cho, Jaewoong Yun et al. · Samsung SDS

Deploys a dual-component LLM guardrail that covers 60 jailbreak attack types and harmful content, using a lightweight 2B-parameter model

Prompt Injection nlp

1 citation · PDF · Code