Rogerio Abreu de Paula

h-index: 3 · 102 citations · 6 papers (total)

Papers in Database (1)

benchmark · arXiv · Nov 11, 2025

A methodological analysis of prompt perturbations and their effect on attack success rates

Tiago Machado, Maysa Malfiza Garcia de Macedo, Rogerio Abreu de Paula et al. · IBM Research

Statistically analyzes how prompt perturbations shift jailbreak attack success rate (ASR) across SFT-, DPO-, and RLHF-aligned LLMs, exposing gaps in benchmark evaluation.

Prompt Injection nlp