Rui Abreu

h-index: 1 · 2 citations · 3 papers (total)

Papers in Database (1)

defense · arXiv · Oct 1, 2025

Microsaccade-Inspired Probing: Positional Encoding Perturbations Reveal LLM Misbehaviours

Rui Melo, Rui Abreu, Corina S. Pasareanu · Carnegie Mellon University · FEUP +1 more

Positional encoding perturbations probe LLM internals to detect safety violations, toxicity, and backdoor attacks without fine-tuning

Model Poisoning · Prompt Injection · nlp
1 citation