Richard J. Young

h-index: 1 · 2 citations · 4 papers (total)

Papers in Database (3)

benchmark · arXiv · Dec 15, 2025

Comparative Analysis of LLM Abliteration Methods: A Cross-Architecture Evaluation

Richard J. Young · University of Nevada Las Vegas

Benchmarks four LLM abliteration tools across 16 models, quantifying safety bypass effectiveness and capability preservation tradeoffs

Prompt Injection · nlp
2 citations · PDF
benchmark · arXiv · Nov 27, 2025

Evaluating the Robustness of Large Language Model Safety Guardrails Against Adversarial Attacks

Richard J. Young · University of Nevada Las Vegas

Benchmarks 10 LLM safety guardrails against 21 jailbreak categories, exposing benchmark contamination and a novel "helpful mode" bypass

Prompt Injection · nlp
PDF
benchmark · arXiv · Dec 8, 2025

Replicating TEMPEST at Scale: Multi-Turn Adversarial Attacks Against Trillion-Parameter Frontier Models

Richard Young · University of Nevada Las Vegas

Multi-turn TEMPEST jailbreaks achieve a 96–100% attack success rate (ASR) on six frontier LLMs; extended reasoning mode halves attack success

Prompt Injection · nlp
PDF