Suhang Wang

benchmark arXiv Feb 28, 2026

A Comprehensive Evaluation of LLM Unlearning Robustness under Multi-Turn Interaction

Ruihao Pan, Suhang Wang · Pennsylvania State University

Shows LLM unlearning fails under multi-turn interaction; self-correction and dialogue history recover supposedly forgotten hazardous or private knowledge

Prompt Injection · Sensitive Information Disclosure · nlp

tool arXiv Aug 15, 2025

SpecDetect: Simple, Fast, and Training-Free Detection of LLM-Generated Text via Spectral Analysis

Haitong Luo, Weiyao Zhang, Suhang Wang et al. · Chinese Academy of Sciences · University of Chinese Academy of Sciences +3 more

Detects LLM-generated text via spectral energy of token log-probability sequences using DFT/STFT, outperforming SOTA at half the runtime

Output Integrity Attack · nlp
