Zhiyang Chen

Papers in Database (1)

tool · arXiv · Sep 2, 2025

Scam2Prompt: A Scalable Framework for Auditing Malicious Scam Endpoints in Production LLMs

Zhiyang Chen, Tara Saba, Xun Deng et al. · University of Toronto

Auditing framework exposes production LLMs reproducing memorized scam URLs via innocuous prompts, with guardrails detecting under 0.3% of cases

Data Poisoning Attack · Training Data Poisoning · Sensitive Information Disclosure · nlp