Latest papers

2 papers
defense · arXiv · Aug 27, 2025

Data Cartography for Detecting Memorization Hotspots and Guiding Data Interventions in Generative Models

Laksh Patel, Neel Shanbhag · Illinois Mathematics and Science Academy

Defends generative models against training data extraction by scoring and pruning high-memorization examples at pretraining time

Model Inversion Attack · Sensitive Information Disclosure · nlp · generative
benchmark · ICDMW · Aug 11, 2025

Signature vs. Substance: Evaluating the Balance of Adversarial Resistance and Linguistic Quality in Watermarking Large Language Models

William Guo, Adaku Uchendu, Ana Smith · Illinois Mathematics and Science Academy · MIT Lincoln Laboratory

Benchmarks LLM text watermark robustness against paraphrasing and back-translation attacks using linguistic quality metrics

Output Integrity Attack · nlp