Olivera Kotevska

Papers in Database (2)

attack arXiv Mar 19, 2026

Automated Membership Inference Attacks: Discovering MIA Signal Computations using LLM Agents

Toan Tran, Olivera Kotevska, Li Xiong · Emory University · Oak Ridge National Laboratory

An LLM-agent framework that automatically discovers novel membership inference attack (MIA) strategies, achieving a 0.18 AUC improvement over existing MIAs.

Membership Inference Attack
defense arXiv Apr 1, 2026

SelfGrader: Stable Jailbreak Detection for Large Language Models using Token-Level Logits

Zikai Zhang, Rui Hu, Olivera Kotevska et al. · University of Nevada · Oak Ridge National Laboratory

Detects LLM jailbreak attacks using logit distributions over numerical tokens, achieving a 22.66% reduction in attack success rate (ASR) with minimal overhead.

Prompt Injection nlp