Latest papers

3 papers
survey arXiv Feb 6, 2026

Trojans in Artificial Intelligence (TrojAI) Final Report

Kristopher W. Reese, Taylor Kulp-McDowall, Michael Majurski et al. · IARPA · NIST +13 more

Surveys findings from the multi-year IARPA TrojAI program on detecting AI backdoors via weight analysis and trigger inversion

Model Poisoning · vision · nlp
PDF
defense arXiv Sep 17, 2025

Privacy Preserving In-Context-Learning Framework for Large Language Models

Bishnu Bhusal, Manoj Acharya, Ramneet Kaur et al. · University of Missouri · SRI International

Defends private in-context learning by applying differential privacy to aggregated token distributions, preventing adversarial extraction of sensitive prompt data

Sensitive Information Disclosure · nlp
PDF Code
survey arXiv Sep 12, 2025

LLM in the Middle: A Systematic Review of Threats and Mitigations to Real-World LLM-based Systems

Vitor Hugo Galhardo Moia, Igor Jochem Sanz, Gabriel Antonio Fontes Rebello et al. · Instituto de Pesquisas Eldorado · SRI International

Systematic survey of threats and defenses across the full LLM-based system lifecycle, from training to deployment

Data Poisoning Attack · AI Supply Chain Attacks · Prompt Injection · Sensitive Information Disclosure · Insecure Plugin Design · nlp
PDF