Amitava Das

Papers in Database (2)

defense · arXiv · Aug 4, 2025

TRACEALIGN -- Tracing the Drift: Attributing Alignment Failures to Training-Time Belief Sources in LLMs

Amitava Das, Vinija Jain, Aman Chadha · BITS Pilani · Meta AI +1 more

Traces LLM alignment failures to training corpus sources and defends against jailbreaks via inference filters, DPO regularization, and provenance-aware decoding

Prompt Injection · nlp
attack · arXiv · Apr 23, 2026

PermaFrost-Attack: Stealth Pretraining Seeding (SPS) for planting Logic Landmines During LLM Training

Harsh Kumar, Rahul Maity, Tanmay Joshi et al. · Manipal University Jaipur · National Institute of Technology Karnataka +3 more

Web-scale poisoning attack planting dormant backdoor triggers in LLM pretraining corpora via stealth websites indexed by Common Crawl

Data Poisoning Attack · Model Poisoning · AI Supply Chain Attacks · Training Data Poisoning · nlp