Latest papers

10 papers
attack arXiv Apr 1, 2026 · 7d ago

G-Drift MIA: Membership Inference via Gradient-Induced Feature Drift in LLMs

Ravi Ranjan, Utkarsh Grover, Xiaomin Lin et al. · Florida International University · University of South Florida

White-box membership inference attack using gradient-induced feature drift, outperforming confidence-based and reference-based MIAs on LLMs

Membership Inference Attack nlp
PDF
defense arXiv Mar 15, 2026 · 24d ago

Relationship-Aware Safety Unlearning for Multimodal LLMs

Vishnu Narayanan Anilkumar, Abhijith Sreesylesh Babu, Trieu Hai Vo et al. · Florida International University

Unlearns unsafe object-relation-object tuples in multimodal LLMs using LoRA while preserving safe contexts and benign uses

Prompt Injection multimodal nlp
PDF
attack arXiv Jan 27, 2026 · 10w ago

What Hard Tokens Reveal: Exploiting Low-confidence Tokens for Membership Inference Attacks against Large Language Models

Md Tasnim Jawad, Mingyan Xiao, Yanzhao Wu · Florida International University · California State Polytechnic University

Novel token-level MIA on LLMs that exploits hard-token probability gaps between fine-tuned and reference models, outperforming 7 baselines

Membership Inference Attack nlp
PDF
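The hard-token idea above can be sketched in a few lines. This is an illustrative stand-in, not the paper's method: the function name, threshold, and scoring rule are assumptions. It scores a sequence by comparing per-token probabilities from a fine-tuned model against a reference model, keeping only tokens the reference model finds hard (low-confidence), where memorization by the fine-tuned model is most visible.

```python
# Illustrative sketch (assumed names/threshold, not the paper's exact method):
# membership score from probability gaps on "hard" (low-confidence) tokens.

def hard_token_mia_score(p_finetuned, p_reference, hard_threshold=0.1):
    """p_finetuned / p_reference: per-token probabilities the two models
    assign to the same token sequence. Tokens the reference model rates
    below hard_threshold are the informative ones: a fine-tuned model that
    memorized the sample assigns them unusually high probability."""
    gaps = [pf - pr
            for pf, pr in zip(p_finetuned, p_reference)
            if pr < hard_threshold]
    if not gaps:
        return 0.0
    return sum(gaps) / len(gaps)  # large positive gap -> likely a member

# Toy example: on a training member, the fine-tuned model is confident
# even on tokens the reference model finds hard.
member_score = hard_token_mia_score([0.9, 0.8, 0.95], [0.05, 0.4, 0.02])
non_member_score = hard_token_mia_score([0.06, 0.5, 0.03], [0.05, 0.4, 0.02])
print(member_score > non_member_score)  # True
```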
defense arXiv Jan 8, 2026 · Jan 2026

Multi-turn Jailbreaking Attack in Multi-Modal Large Language Models

Badhan Chandra Das, Md Tasnim Jawad, Joaquin Molto et al. · Florida International University

Proposes multi-turn jailbreaking attacks on MLLMs and a FragGuard defense that mitigates them without fine-tuning

Prompt Injection nlp multimodal
1 citation PDF
attack arXiv Nov 17, 2025 · Nov 2025

Jailbreaking Large Vision Language Models in Intelligent Transportation Systems

Badhan Chandra Das, Md Tasnim Jawad, Md Jueal Mia et al. · Florida International University

Jailbreaks LVLMs in transportation contexts using image typography tricks and multi-turn prompting, plus a filtering-based defense

Prompt Injection multimodal vision nlp
PDF
defense arXiv Nov 13, 2025 · Nov 2025

CertMask: Certifiable Defense Against Adversarial Patches via Theoretically Optimal Mask Coverage

Xuntao Lyu, Ching-Chi Lin, Abdullah Al Arafat et al. · North Carolina State University · Technische Universität Dortmund +2 more

Certified defense against adversarial patches using k-fold mask coverage, cutting inference cost from O(n²) to O(n) while improving certified accuracy by +13.4%

Input Manipulation Attack vision
PDF
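The certification logic behind mask-based patch defenses like the one above can be sketched briefly. This is a generic illustration, not CertMask itself (the function name and voting fallback are assumptions): the image is classified once per mask, with the mask set chosen so every possible patch location is fully covered by at least one mask; if all masked predictions agree, the label is certified, because whichever mask blanks out the patch still yields that label.

```python
# Illustrative sketch (not CertMask itself): certification by agreement
# across masked predictions, with a majority-vote fallback when uncertified.

def certify_by_masks(masked_predictions):
    """masked_predictions: class labels from the model, one per mask.
    Returns (label, certified); certified=True means no patch confined
    to a single covered region could have flipped the prediction."""
    labels = set(masked_predictions)
    if len(labels) == 1:
        return masked_predictions[0], True
    # Disagreement: fall back to a majority vote, without a certificate.
    majority = max(labels, key=masked_predictions.count)
    return majority, False

print(certify_by_masks(["cat", "cat", "cat"]))  # ('cat', True)
print(certify_by_masks(["cat", "dog", "cat"]))  # ('cat', False)
```

The paper's contribution is in choosing the mask set: k-fold coverage lets each image be evaluated under O(n) masks instead of O(n²) mask pairs while keeping this certificate valid.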
defense arXiv Oct 27, 2025 · Oct 2025

PRO: Enabling Precise and Robust Text Watermark for Open-Source LLMs

Jiaqi Xue, Yifei Zhao, Mansour Al Ghanim et al. · University of Central Florida · Florida State University +1 more

Embeds robust text watermarks into open-source LLM weights to detect AI-generated content even after fine-tuning or model merging

Output Integrity Attack nlp
PDF
defense First International Conference... Oct 22, 2025 · Oct 2025

SecureInfer: Heterogeneous TEE-GPU Architecture for Privacy-Critical Tensors for Large Language Model Deployment

Tushar Nayan, Ziqi Zhang, Ruimin Sun · Florida International University · University of Illinois Urbana-Champaign

Defends LLM weights from extraction attacks by isolating security-critical layers in SGX enclaves while offloading matrix ops to GPU

Model Theft nlp
1 citation PDF
attack arXiv Sep 24, 2025 · Sep 2025

JaiLIP: Jailbreaking Vision-Language Models via Loss Guided Image Perturbation

Md Jueal Mia, M. Hadi Amini · Florida International University

Gradient-optimized adversarial image perturbations that jailbreak VLMs by jointly minimizing MSE and harmful-output loss

Input Manipulation Attack Prompt Injection vision multimodal nlp
PDF
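The joint objective described above can be sketched with a toy gradient loop. This is a sketch under stated assumptions, not JaiLIP's implementation: in the real attack the second term is the VLM's loss on a harmful target response and gradients flow through the model, whereas here both terms are simple differentiable stand-ins so the loop runs on its own.

```python
# Illustrative sketch (assumed names and toy losses, not JaiLIP's code):
# jointly minimize an MSE term (stay close to the clean image) and a
# surrogate "harmful-output" loss, here (sum(x) - target)^2.

def joint_loss(x, x0, target=5.0, alpha=1.0):
    mse = sum((a - b) ** 2 for a, b in zip(x, x0)) / len(x0)
    return alpha * mse + (sum(x) - target) ** 2

def optimize_perturbation(x0, target=5.0, alpha=1.0, lr=0.01, steps=200):
    n = len(x0)
    x = list(x0)
    for _ in range(steps):
        s = sum(x)
        # Analytic gradient of: alpha * MSE(x, x0) + (sum(x) - target)^2
        grads = [alpha * 2.0 * (x[i] - x0[i]) / n + 2.0 * (s - target)
                 for i in range(n)]
        x = [x[i] - lr * g for i, g in enumerate(grads)]
    return x

clean = [0.0, 0.0, 0.0, 0.0]           # stand-in for the clean image
adv = optimize_perturbation(clean)      # perturbed version
print(joint_loss(adv, clean) < joint_loss(clean, clean))  # True: loss reduced
```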
attack arXiv Sep 7, 2025 · Sep 2025

Uncovering the Vulnerability of Large Language Models in the Financial Domain via Risk Concealment

Gang Cheng, Haibo Jin, Wenbin Zhang et al. · University of Illinois Urbana-Champaign · Florida International University +1 more

Multi-turn jailbreak attack conceals financial regulatory risks across turns to bypass LLM safety filters, achieving 93% average ASR

Prompt Injection nlp
PDF Code