Latest papers

10 papers
defense arXiv Jan 9, 2026 · Jan 2026

Why LoRA Fails to Forget: Regularized Low-Rank Adaptation Against Backdoors in Language Models

Hoang-Chau Luong, Lingwei Chen · Rochester Institute of Technology

Defends LLMs against backdoor retention during LoRA fine-tuning via spectral analysis and regularized low-rank adaptation

Model Poisoning Transfer Learning Attack nlp
1 citation PDF
defense International Conference on Ap... Jan 3, 2026 · Jan 2026

dataRLsec: Safety, Security, and Reliability With Robust Offline Reinforcement Learning for DPAs

Shriram KS Pandian, Naresh Kshetri · Rochester Institute of Technology

Proposes robust offline RL defense using density-ratio weighted behavioral cloning against data poisoning attacks in MuJoCo environments

Data Poisoning Attack reinforcement-learning federated-learning
PDF
defense arXiv Dec 3, 2025 · Dec 2025

Open Set Face Forgery Detection via Dual-Level Evidence Collection

Zhongyi Cai, Bryce Gernon, Wentao Bao et al. · Michigan State University · Rochester Institute of Technology

Proposes dual-level evidential uncertainty estimation to detect novel, unseen face forgery categories in open-set settings

Output Integrity Attack vision
PDF
attack arXiv Nov 16, 2025 · Nov 2025

ToxSearch: Evolving Prompts for Toxicity Search in Large Language Models

Onkar Shelar, Travis Desell · Rochester Institute of Technology

Evolutionary black-box framework that mutates prompts via lexical and semantic operators to elicit toxic LLM outputs and tests cross-model transfer

Prompt Injection nlp
PDF
defense arXiv Nov 12, 2025 · Nov 2025

Robust Watermarking on Gradient Boosting Decision Trees

Jun Woo Chung, Yingjie Lao, Weijie Zhao · Rochester Institute of Technology · Tufts University

Embeds robust ownership watermarks into GBDT models via in-place fine-tuning across four injection strategies

Model Theft tabular
PDF Code
benchmark arXiv Oct 15, 2025 · Oct 2025

Towards Adversarial Robustness and Uncertainty Quantification in DINOv2-based Few-Shot Anomaly Detection

Akib Mohammed Khan, Bartosz Krawczyk · Rochester Institute of Technology

Benchmarks FGSM adversarial vulnerability of DINOv2 anomaly detectors; Platt-scaled uncertainty flags attacks via elevated predictive entropy

Input Manipulation Attack vision
PDF
attack arXiv Oct 7, 2025 · Oct 2025

Geometry-Aware Backdoor Attacks: Leveraging Curvature in Hyperbolic Embeddings

Ali Baheri · Rochester Institute of Technology

Backdoor attack exploiting hyperbolic geometry's boundary-driven asymmetry to evade standard detectors in non-Euclidean models

Model Poisoning graph
PDF
defense arXiv Oct 5, 2025 · Oct 2025

Concept-Based Masking: A Patch-Agnostic Defense Against Adversarial Patch Attacks

Ayushi Mehrotra, Derek Peng, Dipkamal Bhusal et al. · California Institute of Technology · University of California +1 more

Defends against adversarial patches by masking top concept activation vectors, requiring no prior knowledge of patch size or location

Input Manipulation Attack vision
PDF Code
defense arXiv Oct 1, 2025 · Oct 2025

Density-Ratio Weighted Behavioral Cloning: Learning Control Policies from Corrupted Datasets

Shriram Karpoora Sundara Pandian, Ali Baheri · Rochester Institute of Technology

Defends offline RL behavioral cloning from poisoned training datasets using discriminator-based trajectory reweighting without knowing the attack type

Data Poisoning Attack reinforcement-learning
PDF
attack arXiv Sep 18, 2025 · Sep 2025

Discrete optimal transport is a strong audio adversarial attack

Anton Selitskiy, Akib Shahriyar, Jishnuraj Prakasan · University of Rochester · Rochester Institute of Technology

Attacks audio anti-spoofing ML classifiers via optimal transport distributional alignment without gradient access

Input Manipulation Attack audio generative
PDF