ML Security Papers

Latest papers

226 papers

benchmark arXiv Apr 26, 2026 · 25d ago

LLM-CEG: Extending the Classification Error Gauge Framework for Privacy Auditing of Large Language Models

Kato Mivule · Bowie State University

Privacy auditing framework for LLMs measuring membership inference attack resistance and utility trade-offs under differential privacy

Membership Inference Attack Sensitive Information Disclosure nlp

PDF

attack arXiv Apr 23, 2026 · 28d ago

Toward Efficient Membership Inference Attacks against Federated Large Language Models: A Projection Residual Approach

Guilin Deng, Silong Chen, Yuchuan Luo et al. · National University of Defense Technology · City University of Hong Kong +1 more

Gradient-based membership inference attack on federated LLMs achieving near-perfect accuracy via projection residual analysis

Membership Inference Attack nlp federated-learning

PDF Code

defense arXiv Apr 22, 2026 · 29d ago

Adaptive Defense Orchestration for RAG: A Sentinel-Strategist Architecture against Multi-Vector Attacks

Pranav Pallerla, Wilson Naik Bhukya, Bharath Vemula et al. · University of Hyderabad · Purdue University

Adaptive defense orchestration for RAG systems that selectively activates protections based on query risk, reducing utility cost while defending against membership inference and data poisoning

Membership Inference Attack Data Poisoning Attack Sensitive Information Disclosure nlp

PDF

attack arXiv Apr 21, 2026 · 4w ago

A Data-Free Membership Inference Attack on Federated Learning in Hardware Assurance

Gijung Lee, Wavid Bowman, Olivia P. Dizon-Paradis et al. · University of Florida

Data-free gradient inversion attack on federated learning that reconstructs hardware circuit images to infer sensitive IP characteristics

Model Inversion Attack Membership Inference Attack vision federated-learning

PDF

attack arXiv Apr 21, 2026 · 4w ago

A Dual Perspective on Synthetic Trajectory Generators: Utility Framework and Privacy Vulnerabilities

Aya Cherigui, Florent Guépin, Arnaud Legendre et al. · Orange Research · Université Marie et Louis Pasteur

Membership inference attack against synthetic trajectory generators, demonstrating privacy vulnerabilities despite resistance to trajectory user-linking

Membership Inference Attack tabular generative

PDF

benchmark arXiv Apr 21, 2026 · 4w ago

Detecting Data Contamination in Large Language Models

Juliusz Janicki, Savvas Chamezopoulos, Evangelos Kanoulas et al. · University of Amsterdam · Elsevier

Benchmarks black-box membership inference attacks on state-of-the-art LLMs, finding none reliably detect training data membership

Membership Inference Attack nlp

PDF

defense arXiv Apr 21, 2026 · 4w ago

Generalization and Membership Inference Attack a Practical Perspective

Fateme Rahmani, Mahdi Jafari Siavoshani, Mohammad Hossein Rohban · Sharif University of Technology

Empirical study showing advanced generalization techniques (augmentation, early stopping) reduce membership inference attack success by up to 100×

Membership Inference Attack vision

PDF

attack arXiv Apr 21, 2026 · 4w ago

DECIFR: Domain-Aware Exfiltration of Circuit Information from Federated Gradient Reconstruction

Gijung Lee, Wavid Bowman, Olivia P. Dizon-Paradis et al. · University of Florida

Membership inference attack on federated learning IC segmentation models using gradient inversion guided by standard cell library layouts

Membership Inference Attack Model Inversion Attack vision federated-learning

PDF

defense arXiv Apr 17, 2026 · 4w ago

CiPO: Counterfactual Unlearning for Large Reasoning Models through Iterative Preference Optimization

Junyi Li, Yongqiang Chen, Ningning Ding · The Hong Kong University of Science and Technology · The Chinese University of Hong Kong

Unlearns knowledge from reasoning model CoT traces via iterative preference optimization, evaluated against membership inference attacks

Membership Inference Attack nlp

PDF Code

attack arXiv Apr 14, 2026 · 5w ago

Evaluating Differential Privacy Against Membership Inference in Federated Learning: Insights from the NIST Genomics Red Team Challenge

Gustavo de Carvalho Bertoli

Stacking-based membership inference attack against FL models revealing DP limitations at ε=200 in genomic data

Membership Inference Attack federated-learning tabular

PDF

attack arXiv Apr 14, 2026 · 5w ago

CoLA: A Choice Leakage Attack Framework to Expose Privacy Risks in Subset Training

Qi Li, Cheng-Long Wang, Yinzhi Cao et al. · King Abdullah University of Science and Technology · National University of Singapore +1 more

Membership inference attacks on subset-trained models revealing both training membership and selection participation across data pipelines

Membership Inference Attack vision nlp

PDF

attack arXiv Apr 12, 2026 · 5w ago

Membership Inference Attacks Expose Participation Privacy in ECG Foundation Encoders

Ziyu Wang, Elahe Khatibi, Ankita Sharma et al. · University of California · Arizona State University +1 more

Audits membership inference attacks on ECG foundation encoders, finding participation leakage through embeddings and scores under realistic access models

Membership Inference Attack timeseries

PDF

Foundation-style ECG encoders pretrained with self-supervised learning are increasingly reused across tasks, institutions, and deployment contexts, often through model-as-a-service interfaces that expose scalar scores or latent representations. While such reuse improves data efficiency and generalization, it raises a participation privacy concern: can an adversary infer whether a specific individual or cohort contributed ECG data to pretraining, even when raw waveforms and diagnostic labels are never disclosed? In connected-health settings, training participation itself may reveal institutional affiliation, study enrollment, or sensitive health context. We present an implementation-grounded audit of membership inference attacks (MIAs) against modern self-supervised ECG foundation encoders, covering contrastive objectives (SimCLR, TS2Vec) and masked reconstruction objectives (CNN- and Transformer-based MAE). We evaluate three realistic attacker interfaces: (i) score-only black-box access to scalar outputs, (ii) adaptive learned attackers that aggregate subject-level statistics across repeated queries, and (iii) embedding-access attackers that probe latent representation geometry. Using a subject-centric protocol with window-to-subject aggregation and calibration at fixed false-positive rates under a cross-dataset auditing setting, we observe heterogeneous and objective-dependent participation leakage: leakage is most pronounced in small or institution-specific cohorts and, for contrastive encoders, can saturate in embedding space, while larger and more diverse datasets substantially attenuate operational tail risk. Overall, our results show that restricting access to raw signals or labels is insufficient to guarantee participation privacy, underscoring the need for deployment-aware auditing of reusable biosignal foundation encoders in connected-health systems.

cnn transformer · University of California · Arizona State University · California State University

PDF arXiv

defense arXiv Apr 9, 2026 · 6w ago

TADP-RME: A Trust-Adaptive Differential Privacy Framework for Enhancing Reliability of Data-Driven Systems

Labani Halder, Payel Sadhukhan, Sarbani Palit · Indian Statistical Institute · Army Institute of Management

Trust-adaptive differential privacy with geometric transformation to defend against inference attacks on ML training data

Model Inversion Attack Membership Inference Attack tabular

PDF

attack arXiv Apr 3, 2026 · 6w ago

Learning the Signature of Memorization in Autoregressive Language Models

David Ilić, Kostadin Cvejoski, David Stanojević et al. · JetBrains Research

Learned membership inference attack transferring across transformer, state-space, and recurrent LLM architectures via memorization signatures

Membership Inference Attack nlp

PDF Code

attack arXiv Apr 3, 2026 · 6w ago

A Unified Perspective on Adversarial Membership Manipulation in Vision Models

Ruize Gao, Kaiwen Zhou, Yongqiang Chen et al. · National University of Singapore · Knowin AI +2 more

Adversarial perturbations fool membership inference attacks by fabricating fake members; proposes gradient-based detection and robust inference defenses

Membership Inference Attack Input Manipulation Attack vision

PDF

defense arXiv Apr 2, 2026 · 7w ago

Combating Data Laundering in LLM Training

Muxing Li, Zesheng Ye, Sharon Li et al. · University of Melbourne · University of Wisconsin-Madison

Detects unauthorized LLM training data use even when original data has been laundered through style transformations

Membership Inference Attack Sensitive Information Disclosure nlp

PDF

attack arXiv Apr 1, 2026 · 7w ago

AutoMIA: Improved Baselines for Membership Inference Attack via Agentic Self-Exploration

Ruhao Liu, Weiqi Huang, Qi Li et al. · National University of Singapore

Agentic framework that automates membership inference attacks through self-exploration and strategy evolution, outperforming handcrafted baselines

Membership Inference Attack

PDF Code

attack arXiv Apr 1, 2026 · 7w ago

SERSEM: Selective Entropy-Weighted Scoring for Membership Inference in Code Language Models

Kıvanç Kuzey Dikici, Serdar Kara, Semih Çağlar et al. · Bilkent University

White-box membership inference attack on code LLMs using AST-weighted entropy scoring to detect memorized training data

Membership Inference Attack nlp

PDF

attack arXiv Apr 1, 2026 · 7w ago

G-Drift MIA: Membership Inference via Gradient-Induced Feature Drift in LLMs

Ravi Ranjan, Utkarsh Grover, Xiaomin Lin et al. · Florida International University · University of South Florida

White-box membership inference attack using gradient-induced feature drift, outperforming confidence-based and reference-based MIAs on LLMs

Membership Inference Attack nlp

PDF

attack arXiv Mar 30, 2026 · 7w ago

ReproMIA: A Comprehensive Analysis of Model Reprogramming for Proactive Membership Inference Attacks

Chihan Huang, Huaijin Wang, Shuai Wang · HKUST

Novel membership inference attack using model reprogramming to amplify privacy leakage signals across LLMs, diffusion models, and classifiers

Membership Inference Attack nlp vision generative

PDF
