Latest papers

4 papers
benchmark arXiv Feb 11, 2026

Generative clinical time series models trained on moderate amounts of patient data are privacy preserving

Rustam Zhumagambetov, Niklas Giesa, Sebastian D. Boie et al. · Physikalisch-Technische Bundesanstalt (PTB) · Charité – Universitätsmedizin Berlin +1 more

Audits generative clinical time series models with membership inference and reconstruction attacks, finding that large training sets confer natural privacy protection

Membership Inference Attack · Model Inversion Attack · time-series
PDF
benchmark arXiv Feb 10, 2026

Stop Testing Attacks, Start Diagnosing Defenses: The Four-Checkpoint Framework Reveals Where LLM Safety Breaks

Hayfa Dhahbi, Kashyap Thimmaraju · Technische Universität Berlin

Proposes the Four-Checkpoint Framework and the WASR metric to diagnose which LLM safety layers break under 13 prompt-level jailbreak techniques

Prompt Injection · NLP
PDF
defense arXiv Aug 28, 2025

Towards Mechanistic Defenses Against Typographic Attacks in CLIP

Lorenz Hufe, Constantin Venhoff, Erblina Purelku et al. · Fraunhofer Heinrich Hertz Institute · University of Oxford +2 more

Defends CLIP against typographic image-text attacks via gradient-free attention-head ablation, improving robustness by 22% with <1% accuracy loss

Input Manipulation Attack · Prompt Injection · vision/multimodal
PDF Code
defense arXiv Aug 18, 2025

Beyond Trade-offs: A Unified Framework for Privacy, Robustness, and Communication Efficiency in Federated Learning

Yue Xia, Tayyebeh Jahani-Nezhad, Rawad Bitar · Technical University of Munich · Technische Universität Berlin

Defends federated learning against Byzantine clients using Johnson–Lindenstrauss (JL) compression-compatible robust aggregation with differential privacy guarantees

Data Poisoning Attack · federated-learning
PDF