attack arXiv Sep 25, 2025 · Sep 2025
Dario Loi, Matteo Silvestri, Fabrizio Silvestri et al. · Sapienza University of Rome
DRL-based graph evasion attack injects proxy nodes to hide community membership from overlapping graph detectors
Input Manipulation Attack graph
Protecting privacy in social graphs requires preventing sensitive information, such as community affiliations, from being inferred by graph analysis, without substantially altering the graph topology. We address this through the problem of community membership hiding (CMH), which seeks edge modifications that cause a target node to exit its original community, regardless of the detection algorithm employed. Prior work has focused on non-overlapping community detection, where trivial strategies often suffice, but real-world graphs are better modeled by overlapping communities, where such strategies fail. To the best of our knowledge, we are the first to formalize and address CMH in this setting. In this work, we propose a deep reinforcement learning (DRL) approach that learns effective modification policies, including the use of proxy nodes, while preserving graph structure. Experiments on real-world datasets show that our method significantly outperforms existing baselines in both effectiveness and efficiency, offering a principled tool for privacy-preserving graph modification with overlapping communities.
rl gnn Sapienza University of Rome
defense arXiv Nov 24, 2025 · Nov 2025
Mostafa Mozafari, Farooq Ahmad Wani, Maria Sofia Bucarelli et al. · Sapienza University of Rome
Removes backdoor triggers and label-noise poisoning post-training via task arithmetic weight subtraction without original training data
Model Poisoning Data Poisoning Attack vision
Corrupted training data are ubiquitous. Corrective Machine Unlearning (CMU) seeks to remove the influence of such corruption post-training. Prior CMU typically assumes access to identified corrupted training samples (a "forget set"). However, in many real-world scenarios the training data are no longer accessible. We formalize source-free CMU, where the original training data are unavailable and, consequently, no forget set of identified corrupted training samples can be specified. Instead, we assume a small proxy (surrogate) set of corrupted samples that reflect the suspected corruption type without needing to be the original training samples. In this stricter setting, methods relying on a forget set are ineffective or narrow in scope. We introduce Corrective Unlearning in Task Space (CUTS), a lightweight weight-space correction method guided by the proxy set using task arithmetic principles. CUTS treats the clean signal and the corruption signal as distinct tasks. Specifically, we briefly fine-tune the corrupted model on the proxy set to amplify the corruption mechanism in the weight space, compute the difference between the corrupted and fine-tuned weights as a proxy task vector, and subtract a calibrated multiple of this vector to cancel the corruption. Without access to clean data or a forget set, CUTS recovers a large fraction of the lost utility under label noise and, for backdoor triggers, nearly eliminates the attack with minimal damage to utility, outperforming state-of-the-art specialized CMU methods in the source-free setting.
transformer Sapienza University of Rome
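The weight-space correction described in the CUTS abstract above can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the dict-of-lists weight representation, the sign convention (task vector = fine-tuned minus corrupted), and the scaling parameter `alpha` are all assumptions made for illustration. The abstract does not say how the multiple is calibrated, so `alpha` is simply passed in.

```python
from typing import Dict, List

# Hypothetical weight representation: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def proxy_task_vector(corrupted: Weights, finetuned: Weights) -> Weights:
    """Direction in weight space along which proxy fine-tuning moved the model.

    Assumed sign convention: fine-tuned weights minus corrupted weights.
    """
    return {
        name: [f - c for c, f in zip(corrupted[name], finetuned[name])]
        for name in corrupted
    }


def subtract_task_vector(corrupted: Weights, task_vector: Weights, alpha: float) -> Weights:
    """Subtract alpha times the proxy task vector from the corrupted weights.

    alpha stands in for the 'calibrated multiple'; its selection is not modeled here.
    """
    return {
        name: [c - alpha * t for c, t in zip(corrupted[name], task_vector[name])]
        for name in corrupted
    }


# Toy example: fine-tuning on the proxy set moved only the first coordinate.
corrupted = {"w": [1.0, 2.0]}
finetuned = {"w": [1.5, 2.0]}

tau = proxy_task_vector(corrupted, finetuned)
corrected = subtract_task_vector(corrupted, tau, alpha=2.0)
```

In this toy case only the coordinate that fine-tuning amplified is changed, while the untouched coordinate keeps its value, which is the intuition behind treating the corruption as a separable task direction.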