Hamed Haddadi

h-index: 2 · 25 citations · 11 papers (total)

Papers in Database (2)

defense · arXiv · Sep 27, 2025

GuardNet: Graph-Attention Filtering for Jailbreak Defense in Large Language Models

Javad Forough, Mohammad Maheri, Hamed Haddadi · Imperial College London

GNN-based hierarchical filter detects and localizes jailbreak prompts in LLMs, achieving 99.8% F1 on LLM-Fuzzer

Prompt Injection · nlp · graph
1 citation

defense · arXiv · Nov 29, 2025

Teleportation-Based Defenses for Privacy in Approximate Machine Unlearning

Mohammad M Maheri, Xavier Cadet, Peter Chin et al. · Imperial College London · Dartmouth College

Proposes WARP teleportation defense that obfuscates unlearning signals, resisting membership inference and data reconstruction attacks

Membership Inference Attack · Model Inversion Attack · vision