On Measuring Unnoticeability of Graph Adversarial Attacks: Observations, New Measure, and Applications

Hyeonsoo Jo , Hyunjin Hwang , Fanchen Bu , Soo Yong Lee , Chanyoung Park , Kijung Shin

Published on arXiv (2501.05015)

Input Manipulation Attack

OWASP ML Top 10 — ML01

Key Finding

LEO outperforms 11 competing edge-scoring methods at distinguishing adversarial from benign edges under 5 attack methods. Moreover, existing noticeability measures can be trivially bypassed by an adaptive attacker, while HideNSeek remains robust.

HideNSeek (LEO)

Novel technique introduced


Adversarial attacks are allegedly unnoticeable. Prior studies have designed attack noticeability measures on graphs, primarily using statistical tests to compare the topology of original and (possibly) attacked graphs. However, we observe two critical limitations in the existing measures. First, because the measures rely on simple rules, attackers can readily enhance their attacks to bypass them, reducing their attack "noticeability" while maintaining their attack performance. Second, because the measures naively leverage global statistics, such as degree distributions, they may entirely overlook attacks until severe perturbations occur, letting the attacks be almost "totally unnoticeable." To address the limitations, we introduce HideNSeek, a learnable measure for graph attack noticeability. First, to mitigate the bypass problem, HideNSeek learns to distinguish the original and (potential) attack edges using a learnable edge scorer (LEO), which scores each edge on its likelihood of being an attack. Second, to mitigate the overlooking problem, HideNSeek conducts imbalance-aware aggregation of all the edge scores to obtain the final noticeability score. Using six real-world graphs, we empirically demonstrate that HideNSeek effectively alleviates the observed limitations, and LEO (i.e., our learnable edge scorer) outperforms eleven competitors in distinguishing attack edges under five different attack methods. For an additional application, we show that LEO boosts the performance of robust GNNs by removing attack-like edges.


Key Contributions

  • Identifies two critical limitations of existing graph attack noticeability measures: they can be bypassed by simple attacker adaptations, and they overlook attacks by relying on naive global statistics until severe perturbations occur
  • Proposes HideNSeek, a learnable noticeability measure combining LEO (Learnable Edge Scorer) — which scores each edge on its likelihood of being adversarial — with imbalance-aware aggregation of edge scores into a final noticeability score
  • Demonstrates that LEO outperforms 11 competing edge classifiers under 5 different graph attack methods on 6 real-world graphs, and that it can serve as a pre-processing defense by removing attack-like edges to boost robust GNN performance
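The two-stage pipeline in the contributions above (per-edge scoring, then imbalance-aware aggregation) can be sketched as follows. This is a minimal illustrative sketch, not the paper's code: `score_edge` is a stand-in for a trained LEO model, and the top-k mean used for aggregation is an assumption chosen only to show why plain averaging fails when attack edges are a small minority.

```python
# Illustrative sketch of the HideNSeek pipeline (assumptions, not the paper's API).

def score_edge(edge):
    """Placeholder for LEO: likelihood that `edge` is adversarial.
    Here we read a stored score; a real scorer would be learned from the graph."""
    return edge["score"]

def hide_n_seek(edges, k=5):
    """Aggregate per-edge scores into one noticeability score.
    Averaging only the k highest scores keeps a handful of high-scoring
    (likely adversarial) edges from being washed out by the benign majority,
    which is the imbalance problem the paper's aggregation addresses."""
    scores = sorted((score_edge(e) for e in edges), reverse=True)
    top = scores[:k]
    return sum(top) / len(top)

# Toy graph: 95 benign edges (low scores) plus 5 attack edges (high scores).
edges = [{"score": 0.05} for _ in range(95)] + [{"score": 0.9} for _ in range(5)]
noticeability = hide_n_seek(edges, k=5)        # stays high despite imbalance
naive_mean = sum(e["score"] for e in edges) / len(edges)  # diluted by benign edges
```

On this toy graph the naive global mean is dragged down by the 95 benign edges, while the imbalance-aware aggregate remains high, mirroring the "overlooking" failure mode of global-statistics measures.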

🛡️ Threat Analysis

Input Manipulation Attack

Directly targets adversarial attacks on graph neural networks: structural perturbations (edge additions/removals) crafted at inference time to evade detection while degrading GNN performance. The paper's core contribution, HideNSeek/LEO, detects these adversarial graph perturbations, and its defense application removes attack-like edges to protect GNNs.
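The defense application described above amounts to a pre-processing filter: score each edge and drop the ones that look adversarial before the graph reaches the GNN. A minimal sketch, where the scorer, the tuple layout, and the 0.5 threshold are all illustrative assumptions rather than the paper's settings:

```python
# Illustrative pre-processing defense (assumed names/threshold, not the paper's code).

def filter_attack_edges(edges, scorer, threshold=0.5):
    """Keep only edges the scorer deems likely benign.
    The surviving edge list would then be passed to a downstream GNN."""
    return [e for e in edges if scorer(e) < threshold]

# Toy edge list as (src, dst, attack_likelihood); the score stands in for LEO.
edges = [("a", "b", 0.1), ("b", "c", 0.95), ("c", "d", 0.2)]
clean = filter_attack_edges(edges, scorer=lambda e: e[2], threshold=0.5)
# ("b", "c") is dropped as attack-like; the remaining edges feed the GNN.
```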


Details

Domains
graph
Model Types
gnn
Threat Tags
inference_time, digital
Datasets
six real-world graphs
Applications
node classification, graph neural networks