Abstract
Graph Neural Networks (GNNs) are models that leverage the graph structure to
transmit information between nodes, typically through the message-passing
operation. While widely successful, this approach is well known to suffer from
the over-smoothing and over-squashing phenomena, which result in
representational collapse as the number of layers increases and insensitivity
to the information contained at distant and poorly connected nodes,
respectively. In this paper, we present a unified view of these problems
through the lens of vanishing gradients, using ideas from linear control theory
for our analysis. We propose an interpretation of GNNs as recurrent models and
empirically demonstrate that a simple state-space formulation of a GNN
effectively alleviates over-smoothing and over-squashing at no extra trainable
parameter cost. Further, we show theoretically and empirically that (i) GNNs
are by design prone to extreme gradient vanishing even after a few layers; (ii)
Over-smoothing is directly related to the mechanism causing vanishing
gradients; (iii) Over-squashing is most easily alleviated by a combination of
graph rewiring and vanishing gradient mitigation. We believe our work will help
bridge the gap between the recurrent and graph neural network literature and
will unlock the design of new deep and performant GNNs.
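To make the recurrent reading concrete, here is a minimal NumPy sketch of a GNN unrolled as a linear state-space recursion: the normalized adjacency acts as the state transition across layers and the node features are re-injected as a constant input, mirroring x_{t+1} = A x_t + B u. The function and parameter names (state_space_gnn, W_state, W_in) are illustrative assumptions, not the paper's implementation.

```python
# Minimal sketch (assumed formulation): a GNN read as a linear state-space
# recursion over layers. Not the paper's exact model.
import numpy as np

def sym_norm_adj(A):
    """Symmetrically normalized adjacency with self-loops: D^-1/2 (A+I) D^-1/2."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))
    return D_inv_sqrt @ A_hat @ D_inv_sqrt

def state_space_gnn(A, X, W_state, W_in, num_layers):
    """Unroll H_{t+1} = A_hat H_t W_state + X W_in for num_layers steps."""
    A_hat = sym_norm_adj(A)
    H = np.zeros((X.shape[0], W_state.shape[0]))
    for _ in range(num_layers):
        # message passing (A_hat) + recurrent mixing (W_state) + input injection (X W_in)
        H = A_hat @ H @ W_state + X @ W_in
    return H

# Toy usage: 4-node path graph, 3-dimensional features and hidden state.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0, 0],
              [1, 0, 1, 0],
              [0, 1, 0, 1],
              [0, 0, 1, 0]], dtype=float)
X = rng.normal(size=(4, 3))
W_state = rng.normal(size=(3, 3)) * 0.5  # small spectral radius keeps the recursion stable
W_in = rng.normal(size=(3, 3))
H = state_space_gnn(A, X, W_state, W_in, num_layers=16)
print(H.shape)  # (4, 3)
```

In this state-space view, how quickly gradients decay across layers is governed by the spectra of A_hat and W_state, which is the connection to over-smoothing and over-squashing that the paper analyzes; the weight matrices here are shared across layers, so depth adds no extra trainable parameters.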
Authors (8)
Álvaro Arroyo
Alessio Gravina
Benjamin Gutteridge
Federico Barbero
Claudio Gallicchio
Xiaowen Dong
+2 more
Submitted
February 15, 2025
Key Contributions
This paper provides a unified view of over-smoothing and over-squashing in GNNs through the lens of vanishing gradients, using linear control theory. It proposes interpreting GNNs as recurrent models and demonstrates that a simple state-space formulation effectively alleviates these issues without increasing trainable parameters, while also showing GNNs are prone to extreme gradient vanishing.
Business Value
Leads to more stable and deeper GNNs, enabling better performance on complex graph-structured data in areas like social network analysis, recommendation systems, and molecular modeling.