On Vanishing Gradients, Over-Smoothing, and Over-Squashing in GNNs: Bridging Recurrent and Graph Learning

Abstract

Graph Neural Networks (GNNs) are models that leverage the graph structure to transmit information between nodes, typically through the message-passing operation. While widely successful, this approach is well known to suffer from the over-smoothing and over-squashing phenomena, which result in representational collapse as the number of layers increases and insensitivity to the information contained at distant and poorly connected nodes, respectively. In this paper, we present a unified view of these problems through the lens of vanishing gradients, using ideas from linear control theory for our analysis. We propose an interpretation of GNNs as recurrent models and empirically demonstrate that a simple state-space formulation of a GNN effectively alleviates over-smoothing and over-squashing at no extra trainable parameter cost. Further, we show theoretically and empirically that (i) GNNs are by design prone to extreme gradient vanishing even after a few layers; (ii) Over-smoothing is directly related to the mechanism causing vanishing gradients; (iii) Over-squashing is most easily alleviated by a combination of graph rewiring and vanishing gradient mitigation. We believe our work will help bridge the gap between the recurrent and graph neural network literature and will unlock the design of new deep and performant GNNs.
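
To make the recurrent reading of message passing concrete, below is a minimal, hypothetical sketch in PyTorch: a single weight matrix is shared across all message-passing steps and combined with a fixed retention factor, a state-space-style update in the spirit the abstract describes. The names (`RecurrentSSMGNN`, `normalized_adjacency`), the specific update rule, and all constants are illustrative assumptions, not the paper's exact model.

```python
import torch
import torch.nn as nn


class RecurrentSSMGNN(nn.Module):
    """Illustrative sketch: message passing applied recurrently with a
    state-space-style update
        h_{t+1} = decay * h_t + (1 - decay) * relu(A_hat @ h_t @ W).
    Sharing W across every step keeps the parameter count of a single layer,
    while the convex combination with h_t helps preserve gradient flow."""

    def __init__(self, dim, decay=0.9):
        super().__init__()
        self.W = nn.Linear(dim, dim, bias=False)  # shared across all steps
        self.decay = decay                        # fixed state-retention factor

    def forward(self, x, a_hat, steps=16):
        # x: (num_nodes, dim) node features
        # a_hat: (num_nodes, num_nodes) normalized adjacency with self-loops
        h = x
        for _ in range(steps):
            msg = torch.relu(self.W(a_hat @ h))   # one message-passing step
            h = self.decay * h + (1.0 - self.decay) * msg
        return h


def normalized_adjacency(edge_index, num_nodes):
    """Dense A_hat = D^{-1/2} (A + I) D^{-1/2} for a small illustrative graph."""
    a = torch.zeros(num_nodes, num_nodes)
    a[edge_index[0], edge_index[1]] = 1.0
    a = a + torch.eye(num_nodes)
    d_inv_sqrt = a.sum(dim=1).pow(-0.5)
    return d_inv_sqrt.unsqueeze(1) * a * d_inv_sqrt.unsqueeze(0)
```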
Authors (8)
Álvaro Arroyo
Alessio Gravina
Benjamin Gutteridge
Federico Barbero
Claudio Gallicchio
Xiaowen Dong
+2 more
Submitted
February 15, 2025
arXiv Category
cs.LG
arXiv PDF

Key Contributions

This paper provides a unified view of over-smoothing and over-squashing in GNNs through the lens of vanishing gradients, using tools from linear control theory. It proposes interpreting GNNs as recurrent models and demonstrates that a simple state-space formulation effectively alleviates both issues without increasing the number of trainable parameters, while also showing that GNNs are by design prone to extreme gradient vanishing even after a few layers. A probe of this effect is sketched below.
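
One simple way to observe the gradient vanishing the paper analyses is to measure how sensitive a distant node's output is to another node's input features as the number of message-passing steps grows. The snippet below reuses the illustrative sketch above; the path graph, feature dimension, and step counts are arbitrary choices for demonstration, not an experiment from the paper.

```python
import torch

# Path graph 0-1-2-3; probe how sensitive node 3's output is to node 0's
# input features as the number of message-passing steps grows.
edge_index = torch.tensor([[0, 1, 1, 2, 2, 3],
                           [1, 0, 2, 1, 3, 2]])
a_hat = normalized_adjacency(edge_index, num_nodes=4)
model = RecurrentSSMGNN(dim=8)

for steps in (1, 4, 16, 64):
    x = torch.randn(4, 8, requires_grad=True)
    out = model(x, a_hat, steps=steps)
    out[3].sum().backward()               # scalar "loss" at node 3
    grad_norm = x.grad[0].norm().item()   # sensitivity of node 3 to node 0
    print(f"steps={steps:3d}  ||grad wrt node 0|| = {grad_norm:.3e}")
```

If the gradient norm collapses toward zero as `steps` increases, the model is exhibiting the vanishing-gradient behaviour the paper links to over-smoothing and over-squashing.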

Business Value

Leads to more stable and deeper GNNs, enabling better performance on complex graph-structured data in areas like social network analysis, recommendation systems, and molecular modeling.