Plexus: Taming Billion-edge Graphs with 3D Parallel Full-graph GNN Training

Abstract

Graph neural networks (GNNs) leverage the connectivity and structure of real-world graphs to learn intricate properties and relationships between nodes. Many real-world graphs exceed the memory capacity of a GPU due to their sheer size, and training GNNs on such graphs requires techniques such as mini-batch sampling to scale. The alternative approach of distributed full-graph training suffers from high communication overheads and load imbalance due to the irregular structure of graphs. We propose a three-dimensional (3D) parallel approach for full-graph training that tackles these issues and scales to billion-edge graphs. In addition, we introduce optimizations such as a double permutation scheme for load balancing, and a performance model to predict the optimal 3D configuration of our parallel implementation -- Plexus. We evaluate Plexus on six different graph datasets and show scaling results on up to 2048 GPUs of Perlmutter, and 1024 GPUs of Frontier. Plexus achieves unprecedented speedups of 2.3-12.5x over prior state of the art, and a reduction in time-to-solution by 5.2-8.7x on Perlmutter and 7.0-54.2x on Frontier.
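The abstract names a double permutation scheme for load balancing but does not spell it out. As a minimal, hypothetical sketch of the underlying idea (not Plexus's actual scheme), the snippet below randomly relabels the node IDs of a skewed synthetic graph before block-partitioning its edge list, which spreads nonzeros far more evenly across the blocks. The function names, the 2D block grid, and the synthetic graph are assumptions made purely for illustration.

```python
# Hypothetical illustration, not Plexus's actual scheme: randomly relabeling node
# IDs before block-partitioning an edge list evens out the nonzeros per block,
# which is the basic intuition behind permutation-based load balancing.
import numpy as np

def block_nnz(edges, num_nodes, grid=(4, 4)):
    """Count edges falling into each (row-block, col-block) of a 2D partition."""
    rows_per_blk = -(-num_nodes // grid[0])  # ceil division
    cols_per_blk = -(-num_nodes // grid[1])
    counts = np.zeros(grid, dtype=np.int64)
    np.add.at(counts, (edges[:, 0] // rows_per_blk, edges[:, 1] // cols_per_blk), 1)
    return counts

rng = np.random.default_rng(0)
num_nodes, num_edges = 10_000, 200_000
# Skewed synthetic graph: low-ID nodes receive most of the edges.
src = np.minimum((num_nodes * rng.power(0.3, num_edges)).astype(np.int64), num_nodes - 1)
dst = np.minimum((num_nodes * rng.power(0.3, num_edges)).astype(np.int64), num_nodes - 1)
edges = np.stack([src, dst], axis=1)

perm = rng.permutation(num_nodes)   # random relabeling of node IDs
permuted = perm[edges]              # same relabeling applied to sources and destinations

for name, e in [("original", edges), ("permuted", permuted)]:
    counts = block_nnz(e, num_nodes)
    print(f"{name}: max/mean nonzeros per block = {counts.max() / counts.mean():.2f}")
```

On this synthetic graph the max/mean nonzero ratio should drop from several-fold to close to 1 after relabeling; the double permutation described in the paper presumably goes beyond a single random relabeling.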
Authors: Aditya K. Ranjan, Siddharth Singh, Cunyang Wei, Abhinav Bhatele
Submitted: May 7, 2025
arXiv Category: cs.LG

Key Contributions

Plexus introduces a novel 3D parallel approach for full-graph GNN training, enabling scalability to billion-edge graphs. It addresses key challenges such as high communication overhead and load imbalance with a double permutation scheme for load balancing and a performance model that predicts the optimal 3D configuration, making large-scale graph analysis more feasible.
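The summary mentions a performance model that predicts the optimal 3D configuration, but no details are given here. As a hedged sketch of how such a selection step could look, the snippet below enumerates factorizations of the GPU count into a 3D process grid and keeps the shape with the lowest predicted cost. The function names, cost terms, and constants (alpha, beta) are illustrative stand-ins, not the model from the paper.

```python
# Hypothetical sketch, not the paper's model: choose a 3D process-grid shape by
# enumerating factorizations of the GPU count and minimizing a toy cost function.
from itertools import product

def factorizations(num_gpus):
    """All (px, py, pz) grid shapes with px * py * pz == num_gpus."""
    divisors = [d for d in range(1, num_gpus + 1) if num_gpus % d == 0]
    return [(px, py, pz) for px, py, pz in product(divisors, repeat=3)
            if px * py * pz == num_gpus]

def predicted_cost(px, py, pz, num_nodes, nnz, feat_dim,
                   alpha=1e-11, beta=4e-10):
    """Toy model: per-GPU SpMM work plus communication volume along each grid axis.
    alpha and beta are made-up machine constants (time per FLOP / per word)."""
    compute = alpha * nnz * feat_dim / (px * py * pz)
    comm = beta * num_nodes * feat_dim * (1 / (py * pz) + 1 / (px * pz) + 1 / (px * py))
    return compute + comm

def best_config(num_gpus, num_nodes, nnz, feat_dim):
    """Return the grid shape with the lowest predicted per-iteration cost."""
    return min(factorizations(num_gpus),
               key=lambda cfg: predicted_cost(*cfg, num_nodes=num_nodes,
                                              nnz=nnz, feat_dim=feat_dim))

# Example: 64 GPUs, a graph with 100M nodes and 1B edges, 128-dimensional features.
print(best_config(num_gpus=64, num_nodes=100_000_000, nnz=1_000_000_000, feat_dim=128))
```

With the compute term fixed for a given GPU count, this toy model simply prefers the most balanced grid (4 x 4 x 4 for 64 GPUs); the actual model in the paper would also have to account for graph structure, per-layer feature sizes, and network characteristics.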

Business Value

Makes it feasible to train GNNs on massive real-world graphs that were previously computationally out of reach, unlocking new insights in areas such as social networks, recommendation systems, and scientific simulations.