Abstract
Knowledge distillation (KD) techniques have emerged as a powerful tool for
transferring expertise from complex teacher models to lightweight student
models, and are particularly beneficial for deploying high-performance models
on resource-constrained devices. This approach has been successfully applied to
graph neural networks (GNNs), harnessing their expressive capabilities to
generate node embeddings that capture structural and feature-related
information. In this study, we depart from the conventional KD approach by
exploring the potential of collaborative learning among GNNs. In the absence of
a pre-trained teacher model, we show that relatively simple and shallow GNN
architectures can synergistically learn efficient models that perform better
at inference, particularly when tackling multiple tasks. We propose a
collaborative learning framework where ensembles of student GNNs mutually teach
each other throughout the training process. We introduce an adaptive logit
weighting unit to facilitate efficient knowledge exchange among models and an
entropy enhancement technique to improve mutual learning. These components
enable the models to dynamically adapt their learning strategies during
training, optimizing their performance on downstream tasks. Extensive
experiments conducted on three datasets each for node and graph classification
demonstrate the effectiveness of our approach.
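As a rough illustration of this training scheme, the sketch below trains a small ensemble of shallow GNN students with mutual distillation. The abstract does not specify the exact formulation, so the helper names (SimpleGCN, mutual_learning_step, peer_weights) and the concrete losses (cross-entropy plus a temperature-scaled KL term toward an adaptively weighted mixture of peer predictions, minus a small entropy bonus) are illustrative assumptions, not the authors' implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F


def normalize_adj(adj: torch.Tensor) -> torch.Tensor:
    """Symmetrically normalize a dense adjacency matrix with self-loops."""
    adj = adj + torch.eye(adj.size(0))
    deg_inv_sqrt = adj.sum(dim=-1).pow(-0.5)
    return deg_inv_sqrt.unsqueeze(1) * adj * deg_inv_sqrt.unsqueeze(0)


class SimpleGCN(nn.Module):
    """A shallow two-layer GCN student operating on a dense adjacency."""

    def __init__(self, in_dim: int, hid_dim: int, out_dim: int):
        super().__init__()
        self.lin1 = nn.Linear(in_dim, hid_dim)
        self.lin2 = nn.Linear(hid_dim, out_dim)

    def forward(self, x, a_hat):
        h = F.relu(a_hat @ self.lin1(x))
        return a_hat @ self.lin2(h)  # class logits per node


def mutual_learning_step(students, peer_weights, optimizers,
                         x, a_hat, y, mask, tau=2.0, beta=0.5, gamma=0.01):
    """One mutual-learning update: each student fits the labels while
    distilling from an adaptively weighted mixture of its peers."""
    logits = [s(x, a_hat) for s in students]
    for i, opt in enumerate(optimizers):
        ce = F.cross_entropy(logits[i][mask], y[mask])

        # Adaptive logit weighting (assumed form): a learnable softmax
        # over peers decides how much each peer's soft labels count.
        peers = [j for j in range(len(students)) if j != i]
        w = torch.softmax(peer_weights[i], dim=0)
        soft_target = sum(
            w[k] * F.softmax(logits[j].detach() / tau, dim=-1)
            for k, j in enumerate(peers)
        )
        kd = F.kl_div(F.log_softmax(logits[i] / tau, dim=-1),
                      soft_target, reduction="batchmean") * tau ** 2

        # Entropy enhancement (our reading): reward higher prediction
        # entropy so students exchange informative, non-collapsed logits.
        p = F.softmax(logits[i], dim=-1)
        entropy = -(p * p.clamp_min(1e-12).log()).sum(dim=-1).mean()

        loss = ce + beta * kd - gamma * entropy
        opt.zero_grad()
        loss.backward()
        opt.step()


# Usage on a toy node-classification graph (random data for illustration):
n, f, c = 100, 16, 7
x = torch.randn(n, f)
a_hat = normalize_adj((torch.rand(n, n) < 0.05).float())
y = torch.randint(0, c, (n,))
mask = torch.rand(n) < 0.3  # labeled training nodes

students = [SimpleGCN(f, 32, c) for _ in range(3)]
peer_weights = [nn.Parameter(torch.zeros(len(students) - 1)) for _ in students]
optimizers = [torch.optim.Adam(list(s.parameters()) + [w], lr=0.01)
              for s, w in zip(students, peer_weights)]

for epoch in range(50):
    mutual_learning_step(students, peer_weights, optimizers, x, a_hat, y, mask)
```

In this sketch, detaching peer logits keeps each student's gradient local to its own parameters, while the learnable softmax over peers stands in for the adaptive logit weighting unit; the distillation term is applied on all nodes, labeled or not, which is one plausible design choice among several.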
Authors (6)
Paul Agbaje
Arkajyoti Mitra
Afia Anjum
Pranali Khose
Ebelechukwu Nwafor
Habeeb Olufowobi
Submitted
October 22, 2025
Key Contributions
This paper proposes a mutual learning approach for Graph Neural Networks (GNNs), departing from traditional knowledge distillation. Instead of a teacher-student setup, ensembles of GNNs mutually teach each other throughout training, enabling simpler GNNs to collectively achieve better performance, especially for multi-task learning, without needing a pre-trained teacher model.
Business Value
Enables the deployment of powerful GNN models on edge devices and in applications requiring efficient inference by allowing smaller models to learn collaboratively and achieve high performance without a large pre-trained teacher.