Redirecting to original paper in 30 seconds...

Click below to go immediately or wait for automatic redirect

arxiv_ml 95% Match Research Paper Graph Machine Learning Researchers,Data Scientists,ML Engineers working with graph data 1 week ago

Graph Data Selection for Domain Adaptation: A Model-Free Approach

graph-neural-networks › graph-learning
📄 Abstract

Abstract: Graph domain adaptation (GDA) is a fundamental task in graph machine learning, with techniques like shift-robust graph neural networks (GNNs) and specialized training procedures to tackle the distribution shift problem. Although these model-centric approaches show promising results, they often struggle with severe shifts and constrained computational resources. To address these challenges, we propose a novel model-free framework, GRADATE (GRAph DATa sElector), that selects the best training data from the source domain for the classification task on the target domain. GRADATE picks training samples without relying on any GNN model's predictions or training recipes, leveraging optimal transport theory to capture and adapt to distribution changes. GRADATE is data-efficient, scalable and meanwhile complements existing model-centric GDA approaches. Through comprehensive empirical studies on several real-world graph-level datasets and multiple covariate shift types, we demonstrate that GRADATE outperforms existing selection methods and enhances off-the-shelf GDA methods with much fewer training data.
Authors (3)
Ting-Wei Li
Ruizhong Qiu
Hanghang Tong
Submitted
May 22, 2025
arXiv Category
cs.LG
arXiv PDF

Key Contributions

This paper proposes GRADATE, a novel model-free framework for graph domain adaptation (GDA) that selects optimal training data from the source domain for a target domain task. Unlike model-centric approaches, GRADATE uses optimal transport theory to adapt to distribution changes without relying on GNN predictions or training recipes. It is data-efficient, scalable, and complements existing GDA methods.

Business Value

Enables the effective application of GNNs to new domains where labeled data is scarce or differs significantly from existing datasets. This accelerates the adoption of GNNs in various industries by reducing the need for extensive retraining or data collection.