DARTS-GT: Differentiable Architecture Search for Graph Transformers with Quantifiable Instance-Specific Interpretability Analysis

Abstract

Graph Transformers (GTs) have emerged as powerful architectures for graph-structured data, yet remain constrained by rigid designs and lack quantifiable interpretability. Current state-of-the-art GTs commit to fixed GNN types across all layers, missing potential benefits of depth-specific component selection, while their complex architectures become opaque, so that performance gains cannot be attributed to meaningful patterns rather than spurious correlations. We redesign GT attention through asymmetry, decoupling structural encoding from feature representation: queries derive from node features while keys and values come from GNN transformations. Within this framework, we use Differentiable ARchiTecture Search (DARTS) to select optimal GNN operators at each layer, enabling depth-wise heterogeneity inside transformer attention itself (DARTS-GT). To understand discovered architectures, we develop the first quantitative interpretability framework for GTs through causal ablation. Our metrics (Head-deviation, Specialization, and Focus) identify which heads and nodes drive predictions while enabling model comparison. Experiments across eight benchmarks show DARTS-GT achieves state-of-the-art results on four datasets while remaining competitive on others, with discovered architectures revealing dataset-specific patterns. Our interpretability analysis reveals that visual attention salience and causal importance do not always correlate, indicating widely used visualization approaches may miss components that actually matter. Crucially, heterogeneous architectures found by DARTS-GT consistently produce more interpretable models than baselines, establishing that Graph Transformers need not choose between performance and interpretability.
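To make the mechanism concrete, below is a minimal sketch (not the authors' released code) of one DARTS-GT-style attention layer, assuming PyTorch and PyTorch Geometric. The candidate operator set (GCNConv, GATConv, SAGEConv), the hidden dimension, and the single-layer structure are illustrative assumptions; what the sketch demonstrates is the asymmetric attention (queries from raw node features, keys and values from a GNN transformation) and the DARTS-style softmax mixture over candidate GNN operators.

```python
# Minimal sketch of an asymmetric, DARTS-searched attention layer (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, GATConv, SAGEConv

class DARTSGTAttentionLayer(nn.Module):
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.dim, self.num_heads = dim, num_heads
        # Queries come directly from node features (structure-agnostic side).
        self.q_proj = nn.Linear(dim, dim)
        # Keys/values come from GNN-transformed features (structure-aware side).
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        # Candidate GNN operators searched over at this layer (assumed set).
        self.candidates = nn.ModuleList([
            GCNConv(dim, dim),
            GATConv(dim, dim, heads=1),
            SAGEConv(dim, dim),
        ])
        # Architecture parameters: one logit per candidate operator.
        self.alpha = nn.Parameter(torch.zeros(len(self.candidates)))

    def forward(self, x, edge_index):
        # Continuous relaxation (DARTS): softmax-weighted mixture of candidate GNN outputs.
        weights = F.softmax(self.alpha, dim=0)
        h_gnn = sum(w * op(x, edge_index) for w, op in zip(weights, self.candidates))
        # Asymmetric attention: Q from raw features, K/V from the GNN output.
        n = x.size(0)
        q = self.q_proj(x).view(n, self.num_heads, -1)
        k = self.k_proj(h_gnn).view(n, self.num_heads, -1)
        v = self.v_proj(h_gnn).view(n, self.num_heads, -1)
        scores = torch.einsum('qhd,khd->hqk', q, k) / (q.size(-1) ** 0.5)
        attn = scores.softmax(dim=-1)                # per-head attention weights
        out = torch.einsum('hqk,khd->qhd', attn, v)  # aggregate values per head
        return out.reshape(n, self.dim)
```

After the search converges, a discretization step would typically keep only the highest-weighted operator at each layer, yielding the depth-wise heterogeneous architecture described above.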
Authors (2)
Shruti Sarika Chakraborty
Peter Minary
Submitted
October 16, 2025
arXiv Category
cs.LG
arXiv PDF

Key Contributions

Introduces DARTS-GT, a framework that uses Differentiable Architecture Search (DARTS) to optimize Graph Transformers (GTs) by selecting optimal GNN operators at each layer, enabling depth-wise heterogeneity. It also proposes a novel asymmetric attention mechanism and a causal ablation method for quantifiable, instance-specific interpretability.
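A rough illustration of the causal-ablation idea behind the interpretability analysis follows; the paper's exact Head-deviation, Specialization, and Focus formulas are not reproduced here. The `ablate_head` context manager is a hypothetical hook for zeroing one attention head's output; any model exposing such a mask could be scored this way.

```python
# Hypothetical sketch of instance-specific causal head ablation (not the paper's code).
import torch

@torch.no_grad()
def head_importance(model, batch, num_layers, num_heads):
    """Score each attention head by how much zeroing it changes the
    prediction for a single input graph (instance-specific importance)."""
    base = model(batch)                                  # unablated prediction
    scores = torch.zeros(num_layers, num_heads)
    for layer in range(num_layers):
        for head in range(num_heads):
            with model.ablate_head(layer, head):         # hypothetical masking hook
                scores[layer, head] = (base - model(batch)).norm()
    return scores                                        # larger = more causally important
```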

Business Value

Enables the development of more powerful and understandable AI models for analyzing complex graph-structured data, leading to better insights in fields like drug discovery and materials science.