DARTS-GT: Differentiable Architecture Search for Graph Transformers with Quantifiable Instance-Specific Interpretability Analysis

Abstract

Graph Transformers (GTs) have emerged as powerful architectures for graph-structured data, yet remain constrained by rigid designs and lack quantifiable interpretability. Current state-of-the-art GTs commit to fixed GNN types across all layers, missing potential benefits of depth-specific component selection, while their complex architectures become opaque, so that performance gains cannot be attributed to meaningful patterns rather than spurious correlations. We redesign GT attention through asymmetry, decoupling structural encoding from feature representation: queries derive from node features while keys and values come from GNN transformations. Within this framework, we use Differentiable ARchiTecture Search (DARTS) to select optimal GNN operators at each layer, enabling depth-wise heterogeneity inside transformer attention itself (DARTS-GT). To understand discovered architectures, we develop the first quantitative interpretability framework for GTs through causal ablation. Our metrics (Head-deviation, Specialization, and Focus) identify which heads and nodes drive predictions while enabling model comparison. Experiments across eight benchmarks show DARTS-GT achieves state-of-the-art results on four datasets while remaining competitive on others, with discovered architectures revealing dataset-specific patterns. Our interpretability analysis reveals that visual attention salience and causal importance do not always correlate, indicating widely used visualization approaches may miss components that actually matter. Crucially, heterogeneous architectures found by DARTS-GT consistently produce more interpretable models than baselines, establishing that Graph Transformers need not choose between performance and interpretability.
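To make the mechanism concrete, below is a minimal sketch (not the authors' released code) of one DARTS-GT-style attention layer, assuming PyTorch and PyTorch Geometric. The candidate operator set (GCNConv, GATConv, SAGEConv), the hidden dimension, and the single-layer structure are illustrative assumptions; what the sketch demonstrates is the asymmetric attention (queries from raw node features, keys and values from a GNN transformation) and the DARTS-style softmax mixture over candidate GNN operators.

```python
# Minimal sketch of an asymmetric, DARTS-searched attention layer (illustrative only).
import torch
import torch.nn as nn
import torch.nn.functional as F
from torch_geometric.nn import GCNConv, GATConv, SAGEConv

class DARTSGTAttentionLayer(nn.Module):
    def __init__(self, dim, num_heads=4):
        super().__init__()
        self.dim, self.num_heads = dim, num_heads
        # Queries come directly from node features (structure-agnostic side).
        self.q_proj = nn.Linear(dim, dim)
        # Keys/values come from GNN-transformed features (structure-aware side).
        self.k_proj = nn.Linear(dim, dim)
        self.v_proj = nn.Linear(dim, dim)
        # Candidate GNN operators searched over at this layer (assumed set).
        self.candidates = nn.ModuleList([
            GCNConv(dim, dim),
            GATConv(dim, dim, heads=1),
            SAGEConv(dim, dim),
        ])
        # Architecture parameters: one logit per candidate operator.
        self.alpha = nn.Parameter(torch.zeros(len(self.candidates)))

    def forward(self, x, edge_index):
        # Continuous relaxation (DARTS): softmax-weighted mixture of candidate GNN outputs.
        weights = F.softmax(self.alpha, dim=0)
        h_gnn = sum(w * op(x, edge_index) for w, op in zip(weights, self.candidates))
        # Asymmetric attention: Q from raw features, K/V from the GNN output.
        n = x.size(0)
        q = self.q_proj(x).view(n, self.num_heads, -1)
        k = self.k_proj(h_gnn).view(n, self.num_heads, -1)
        v = self.v_proj(h_gnn).view(n, self.num_heads, -1)
        scores = torch.einsum('qhd,khd->hqk', q, k) / (q.size(-1) ** 0.5)
        attn = scores.softmax(dim=-1)                # per-head attention weights
        out = torch.einsum('hqk,khd->qhd', attn, v)  # aggregate values per head
        return out.reshape(n, self.dim)
```

After the search converges, a discretization step would typically keep only the highest-weighted operator at each layer, yielding the depth-wise heterogeneous architecture described above.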
Authors (2)
Shruti Sarika Chakraborty
Peter Minary
Submitted
October 16, 2025
arXiv Category
cs.LG
arXiv PDF

Key Contributions

Introduces DARTS-GT, a framework that uses Differentiable Architecture Search (DARTS) to optimize Graph Transformers (GTs) by selecting optimal GNN operators at each layer, enabling depth-wise heterogeneity. It also proposes a novel asymmetric attention mechanism and a causal ablation method for quantifiable, instance-specific interpretability.
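A rough illustration of the causal-ablation idea behind the interpretability analysis follows; the paper's exact Head-deviation, Specialization, and Focus formulas are not reproduced here. The `ablate_head` context manager is a hypothetical hook for zeroing one attention head's output; any model exposing such a mask could be scored this way.

```python
# Hypothetical sketch of instance-specific causal head ablation (not the paper's code).
import torch

@torch.no_grad()
def head_importance(model, batch, num_layers, num_heads):
    """Score each attention head by how much zeroing it changes the
    prediction for a single input graph (instance-specific importance)."""
    base = model(batch)                                  # unablated prediction
    scores = torch.zeros(num_layers, num_heads)
    for layer in range(num_layers):
        for head in range(num_heads):
            with model.ablate_head(layer, head):         # hypothetical masking hook
                scores[layer, head] = (base - model(batch)).norm()
    return scores                                        # larger = more causally important
```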

Business Value

Enables the development of more powerful and understandable AI models for analyzing complex graph-structured data, leading to better insights in fields like drug discovery and materials science.