
Learning Noise-Resilient and Transferable Graph-Text Alignment via Dynamic Quality Assessment

Abstract

Pre-training Graph Foundation Models (GFMs) on text-attributed graphs (TAGs) is central to web-scale applications such as search, recommendation, and knowledge discovery. However, existing CLIP-style graph-text aligners face two key limitations: they assume strict one-to-one correspondences between nodes and texts, overlooking the inherent many-to-many relations in real-world graphs; and they rely on static alignment objectives that cannot adapt to varying data quality, making them brittle under noisy supervision. Together, these limitations expose a core dilemma: embracing expressive many-to-many alignment amplifies noise, while reverting to strict one-to-one strategies sacrifices semantic diversity and fails to handle inherently mismatched pairs. To address these challenges, we propose ADAligner, a dynamic, quality-aware graph-text alignment framework that adaptively shifts between expressive many-to-many and conservative one-to-one objectives according to supervision quality. ADAligner estimates batch-level alignment reliability in real time and adapts its optimization accordingly, promoting soft, subgraph-level many-to-many alignment when supervision is clean, while emphasizing reliable one-to-one alignment by dynamically filtering low-confidence pairs under noise. Theoretically, we prove that this dynamic mechanism forms a stable negative feedback process, ensuring convergence and robustness. Comprehensive experiments on nine diverse TAG datasets demonstrate that ADAligner consistently outperforms prior graph-text aligners on zero-/few-shot node classification, link prediction, and cross-modal retrieval tasks. It maintains strong robustness under noisy supervision and accelerates pre-training by approximately 2 to 3 times compared to multimodal baselines, establishing a scalable and reliable foundation for graph-text representation learning in real-world web environments.
Authors (8)
Yuhang Liu
Minglai Shao
Zengyi Wo
Yunlong Chu
Bing Hao
Shengzhong Liu
+2 more
Submitted
October 22, 2025
arXiv Category
cs.LG

Key Contributions

ADAligner addresses limitations in graph-text alignment by proposing a dynamic, quality-aware framework that adaptively switches between many-to-many and one-to-one alignment objectives. This allows for expressive alignment while mitigating noise amplification, leading to more robust and transferable graph foundation models.
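The core idea, as the abstract describes it, is to estimate batch-level alignment reliability and use it to interpolate between a soft many-to-many objective and a filtered one-to-one objective. The sketch below illustrates one plausible reading of that mechanism in NumPy; the function name, the reliability estimate (mean diagonal softmax probability), the soft-target construction, and the confidence threshold are all illustrative assumptions, not the paper's actual formulation.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def adaptive_alignment_loss(graph_emb, text_emb, tau=0.07, conf_thresh=0.5):
    """Hypothetical sketch of quality-aware graph-text alignment.

    Blends a soft many-to-many objective with a filtered one-to-one
    objective according to an estimated batch reliability score.
    All formulas here are illustrative, not ADAligner's exact method.
    """
    # Cosine similarities between L2-normalized node and text embeddings
    g = graph_emb / np.linalg.norm(graph_emb, axis=1, keepdims=True)
    t = text_emb / np.linalg.norm(text_emb, axis=1, keepdims=True)
    sim = g @ t.T / tau
    probs = softmax(sim, axis=1)
    diag = np.diag(probs)

    # Batch-level reliability: how much probability mass lands on the
    # paired text for each node, averaged over the batch
    reliability = float(diag.mean())

    # Many-to-many branch: cross-entropy against soft targets derived
    # from text-text similarity (a stand-in for subgraph-level relations)
    soft_targets = softmax(t @ t.T / tau, axis=1)
    loss_m2m = -(soft_targets * np.log(probs + 1e-12)).sum(axis=1).mean()

    # One-to-one branch: InfoNCE on the diagonal, keeping only pairs
    # whose confidence clears a fraction of the batch maximum
    keep = diag >= conf_thresh * diag.max()
    loss_o2o = -np.log(diag[keep] + 1e-12).mean()

    # Reliability gates the blend: clean batches lean many-to-many,
    # noisy batches fall back to filtered one-to-one alignment
    loss = reliability * loss_m2m + (1.0 - reliability) * loss_o2o
    return loss, reliability
```

Under this reading, the reliability score acts as the negative-feedback signal the abstract mentions: as supervision degrades, the diagonal mass drops and the objective automatically shifts toward the conservative, filtered one-to-one term.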

Business Value

Improves the accuracy and robustness of web-scale applications like search and recommendation systems by enabling better understanding of text-attributed graphs, leading to more relevant results and personalized experiences.