Multi-Scale High-Resolution Logarithmic Grapher Module for Efficient Vision GNNs

Abstract

Vision graph neural networks (ViG) have demonstrated promise in vision tasks as a competitive alternative to conventional convolutional neural nets (CNN) and transformers (ViTs); however, common graph construction methods, such as k-nearest neighbor (KNN), can be expensive on larger images. While methods such as Sparse Vision Graph Attention (SVGA) have shown promise, SVGA's fixed step scale can lead to over-squashing and missing multiple connections to gain the same information that could be gained from a long-range link. Through this observation, we propose a new graph construction method, Logarithmic Scalable Graph Construction (LSGC) to enhance performance by limiting the number of long-range links. To this end, we propose LogViG, a novel hybrid CNN-GNN model that utilizes LSGC. Furthermore, inspired by the successes of multi-scale and high-resolution architectures, we introduce and apply a high-resolution branch and fuse features between our high-resolution and low-resolution branches for a multi-scale high-resolution Vision GNN network. Extensive experiments show that LogViG beats existing ViG, CNN, and ViT architectures in terms of accuracy, GMACs, and parameters on image classification and semantic segmentation tasks. Our smallest model, Ti-LogViG, achieves an average top-1 accuracy on ImageNet-1K of 79.9% with a standard deviation of 0.2%, 1.7% higher average accuracy than Vision GNN with a 24.3% reduction in parameters and 35.3% reduction in GMACs. Our work shows that leveraging long-range links in graph construction for ViGs through our proposed LSGC can exceed the performance of current state-of-the-art ViGs. Code is available at https://github.com/mmunir127/LogViG-Official.
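
The contrast between a fixed step scale and a logarithmic scale can be illustrated with a small sketch. This is not the authors' implementation (see the linked repository for that). It assumes, purely for illustration, that each patch links to others in the same row and column, that a fixed-step scheme uses offsets step, 2*step, 3*step, ..., and that a logarithmic scheme uses power-of-two offsets 1, 2, 4, ... without wrap-around. The function names `fixed_step_offsets`, `log_offsets`, and `axis_neighbors` are hypothetical.

```python
def fixed_step_offsets(size, step):
    """Offsets step, 2*step, ... that stay below `size` (assumed fixed-step rule)."""
    return list(range(step, size, step))


def log_offsets(size):
    """Offsets 1, 2, 4, ... that stay below `size` (assumed logarithmic rule)."""
    offsets = []
    d = 1
    while d < size:
        offsets.append(d)
        d *= 2
    return offsets


def axis_neighbors(row, col, height, width, row_offsets, col_offsets):
    """Grid positions linked to (row, col) along its own row and column.

    Positions that fall outside the grid are dropped (no wrap-around),
    which is an assumption of this sketch.
    """
    linked = []
    for d in col_offsets:
        for c in (col - d, col + d):
            if 0 <= c < width:
                linked.append((row, c))
    for d in row_offsets:
        for r in (row - d, row + d):
            if 0 <= r < height:
                linked.append((r, col))
    return linked


if __name__ == "__main__":
    size = 14  # hypothetical 14x14 patch grid
    fixed = fixed_step_offsets(size, step=2)
    log = log_offsets(size)
    print("fixed-step offsets:", fixed)
    print("logarithmic offsets:", log)
    print("links from corner, fixed:", len(axis_neighbors(0, 0, size, size, fixed, fixed)))
    print("links from corner, log:", len(axis_neighbors(0, 0, size, size, log, log)))
```

Under these assumptions the number of offsets grows linearly with grid size for the fixed-step rule and logarithmically for the power-of-two rule, so larger images add few extra long-range links per patch.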

Authors (3)

Mustafa Munir

Alex Zhang

Radu Marculescu

Submitted

October 15, 2025

arXiv Category

cs.CV

Proceedings of the Third Learning on Graphs Conference (LoG 2024), PMLR 269:37:1-37:13, 2024

Key Contributions

Proposes Logarithmic Scalable Graph Construction (LSGC), which builds graphs for Vision GNNs with a limited number of long-range links to address over-squashing and the extra connections that a fixed step scale needs to gather the same information. Introduces LogViG, a hybrid CNN-GNN model that uses LSGC and fuses features between high-resolution and low-resolution branches.
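
The idea of fusing features across resolutions can be sketched as follows. The source does not specify how LogViG performs fusion, so this example assumes one common pattern: upsample the low-resolution map by nearest-neighbor repetition and add it element-wise to the high-resolution map. The single-channel list-of-lists representation and the names `upsample_nearest` and `fuse` are hypothetical and chosen only to keep the sketch dependency-free.

```python
def upsample_nearest(feat, factor):
    """Repeat every value `factor` times along both axes (nearest-neighbor upsampling)."""
    out = []
    for row in feat:
        wide = [v for v in row for _ in range(factor)]
        # Copy the widened row `factor` times so rows stay independent lists.
        out.extend([list(wide) for _ in range(factor)])
    return out


def fuse(high, low, factor):
    """Add an upsampled low-resolution map to a high-resolution map.

    Element-wise addition is an assumed fusion rule for illustration,
    not the operator used by the paper.
    """
    up = upsample_nearest(low, factor)
    if len(up) != len(high) or len(up[0]) != len(high[0]):
        raise ValueError("upsampled low-res map must match high-res shape")
    return [[h + l for h, l in zip(h_row, l_row)] for h_row, l_row in zip(high, up)]


if __name__ == "__main__":
    high = [[1, 2, 3, 4] for _ in range(4)]  # 4x4 high-resolution features
    low = [[10, 20], [30, 40]]               # 2x2 low-resolution features
    for row in fuse(high, low, factor=2):
        print(row)
```

In practice a learned projection would usually align channel counts before fusion; that step is omitted here because the source gives no details about it.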

Business Value

Enables more efficient and effective application of Graph Neural Networks to large-scale vision tasks, potentially leading to better image analysis tools in various industries.

Paper Metadata

Innovation Type

Algorithmic Improvement / Novel Architecture

Deployment Feasibility

Moderate; depends on the complexity of the hybrid architecture and on how efficient the LSGC method is in practice.

Limitations Addressed

Addresses the computational expense of traditional graph construction methods such as KNN on larger images, and the fixed step scale of Sparse Vision Graph Attention (SVGA), which can lead to over-squashing and to multiple connections being needed for information that a single long-range link could provide.

Performance Gains

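The smallest model, Ti-LogViG, reaches an average top-1 accuracy of 79.9% on ImageNet-1K (standard deviation 0.2%), 1.7% higher than Vision GNN, with 24.3% fewer parameters and 35.3% fewer GMACs. The gains are attributed to limiting the number of long-range links and fusing multi-scale features.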

Technical Tags

Vision GNNs, Graph Construction, Logarithmic Scalable Graph Construction (LSGC), Multi-scale, High-resolution, CNN-GNN hybrid, k-nearest neighbor (KNN), Sparse Vision Graph Attention (SVGA), over-squashing, long-range links, feature fusion

Research Topics

Graph Neural Networks, Computer Vision, Deep Learning Architectures, Image Analysis, Representation Learning

Methods & Architectures

Logarithmic Scalable Graph Construction (LSGC), LogViG, Hybrid CNN-GNN model, Multi-scale feature fusion, Vision Graph Neural Network (ViG)

Applications & Tasks

Image Recognition, Computer Vision Tasks, Medical Imaging Analysis, Remote Sensing, Expensive Graph Construction (KNN), Over-squashing in SVGA, Missing Long-Range Connections, Balancing Local and Global Information, Image Classification, Object Detection, Semantic Segmentation, Image Understanding

Related Fields

Graph Neural Networks, Computer Vision, Deep Learning, Image Processing, Machine Learning

Keywords

vision gnn, graph neural networks, computer vision, graph construction, LSGC, LogViG, multi-scale, high-resolution, CNN-GNN, image analysis, long-range links, over-squashing

Academic Context

Graph Neural Networks, Computer Vision, Deep Learning Architectures, Image Analysis, Representation Learning

Commercial Potential

Potential Products

More efficient image analysis libraries, Advanced computer vision models for specific tasks

Target Industries

Technology, Healthcare (Medical Imaging), Security, Autonomous Systems

Use Case Examples

Improving image classification accuracy. Enhancing object detection in complex scenes. Enabling more detailed semantic segmentation of images.

Competitive Edge

Offers a novel graph construction method (LSGC) and a hybrid architecture (LogViG) that aim to improve on existing Vision GNN approaches in both efficiency and information capture.

Market Opportunity

Large market for advanced computer vision solutions.

Revenue Models

Licensing of the technology; integration into commercial CV platforms.

Resource Requirements

Compute Needs

Likely requires significant GPU resources for training, especially with high-resolution images.

Data Requirements

Standard image datasets for computer vision tasks.

Deployment Constraints

Model complexity and inference speed might be considerations.

Scalability

The LSGC method aims to improve scalability by making graph construction more efficient.

Production Readiness

Maturity Level

Research

Time to Market

2-3 years for integration into specialized computer vision pipelines.

Patent Potential

Moderate, for the LSGC method and LogViG architecture.