Redirecting to original paper in 30 seconds...
Click below to go immediately or wait for automatic redirect
📄 Abstract
Abstract: Vision graph neural networks (ViG) have demonstrated promise in vision tasks
as a competitive alternative to conventional convolutional neural nets (CNN)
and transformers (ViTs); however, common graph construction methods, such as
k-nearest neighbor (KNN), can be expensive on larger images. While methods such
as Sparse Vision Graph Attention (SVGA) have shown promise, SVGA's fixed step
scale can lead to over-squashing and missing multiple connections to gain the
same information that could be gained from a long-range link. Through this
observation, we propose a new graph construction method, Logarithmic Scalable
Graph Construction (LSGC) to enhance performance by limiting the number of
long-range links. To this end, we propose LogViG, a novel hybrid CNN-GNN model
that utilizes LSGC. Furthermore, inspired by the successes of multi-scale and
high-resolution architectures, we introduce and apply a high-resolution branch
and fuse features between our high-resolution and low-resolution branches for a
multi-scale high-resolution Vision GNN network. Extensive experiments show that
LogViG beats existing ViG, CNN, and ViT architectures in terms of accuracy,
GMACs, and parameters on image classification and semantic segmentation tasks.
Our smallest model, Ti-LogViG, achieves an average top-1 accuracy on
ImageNet-1K of 79.9% with a standard deviation of 0.2%, 1.7% higher average
accuracy than Vision GNN with a 24.3% reduction in parameters and 35.3%
reduction in GMACs. Our work shows that leveraging long-range links in graph
construction for ViGs through our proposed LSGC can exceed the performance of
current state-of-the-art ViGs. Code is available at
https://github.com/mmunir127/LogViG-Official.
Authors (3)
Mustafa Munir
Alex Zhang
Radu Marculescu
Submitted
October 15, 2025
Proceedings of the Third Learning on Graphs Conference (LoG 2024),
PMLR 269:37:1-37:13 2024
Key Contributions
Proposes Logarithmic Scalable Graph Construction (LSGC) to efficiently build graphs for Vision GNNs by limiting long-range links, addressing over-squashing and missing connections. Introduces LogViG, a hybrid CNN-GNN model utilizing LSGC and multi-scale feature fusion for enhanced performance on vision tasks.
Business Value
Enables more efficient and effective application of Graph Neural Networks to large-scale vision tasks, potentially leading to better image analysis tools in various industries.