Today's Computer Vision Research Top Papers

Wednesday, November 5, 2025
Introduces RoMA, scaling Mamba-based foundation models for remote sensing with linear complexity. Addresses scalability barriers of Vision Transformers for large models and high-resolution images in supervised tasks.
Proposes HAGI++, a diffusion-based multimodal approach for imputing and generating missing gaze data in real-world and XR environments. Addresses challenges like blinks and tracking errors, enabling better behavioral research and HCI applications.
Introduces a resource-efficient method for automatic segmentation refinement using weak supervision from light feedback. Addresses limitations of foundation models in medical imaging by improving performance with less labor-intensive annotation.
Extends the Forward-Forward (FF) algorithm for training Convolutional Neural Networks (CNNs). Proposes a biologically inspired alternative to backpropagation, enabling CNN training with locally defined goodness functions.
Proposes FreqSal, a purely Fourier Transform-based model for RGB-T salient object detection. Overcomes quadratic complexity limitations of Transformer models, enabling efficient bimodal feature fusion for high-resolution images.
Introduces a new mobile robotic system for Multi-View Photometric Stereo (MVPS) 3D acquisition. Brings MVPS to movable platforms, expanding 3D acquisition capabilities for mobile robotics applications.
Proposes CyclicPrompt, a cyclic prompting approach for Universal Adverse Weather Removal (UAWR). Enhances effectiveness, adaptability, and generalizability of weather-free image restoration using prompt learning with vision-language models.
Proposes a Geometry-aware Temporal Aggregation Network to address monocular ambiguity in 3D lane detection. Exploits temporal evolution information to improve geometric predictions and lane integrity, especially for distant lanes.
Proposes GeoSDF, a text-to-3D framework for generating 3D plane geometry diagrams using Signed Distance Fields. Addresses challenges in creating intricate structures by leveraging 3D priors and 2D diffusion models.
Proposes Crucial-Diff, a unified diffusion model for synthesizing crucial images and annotations in data-scarce scenarios. Addresses model overfitting and dataset imbalance by generating targeted training samples to improve detection and segmentation.