arxiv_cv
Abstract: Monocular 3D lane detection aims to estimate the 3D position of lanes from
frontal-view (FV) images. However, existing methods are fundamentally
constrained by the inherent ambiguity of single-frame input, which leads to
inaccurate geometri...
#3D Computer Vision#Autonomous Driving Perception#Deep Learning#Temporal Modeling#Scene Understanding
arxiv_cv
Abstract: The rapid development of deep learning has significantly improved salient
object detection (SOD) combining both RGB and thermal (RGB-T) images. However,
existing Transformer-based RGB-T SOD models with quadratic complexity are
memory-intens...
#Salient Object Detection#Multi-modal Fusion#Computer Vision#Deep Learning#Efficient Architectures
arxiv_cv
Abstract: Mobile eye tracking plays a vital role in capturing human visual attention
across both real-world and extended reality (XR) environments, making it an
essential tool for applications ranging from behavioural research to
human-computer inter...
#Human-Computer Interaction#Computer Vision#Machine Learning#Data Imputation#Sensor Fusion
arxiv_cv
Abstract: Delineating anatomical regions is a key task in medical image analysis.
Manual segmentation achieves high accuracy but is labor-intensive and prone to
variability, thus prompting the development of automated approaches. Recently,
a breadth ...
#Medical Image Segmentation#Weakly Supervised Learning#Foundation Models in Healthcare#Image Analysis#Machine Learning
arxiv_cv
Abstract: The field of autonomous driving technology is rapidly advancing, with deep
learning being a key component. Particularly in the field of sensing, 3D point
cloud data collected by LiDAR is utilized to run deep neural network models for
3D obj...
#Computer Vision#Autonomous Driving#Edge Computing#Distributed Machine Learning#Deep Learning Optimization
arxiv_cv
Abstract: Federated learning is a renowned technique for utilizing decentralized data
while preserving privacy. However, real-world applications often face
challenges like partially labeled datasets, where only a few locations have
certain expert ann...
#Federated Learning#Medical Image Analysis#Semi-Supervised Learning#Model Compression#Privacy in AI
arxiv_cv
Abstract: Plane Geometry Diagram Synthesis has been a crucial task in computer
graphics, with applications ranging from educational tools to AI-driven
mathematical reasoning. Traditionally, we rely on manual tools (e.g.,
Matplotlib and GeoGebra) to g...
#Computer Graphics#Geometric Modeling#AI for Education#Procedural Content Generation#Mathematical Reasoning
arxiv_cv
Abstract: Universal adverse weather removal (UAWR) seeks to address various weather
degradations within a unified framework. Recent methods are inspired by prompt
learning using pre-trained vision-language models (e.g., CLIP), leveraging
degradation-...
#Image Restoration#Computer Vision#Generative AI#Prompt Engineering#Adverse Weather Effects
arxiv_cv
Abstract: Rich and accurate medical image segmentation is poised to underpin the next
generation of AI-defined clinical practice by delineating critical anatomy for
pre-operative planning, guiding real-time intra-operative navigation, and
supporting ...
#Medical Image Segmentation#Deep Learning Losses#Hierarchical Classification#Weakly Supervised Learning#Computer Vision
arxiv_cv
Abstract: Event cameras offer microsecond-level latency and robustness to motion blur,
making them ideal for understanding dynamic environments. Yet, connecting these
asynchronous streams to human language remains an open challenge. We introduce
Talk...
#Event-Based Vision#Multimodal Understanding#Language Grounding#Scene Understanding#Robotics Perception
arxiv_cv
Abstract: Recent advances in self-supervised learning for Vision Transformers (ViTs)
have fueled breakthroughs in remote sensing (RS) foundation models. However,
the quadratic complexity of self-attention poses a significant barrier to
scalability, p...
#Remote Sensing Image Analysis#Foundation Models#Self-Supervised Learning#Efficient Deep Learning Architectures#Computer Vision
arxiv_cv
Abstract: Visible-infrared person re-identification (VI-ReID) can associate
pedestrian images across visible and infrared modalities in practical
scenarios with changing background illumination. However, a substantial gap
inherently ...
#Person Re-Identification#Multimodal Learning#Representation Learning#Computer Vision#Image Generation
arxiv_cv
Abstract: Shearography is a non-destructive testing method for detecting subsurface
defects, offering high sensitivity and full-field inspection capabilities.
However, its industrial adoption remains limited due to the need for expert
interpretation....
#Computer Vision#Unsupervised Learning#Industrial Inspection#Non-Destructive Testing (NDT)#Anomaly Detection#Machine Learning for Manufacturing
arxiv_cv
Abstract: Mobile sensing systems have long faced a fundamental trade-off between
sensing quality and efficiency due to constraints in computation, power, and
other limitations. Sparse sensing, which aims to acquire and process only a
subset of sensor...
#Augmented Reality (AR)#Mobile Sensing#Foundation Models#3D Computer Vision#Efficient AI
arxiv_cl
Abstract: Code has emerged as a precise and executable medium for reasoning and action
in the agent era. Yet, progress has largely focused on language-centric tasks
such as program synthesis and debugging, leaving visual-centric coding
underexplored....
#Multimodal AI#Vision-Language Models#Code Generation#Benchmarking#Symbolic Reasoning
arxiv_cv
Abstract: Large-scale chemical reaction datasets are crucial for AI research in
chemistry. However, existing chemical reaction data often exist as images
within papers, making them not machine-readable and unusable for training
machine learning model...
#AI in Chemistry#Computer Vision#Natural Language Processing#Multimodal Learning#Information Extraction
arxiv_ml
Abstract: Existing benchmarks for multimodal learning in Earth science offer limited,
siloed coverage of Earth's spheres and their cross-sphere interactions,
typically restricting evaluation to the human-activity sphere of atmosphere and
to at most 1...
#Earth Science#Multimodal Learning#Machine Learning Benchmarking#Climate Modeling#Environmental Science
arxiv_cv
Abstract: Geo-Foundational Models (GFMs) enable fast and reliable extraction of
spatiotemporal information from satellite imagery, improving flood inundation
mapping by leveraging location and time embeddings. Despite their potential, it
remains uncl...
#Remote Sensing#Geospatial AI#Environmental Monitoring#Computer Vision#Deep Learning
arxiv_cv
Abstract: Alzheimer's disease (AD) is the most prevalent form of dementia, and its
early diagnosis is essential for slowing disease progression. Recent studies on
multimodal neuroimaging fusion using MRI and PET have achieved promising
results by int...
#Medical Imaging#Neuroscience#Machine Learning for Healthcare#Multimodal Learning#Disease Diagnosis
arxiv_ml
Abstract: The coupling signal refers to a latent physiological signal that
characterizes the transformation from cardiac electrical excitation, captured
by the electrocardiogram (ECG), to mechanical contraction, recorded by the
phonocardiogram (PCG)....
#Medical Signal Analysis#Multi-modal Fusion#Robust Machine Learning#Cardiovascular Health#Biomedical Engineering#Data-driven Diagnostics
arxiv_ml
Abstract: As a computer vision task, automatic object segmentation remains challenging
in specialized image domains without massive labeled data, such as synthetic
aperture sonar images, remote sensing, biomedical imaging, etc. In any domain,
obtaini...
#Computer Vision#Image Segmentation#Weakly Supervised Learning#Generative Models#Domain Adaptation
arxiv_cv
Abstract: Purpose: In this paper, we develop and clinically evaluate a depth-only,
markerless augmented reality (AR) registration pipeline on a head-mounted
display, and assess accuracy across small or low-curvature anatomies in
real-life operative s...
#Augmented Reality#Medical Imaging#Computer Vision#Surgical Robotics#Medical Devices#3D Registration
arxiv_cv
Abstract: Portable physiological monitoring is essential for early detection and
management of cardiovascular disease, but current methods often require
specialized equipment that limits accessibility or impose impractical postures
that patients cann...
#Physiological Monitoring#Medical Informatics#Computer Vision#Signal Processing#Dataset Creation
arxiv_cv
Abstract: With the rapid growth of the low-altitude economy, UAVs have become crucial
for measurement and tracking in patrol systems. However, in GNSS-denied areas,
satellite-based localization methods are prone to failure. This paper presents
a cros...
#Robotics#Computer Vision#Localization#Remote Sensing#Geospatial Analysis
arxiv_cv
Abstract: Sparse-voxel rasterization is a fast, differentiable alternative for
optimization-based scene reconstruction, but it tends to underfit low-frequency
content, depends on brittle pruning heuristics, and can overgrow in ways that
inflate VRAM....
#Computer Vision#3D Reconstruction#Differentiable Rendering#Scene Representation#Graphics#Neural Rendering
arxiv_cv
Abstract: Color consistency correction for color point clouds is a fundamental and
important task in 3D rendering and compression applications. Most previous
color correction methods aimed at correcting color for color images.
The purpos...
#Computer Vision#3D Graphics#Point Cloud Processing#Image Processing#Geometric Modeling
arxiv_cv
Abstract: We address the problem of image reconstruction from incomplete measurements,
encompassing both upsampling and inpainting, within a learning-based framework.
Conventional supervised approaches require fully sampled ground truth data,
while s...
#Image Reconstruction#Low-Data Learning#Signal Processing#Medical Imaging#Computer Vision
arxiv_cv
Abstract: Vision Transformers rely on fixed patch tokens that ignore the spatial and
semantic structure of images. In this work, we introduce an end-to-end
differentiable tokenizer that adapts to image content with pixel-level
granularity while remai...
#Vision Transformers#Image Representation#Tokenization#Deep Learning Architectures#Computer Vision
arxiv_cv
Abstract: We present PercHead, a method for single-image 3D head reconstruction and
semantic 3D editing - two tasks that are inherently challenging due to severe
view occlusions, weak perceptual supervision, and the ambiguity of editing in
3D space. ...
#3D Computer Vision#Generative Models#Image Reconstruction#Perceptual Learning#Human Face Modeling
arxiv_cv
Abstract: Segmental longitudinal strain (SLS) of the left ventricle (LV) is an
important prognostic indicator for evaluating regional LV dysfunction, in
particular for diagnosing and managing myocardial ischemia. Current techniques
for strain estimat...
#Medical Imaging#Cardiology#Deep Learning#Motion Analysis#Prognostic Indicators