
Towards 3D Objectness Learning in an Open World

Abstract

Recent work on 3D object detection and novel category detection has made significant progress, yet research on learning generalized 3D objectness remains insufficient. In this paper, we study open-world 3D objectness learning, which aims to detect all objects in a 3D scene, including novel objects unseen during training. Traditional closed-set 3D detectors struggle to generalize to open-world scenarios, while directly adopting 3D open-vocabulary models for open-world ability runs into problems with vocabulary expansion and semantic overlap. To achieve generalized 3D object discovery, we propose OP3Det, a class-agnostic Open-World Prompt-free 3D Detector that detects any object within a 3D scene without relying on hand-crafted text prompts. We leverage the strong generalization and zero-shot capabilities of 2D foundation models, using both 2D semantic priors and 3D geometric priors to generate class-agnostic proposals that broaden 3D object discovery. Then, by integrating complementary information from point clouds and RGB images in a cross-modal mixture of experts, OP3Det dynamically routes uni-modal and multi-modal features to learn generalized 3D objectness. Extensive experiments demonstrate the extraordinary performance of OP3Det, which surpasses existing open-world 3D detectors by up to 16.0% in AR and achieves a 13.5% improvement over closed-world 3D detectors.
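To make the routing idea concrete, here is a minimal sketch of a softmax-gated mixture over three experts (point-cloud-only, image-only, fused). It is an illustration only, not OP3Det's implementation: the abstract does not specify the gate, the number of experts, or their form, so `route`, `gate_weights`, and the trivial expert transforms below are all hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def route(point_feat, image_feat, gate_weights):
    """Blend three hypothetical experts with a softmax gate.

    point_feat, image_feat: equal-length feature vectors for one proposal.
    gate_weights: three weight vectors (one per expert), each applied to
    the concatenated [point_feat, image_feat] input.
    Returns (mixed_feature, gate_probabilities).
    """
    # Assumption: each expert is a fixed, trivial transform. A real model
    # would use learned networks here.
    expert_outputs = [
        list(point_feat),                                        # uni-modal: points
        list(image_feat),                                        # uni-modal: image
        [(p + i) / 2 for p, i in zip(point_feat, image_feat)],   # multi-modal: fused
    ]
    gate_input = list(point_feat) + list(image_feat)
    gate = softmax([dot(w, gate_input) for w in gate_weights])
    dim = len(point_feat)
    mixed = [
        sum(g * out[k] for g, out in zip(gate, expert_outputs))
        for k in range(dim)
    ]
    return mixed, gate
```

With all-zero gate weights the three experts are weighted equally; a gate that scores one expert much higher sends nearly all of the weight to it, which is the sense in which features are "dynamically routed".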
Authors: Taichi Liu, Zhenyu Wang, Ruofeng Liu, Guang Wang, Desheng Zhang
Submitted: October 20, 2025
arXiv Category: cs.CV

Key Contributions

OP3Det is a class-agnostic, prompt-free 3D detector for open-world objectness learning. It leverages 2D foundation models and integrates both 2D semantic and 3D geometric priors to achieve generalized 3D object discovery without relying on hand-crafted text prompts or predefined categories.
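One common way to combine a 2D semantic prior with 3D geometry is to lift a 2D mask into a class-agnostic 3D box: project each point into the image, keep the points that land inside the mask, and bound them. The sketch below illustrates that general technique under stated assumptions. It is not taken from the paper, which may generate proposals differently. The pinhole intrinsics (`fx`, `fy`, `cx`, `cy`), the camera-frame point format, and the function names are all hypothetical.

```python
import math


def project(point, fx, fy, cx, cy):
    """Pinhole projection of a camera-frame point (x, y, z) to pixel (u, v)."""
    x, y, z = point
    return fx * x / z + cx, fy * y / z + cy


def lift_mask_to_box(points, mask, fx, fy, cx, cy):
    """Bound the 3D points whose projection falls inside a 2D mask.

    points: iterable of (x, y, z) tuples in the camera frame.
    mask: 2D list of booleans indexed as mask[row][col], e.g. one
          class-agnostic mask from a 2D foundation model (assumed input).
    Returns (min_xyz, max_xyz) for an axis-aligned box, or None if no
    point projects into the mask.
    """
    height, width = len(mask), len(mask[0])
    selected = []
    for p in points:
        if p[2] <= 0:  # behind the camera: cannot be projected
            continue
        u, v = project(p, fx, fy, cx, cy)
        col, row = math.floor(u), math.floor(v)
        if 0 <= col < width and 0 <= row < height and mask[row][col]:
            selected.append(p)
    if not selected:
        return None
    mins = tuple(min(p[k] for p in selected) for k in range(3))
    maxs = tuple(max(p[k] for p in selected) for k in range(3))
    return mins, maxs
```

The resulting box carries no category label, which matches the class-agnostic, prompt-free setting described above. A practical pipeline would likely also filter background points that fall inside the mask, for example by depth clustering.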

Business Value

Enables robots and autonomous systems to perceive and interact with a wider range of objects in unstructured environments, improving adaptability and safety.