This paper presents online Video Depth Anything (oVDA), which enables temporally consistent depth prediction from monocular video in an online setting with low memory consumption. It adapts techniques from large language model (LLM) inference, such as latent feature caching and frame masking, to overcome the batch-processing limitations of previous methods, making it suitable for edge devices.
This enables real-time, accurate depth perception on resource-constrained devices such as drones, mobile robots, and AR/VR headsets, significantly expanding the possibilities for mobile AI applications.
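The core idea of latent feature caching can be illustrated with a minimal sketch: instead of processing a whole clip as a batch, each incoming frame attends to a bounded cache of latent features from recent frames, keeping memory constant over arbitrarily long streams. The class and shapes below are hypothetical illustrations, not the paper's actual implementation.

```python
from collections import deque
import numpy as np

class LatentFeatureCache:
    """Bounded cache of per-frame latent features (illustrative sketch,
    not the oVDA implementation). Old frames are evicted automatically,
    so memory stays constant for streams of any length."""

    def __init__(self, max_frames: int, feat_dim: int):
        self.cache = deque(maxlen=max_frames)  # evicts oldest frame first
        self.feat_dim = feat_dim

    def add(self, feats: np.ndarray) -> None:
        assert feats.shape[-1] == self.feat_dim
        self.cache.append(feats)

    def context(self) -> np.ndarray:
        """Concatenated cached features, used as temporal context
        for the current frame's prediction."""
        if not self.cache:
            return np.empty((0, self.feat_dim))
        return np.concatenate(list(self.cache), axis=0)

# Online loop: one frame at a time, context from previous frames only.
cache = LatentFeatureCache(max_frames=4, feat_dim=8)
for t in range(10):
    frame_feats = np.random.rand(16, 8)  # e.g. 16 spatial tokens per frame
    ctx = cache.context()                # temporal context for this frame
    cache.add(frame_feats)

print(cache.context().shape)  # bounded at max_frames * tokens: (64, 8)
```

After ten frames the cache still holds only the last four, so the context size, and therefore the memory footprint, is independent of video length.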