Research Paper (arxiv_cv). Audience: Computer Vision Researchers, Robotics Engineers, AI Researchers, Human-Computer Interaction Specialists

ProbRes: Probabilistic Jump Diffusion for Open-World Egocentric Activity Recognition

Abstract

Open-world egocentric activity recognition poses a fundamental challenge due to its unconstrained nature, requiring models to infer unseen activities from an expansive, partially observed search space. We introduce ProbRes, a Probabilistic Residual search framework based on jump-diffusion that efficiently navigates this space by balancing prior-guided exploration with likelihood-driven exploitation. Our approach integrates structured commonsense priors to construct a semantically coherent search space, adaptively refines predictions using Vision-Language Models (VLMs), and employs a stochastic search mechanism to efficiently locate high-likelihood activity labels while minimizing exhaustive enumeration. We systematically evaluate ProbRes across multiple openness levels (L0-L3), demonstrating its adaptability to increasing search space complexity. In addition to achieving state-of-the-art performance on benchmark datasets (GTEA Gaze, GTEA Gaze+, EPIC-Kitchens, and Charades-Ego), we establish a clear taxonomy for open-world recognition, delineating the challenges and methodological advancements necessary for egocentric activity understanding. Our results highlight the importance of structured search strategies, paving the way for scalable and efficient open-world activity recognition.

Key Contributions

ProbRes introduces a probabilistic jump-diffusion search framework for open-world egocentric activity recognition. It balances prior-guided exploration with likelihood-driven exploitation, using structured commonsense priors to build the search space and VLM-based refinement to score candidates. This lets it infer unseen activities from large, partially observed search spaces without exhaustive enumeration, and it achieves state-of-the-art performance on GTEA Gaze, GTEA Gaze+, EPIC-Kitchens, and Charades-Ego.
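
To make the exploration/exploitation balance concrete, the following is a minimal Python sketch of a jump-diffusion style search over a small, discrete space of (verb, noun) labels. It is an illustration under stated assumptions, not the ProbRes implementation: the function `jump_diffusion_search`, the toy prior, and `toy_likelihood` (a stand-in for a VLM score) are hypothetical names, and the Metropolis-style acceptance rule is a simplification chosen for the example.

```python
import math
import random
from typing import Callable, Dict, List, Tuple

Label = Tuple[str, str]  # (verb, noun); a hypothetical label format


def jump_diffusion_search(
    prior: Dict[Label, float],
    likelihood: Callable[[Label], float],
    steps: int = 200,
    p_jump: float = 0.2,
    temperature: float = 0.1,
    seed: int = 0,
) -> Tuple[Label, float, int]:
    """Toy search: returns (best label, its score, number of likelihood calls)."""
    rng = random.Random(seed)
    labels: List[Label] = list(prior)
    weights = [prior[label] for label in labels]
    cache: Dict[Label, float] = {}  # each label is scored at most once

    def score(label: Label) -> float:
        if label not in cache:
            cache[label] = likelihood(label)
        return cache[label]

    current = max(labels, key=lambda label: prior[label])  # start at the prior mode
    best = current
    for _ in range(steps):
        if rng.random() < p_jump:
            # Jump: prior-guided exploration anywhere in the label space.
            proposal = rng.choices(labels, weights=weights, k=1)[0]
        else:
            # Diffusion: local move to a label sharing the verb or the noun.
            neighbours = [
                label for label in labels
                if label != current
                and (label[0] == current[0] or label[1] == current[1])
            ]
            if not neighbours:
                continue
            proposal = rng.choice(neighbours)
        # Likelihood-driven exploitation with a Metropolis-style acceptance.
        delta = score(proposal) - score(current)
        if delta >= 0 or rng.random() < math.exp(delta / temperature):
            current = proposal
        if score(current) > score(best):
            best = current
    return best, score(best), len(cache)


# Toy setup (assumed values): the prior mode is deliberately not the best label.
verbs = ["cut", "pour", "open", "wash"]
nouns = ["tomato", "milk", "fridge", "plate"]
toy_prior = {(v, n): 1.0 for v in verbs for n in nouns}
toy_prior[("pour", "milk")] = 3.0
toy_prior[("cut", "tomato")] = 2.0


def toy_likelihood(label: Label) -> float:
    # Stand-in for a VLM score: rewards matching the "true" verb and noun.
    return 0.5 * (label[0] == "cut") + 0.5 * (label[1] == "tomato")


best, best_score, n_evals = jump_diffusion_search(toy_prior, toy_likelihood)
print(best, best_score, n_evals)
```

In this sketch, `p_jump` controls how often the search leaves its current neighbourhood to sample from the prior, and the cache makes the number of likelihood calls (the expensive step when the scorer is a VLM) explicit.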

Business Value

By recognizing human activities in unconstrained, real-world environments, ProbRes enables more adaptable AI systems for applications such as assistive robotics, personalized user interfaces, and advanced surveillance.