📄 Abstract
Mobile eye tracking plays a vital role in capturing human visual attention
across both real-world and extended reality (XR) environments, making it an
essential tool for applications ranging from behavioural research to
human-computer interaction. However, missing values due to blinks, pupil
detection errors, or illumination changes pose significant challenges for
further gaze data analysis. To address this challenge, we introduce HAGI++ - a
multi-modal diffusion-based approach for gaze data imputation that, for the
first time, uses the integrated head orientation sensors to exploit the
inherent correlation between head and eye movements. HAGI++ employs a
transformer-based diffusion model to learn cross-modal dependencies between eye
and head representations and can be readily extended to incorporate additional
body movements. Extensive evaluations on the large-scale Nymeria, Ego-Exo4D,
and HOT3D datasets demonstrate that HAGI++ consistently outperforms
conventional interpolation methods and deep learning-based time-series
imputation baselines in gaze imputation. Furthermore, statistical analyses
confirm that HAGI++ produces gaze velocity distributions that closely match
actual human gaze behaviour, ensuring more realistic gaze imputations.
Moreover, by incorporating wrist motion captured from commercial wearable
devices, HAGI++ surpasses prior methods that rely on full-body motion capture
in the extreme case of 100% missing gaze data (pure gaze generation). Our
method paves the way for more complete and accurate eye gaze recordings in
real-world settings and has significant potential for enhancing gaze-based
analysis and interaction across various application domains.
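To make the imputation idea concrete, the sketch below shows one way a transformer-based conditional diffusion denoiser could fuse noisy gaze samples, a missingness mask, and head orientation, in the spirit of the approach described above. All names, dimensionalities, and architectural details here are illustrative assumptions for a minimal sketch, not the authors' implementation.

```python
import torch
import torch.nn as nn

class GazeImputationSketch(nn.Module):
    """Hypothetical sketch of a transformer-based conditional diffusion imputer.

    Observed gaze, a missingness mask, and head orientation are embedded per
    time step, fused by a transformer encoder, and the network predicts the
    diffusion noise on the gaze channel (standard epsilon-prediction setup).
    """

    def __init__(self, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.gaze_embed = nn.Linear(2 + 1, d_model)   # (gaze_x, gaze_y, mask)
        self.head_embed = nn.Linear(3, d_model)       # head orientation, e.g. yaw/pitch/roll
        self.step_embed = nn.Embedding(1000, d_model) # diffusion timestep index
        layer = nn.TransformerEncoderLayer(d_model, n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, n_layers)
        self.out = nn.Linear(d_model, 2)              # predicted noise on gaze

    def forward(self, noisy_gaze, mask, head_ori, t):
        # noisy_gaze: (B, T, 2); mask: (B, T, 1) with 1 = observed;
        # head_ori: (B, T, 3); t: (B,) diffusion step indices.
        x = self.gaze_embed(torch.cat([noisy_gaze, mask], dim=-1))
        x = x + self.head_embed(head_ori)             # fuse the head modality
        x = x + self.step_embed(t)[:, None, :]        # broadcast timestep over time
        return self.out(self.encoder(x))


# Toy usage: one denoising prediction on random data.
model = GazeImputationSketch()
B, T = 4, 128
eps_hat = model(torch.randn(B, T, 2), torch.ones(B, T, 1),
                torch.randn(B, T, 3), torch.randint(0, 1000, (B,)))
print(eps_hat.shape)  # torch.Size([4, 128, 2])
```

At sampling time, such a model would iteratively denoise only the masked gaze positions while keeping observed samples fixed, so the conditioning modalities (here, head orientation) steer the reconstruction of the missing segments.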
Key Contributions
HAGI++ introduces a novel multi-modal diffusion-based approach for gaze data imputation that leverages head orientation sensors to exploit head-eye movement correlations. The method effectively addresses missing values in eye-tracking data, can be readily extended to incorporate other body movements, and outperforms both conventional interpolation techniques and deep learning-based time-series imputation baselines.
Business Value
Improves the reliability and accuracy of eye-tracking data analysis, which is crucial for applications in XR, user experience research, and assistive technologies, leading to better insights and more responsive interfaces.