This work elucidates the mechanisms behind emergent exploration in unsupervised goal-conditioned RL with SGCRL. It shows that SGCRL maximizes an implicit reward shaped by its learned low-rank representations, which automatically reshape the reward landscape to promote exploration. This understanding enables safety-aware exploration.
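To make the "implicit reward" idea concrete, here is a minimal sketch, assuming a contrastive critic that factors into low-rank state and goal embeddings whose inner product serves as the learned reward signal. The names (`ContrastiveCritic`, `phi`, `psi`, `repr_dim`) are illustrative assumptions, not the authors' code.

```python
import torch
import torch.nn as nn

class ContrastiveCritic(nn.Module):
    """Sketch of a factored goal-conditioned critic: Q(s, g) ~ phi(s)^T psi(g)."""

    def __init__(self, obs_dim: int, goal_dim: int, repr_dim: int = 64):
        super().__init__()
        # phi embeds the current observation; psi embeds the goal.
        # repr_dim << obs_dim gives the low-rank structure described above.
        self.phi = nn.Sequential(nn.Linear(obs_dim, 256), nn.ReLU(),
                                 nn.Linear(256, repr_dim))
        self.psi = nn.Sequential(nn.Linear(goal_dim, 256), nn.ReLU(),
                                 nn.Linear(256, repr_dim))

    def implicit_reward(self, obs: torch.Tensor, goal: torch.Tensor) -> torch.Tensor:
        # The inner product phi(s)^T psi(g) acts as a shaped reward:
        # states whose representations align with the goal's get credit
        # before the goal is reached, which is the mechanism claimed
        # to drive exploration. Inputs are batched: (B, obs_dim), (B, goal_dim).
        return (self.phi(obs) * self.psi(goal)).sum(dim=-1)
```

In this reading, no external reward is ever specified; the policy simply climbs this learned similarity surface, and exploration emerges as the representations shift during training.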
This supports the development of more autonomous and efficient learning agents, particularly in robotics, that can explore and learn complex tasks without human supervision or explicit reward engineering, reducing development time and cost.