Reset & Distill: A Recipe for Overcoming Negative Transfer in Continual Reinforcement Learning

Abstract

We argue that negative transfer, which can occur when a new task to learn arrives, is an important problem that must not be overlooked when developing effective Continual Reinforcement Learning (CRL) algorithms. Through comprehensive experimental validation, we demonstrate that this issue arises frequently in CRL and cannot be effectively addressed by several recent works that either mitigate the plasticity loss of RL agents or enhance positive transfer in the CRL scenario. To that end, we develop Reset & Distill (R&D), a simple yet highly effective baseline method for overcoming negative transfer in CRL. R&D combines a strategy of resetting the agent's online actor and critic networks to learn a new task with an offline learning step that distills knowledge from the action probabilities of the online actor and of previous experts. We carried out extensive experiments on long sequences of Meta-World tasks and show that our simple baseline consistently outperforms recent approaches, achieving significantly higher success rates across a range of tasks. Our findings highlight the importance of considering negative transfer in CRL and emphasize the need for robust strategies like R&D to mitigate its detrimental effects.
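
The distillation step described in the abstract amounts to a supervised objective that matches the continual actor's action distribution to a stored teacher distribution. Below is a minimal PyTorch sketch of such a loss, assuming a discrete action space for simplicity; the name `distill_loss`, the tensor shapes, and the choice of forward KL are illustrative assumptions on our part, not details taken from the paper (Meta-World tasks in fact use continuous actions).

```python
import torch
import torch.nn.functional as F

def distill_loss(student_logits: torch.Tensor,
                 teacher_probs: torch.Tensor) -> torch.Tensor:
    """KL(teacher || student), averaged over a batch of states.

    student_logits: (B, A) unnormalized action scores from the continual actor.
    teacher_probs:  (B, A) action probabilities stored from the online actor
                    (current task) or from a previous expert (earlier tasks).
    """
    log_student = F.log_softmax(student_logits, dim=-1)
    # F.kl_div expects log-probabilities as input and probabilities as target.
    return F.kl_div(log_student, teacher_probs, reduction="batchmean")

# Toy usage with hypothetical shapes: 32 states, 4 discrete actions.
student_logits = torch.randn(32, 4, requires_grad=True)
teacher_probs = torch.softmax(torch.randn(32, 4), dim=-1)
distill_loss(student_logits, teacher_probs).backward()
```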

Key Contributions

Addresses the critical problem of negative transfer in Continual Reinforcement Learning (CRL) with a novel method called Reset & Distill (R&D). R&D resets the agent's online networks for each new task and then distills knowledge from the new online actor and the experts of previous tasks, effectively mitigating negative transfer while improving learning efficiency.
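
To make the two-phase recipe concrete, here is a hedged end-to-end sketch of how the reset and distill phases could alternate over a task sequence. Every name below (`run_rnd`, `make_actor`, `make_critic`, `train_online`, `collect_states`, `distill`) is a hypothetical placeholder of ours, not an API from the paper or its code release.

```python
import copy

def run_rnd(tasks, make_actor, make_critic, train_online, collect_states, distill):
    """Sketch of the Reset & Distill recipe over a sequence of tasks.

    make_actor / make_critic: factories returning freshly initialized networks.
    train_online:   trains (actor, critic) on one task with an off-the-shelf RL
                    algorithm (e.g., SAC) and returns the trained actor.
    collect_states: samples states from a task for the offline distillation step.
    distill:        supervised update matching the continual actor's action
                    distribution to a teacher's on the given states.
    """
    continual_actor = make_actor()   # kept across tasks; accumulates all skills
    experts = []                     # frozen per-task teachers for rehearsal
    for task in tasks:
        # Reset phase: fresh online networks learn the new task from scratch,
        # so stale weights from earlier tasks cannot cause negative transfer.
        online_actor = train_online(task, make_actor(), make_critic())
        # Distill phase: offline, the continual actor imitates the new online
        # actor and, to avoid forgetting, the experts of all previous tasks.
        for seen_task, teacher in experts + [(task, online_actor)]:
            distill(continual_actor, teacher, collect_states(seen_task))
        experts.append((task, copy.deepcopy(online_actor)))
    return continual_actor
```

Keeping a single continual actor while resetting only the online networks is what lets the recipe learn each new task unencumbered yet still retain earlier skills through rehearsal on the frozen experts.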

Business Value

Enables AI agents, particularly robots, to learn new skills and adapt to changing environments over time without forgetting previously acquired knowledge, leading to more versatile and adaptable autonomous systems.