Planning how to learn | ScholarBank@NUS

Please use this identifier to cite or link to this item: https://doi.org/10.1109/ICRA.2013.6630972

Title:	Planning how to learn
Authors:	Bai, H.; Hsu, D.; Lee, W.S.
Issue Date:	2013
Citation:	Bai, H., Hsu, D., Lee, W.S. (2013). Planning how to learn. Proceedings - IEEE International Conference on Robotics and Automation: 2853-2859. ScholarBank@NUS Repository. https://doi.org/10.1109/ICRA.2013.6630972
Abstract:	When a robot uses an imperfect system model to plan its actions, a key challenge is the exploration-exploitation trade-off between two sometimes conflicting objectives: (i) learning and improving the model, and (ii) immediate progress towards the goal, according to the current model. To address model uncertainty systematically, we propose to use Bayesian reinforcement learning and cast it as a partially observable Markov decision process (POMDP). We present a simple algorithm for offline POMDP planning in the continuous state space. Offline planning produces a POMDP policy, which can be executed efficiently online as a finite-state controller. This approach seamlessly integrates planning and learning: it incorporates learning objectives in the computed plan, which then enables the robot to learn nearly optimally online and reach the goal. We evaluated the approach in simulations on two distinct tasks, acrobot swing-up and autonomous vehicle navigation amidst pedestrians, and obtained interesting preliminary results. © 2013 IEEE.
Source Title:	Proceedings - IEEE International Conference on Robotics and Automation
URI:	http://scholarbank.nus.edu.sg/handle/10635/78288
ISBN:	9781467356411
ISSN:	1050-4729
DOI:	10.1109/ICRA.2013.6630972
Appears in Collections:	Staff Publications
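
The abstract treats model uncertainty as hidden state: the robot keeps a belief over the unknown model and updates it as it observes the effects of its actions. The following is a minimal sketch of that belief update only, not the authors' algorithm. It assumes a hypothetical one-dimensional system x' = x + theta * a + noise with Gaussian noise, and a small discrete set of candidate values for the unknown parameter theta; the names `update_belief`, `gaussian_pdf`, and `noise_std` are illustrative and do not come from the paper.

```python
import math


def gaussian_pdf(x, mean, std):
    """Density of a normal distribution with the given mean and std at x."""
    z = (x - mean) / std
    return math.exp(-0.5 * z * z) / (std * math.sqrt(2.0 * math.pi))


def update_belief(belief, x, action, x_next, noise_std=0.1):
    """Bayes update of a discrete belief over the unknown parameter theta.

    belief maps each candidate theta to its prior probability.
    Assumed (hypothetical) dynamics: x_next = x + theta * action + noise.
    """
    posterior = {}
    for theta, prior in belief.items():
        predicted = x + theta * action
        # Weight each candidate by how well it explains the observed transition.
        posterior[theta] = prior * gaussian_pdf(x_next, predicted, noise_std)
    total = sum(posterior.values())
    if total == 0.0:
        # No candidate explains the observation; keep the prior unchanged.
        return dict(belief)
    return {theta: weight / total for theta, weight in posterior.items()}


# Start from a uniform belief over three candidate models.
belief = {0.5: 1 / 3, 1.0: 1 / 3, 1.5: 1 / 3}
# Observe one transition and update the belief.
belief = update_belief(belief, x=0.0, action=1.0, x_next=1.02)
most_likely_theta = max(belief, key=belief.get)
```

In a Bayesian reinforcement learning formulation of this kind, a planner would choose actions based on the whole belief rather than on `most_likely_theta` alone, which is what lets the plan account for the value of learning.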

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.