Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/246253
Title: | HIERARCHICAL REINFORCEMENT LEARNING WITH PARAMETERIZED OPTIONS FOR LONG-HORIZON ROBOTIC MANIPULATION |
Authors: | GUO CHAOQUN |
ORCID iD: | orcid.org/0009-0006-2515-8113 |
Keywords: | reinforcement learning, robotic manipulation, active demonstration, hierarchical reinforcement learning |
Issue Date: | 8-Aug-2023 |
Citation: | GUO CHAOQUN (2023-08-08). HIERARCHICAL REINFORCEMENT LEARNING WITH PARAMETERIZED OPTIONS FOR LONG-HORIZON ROBOTIC MANIPULATION. ScholarBank@NUS Repository. |
Abstract: | Hierarchical Reinforcement Learning (HRL) is a promising approach for addressing long-horizon robotic manipulation tasks with sparse rewards. In the parameterized options framework of HRL, a high-level policy selects a skill and its corresponding low-level goal parameters from a pre-trained skill library, allowing skills to be shared across tasks. However, fixed skills can lead to poor performance when they fail to generalize. This work introduces a novel hierarchical algorithm for joint training of two-level policies in the parameterized options framework under sparse reward settings. Three key contributions are made in this thesis. First, a skill library is developed using off-the-shelf RL algorithms for quick learning of simple actions, emphasizing the importance of joint policy training for skill generalization across tasks. Second, the thesis presents a novel hierarchical architecture, Hier-P-DQN, and incorporates high-level active demonstration to ensure stable learning. Lastly, staged sparse rewards and high-level hindsight experience replay (HER) are used to expedite learning. In extensive experiments, Hier-P-DQN outperforms baseline methods such as DDPG+HER and behavioral cloning in long-horizon robotic manipulation tasks with sparse rewards. It does so with significantly fewer environment interactions, requiring only 1e4 to 1.5e4 episodes, far fewer than traditional RL methods. Additionally, high-level demonstrations are easier to obtain than the demonstrations required by traditional approaches. |
URI: | https://scholarbank.nus.edu.sg/handle/10635/246253 |
Appears in Collections: | Master's Theses (Open) |
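The abstract describes a high-level policy that picks a skill together with that skill's goal parameters. The sketch below illustrates one generic way such a selection step can work, in the style of parameterized deep Q-learning: each skill proposes its own goal, a critic scores every (skill, goal) pair, and the best-scoring pair is chosen. It is not taken from the thesis; the names `Skill`, `propose_goal`, `q_value`, `select_option`, and the toy reach/push skills are hypothetical stand-ins for learned networks.

```python
import random
from dataclasses import dataclass
from typing import Callable, List, Optional, Sequence, Tuple

State = Sequence[float]
Goal = Sequence[float]


@dataclass
class Skill:
    """Hypothetical skill entry; both callables stand in for learned networks."""
    name: str
    propose_goal: Callable[[State], Goal]    # continuous goal parameters for this skill
    q_value: Callable[[State, Goal], float]  # critic score for (state, skill, goal)


def select_option(
    state: State,
    skills: List[Skill],
    epsilon: float = 0.0,
    rng: Optional[random.Random] = None,
) -> Tuple[int, Goal]:
    """Return (skill index, goal parameters) using epsilon-greedy selection."""
    rng = rng or random.Random()
    # Every skill proposes its own goal parameters for the current state.
    candidates = [skill.propose_goal(state) for skill in skills]
    if rng.random() < epsilon:
        # Explore: pick a skill uniformly at random.
        k = rng.randrange(len(skills))
    else:
        # Exploit: pick the skill whose (skill, goal) pair scores highest.
        scores = [s.q_value(state, g) for s, g in zip(skills, candidates)]
        k = max(range(len(skills)), key=scores.__getitem__)
    return k, candidates[k]


# Toy assumption: state = [gripper_x, object_x]; the scoring rules are made up.
toy_skills = [
    Skill("reach", lambda s: [s[1]], lambda s, g: abs(s[0] - g[0])),
    Skill("push", lambda s: [s[1] + 0.1], lambda s, g: 0.05),
]

if __name__ == "__main__":
    print(select_option([0.0, 0.5], toy_skills))  # gripper far from the object
    print(select_option([0.5, 0.5], toy_skills))  # gripper already at the object
```

With these toy scoring rules, the first call selects the reach skill and the second selects the push skill. In a trained system both callables would be neural networks, and the chosen goal would be handed to the low-level skill policy.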
Files in This Item:
File | Description | Size | Format | Access Settings | Version | |
---|---|---|---|---|---|---|
GuoChaoqun.pdf | | 10.41 MB | Adobe PDF | OPEN | None | View/Download |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.