Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/246253
Title: HIERARCHICAL REINFORCEMENT LEARNING WITH PARAMETERIZED OPTIONS FOR LONG-HORIZON ROBOTIC MANIPULATION
Authors: GUO CHAOQUN
ORCID iD: orcid.org/0009-0006-2515-8113
Keywords: reinforcement learning, robotic manipulation, active demonstration, hierarchical reinforcement learning
Issue Date: 8-Aug-2023
Citation: GUO CHAOQUN (2023-08-08). HIERARCHICAL REINFORCEMENT LEARNING WITH PARAMETERIZED OPTIONS FOR LONG-HORIZON ROBOTIC MANIPULATION. ScholarBank@NUS Repository.
Abstract: Hierarchical Reinforcement Learning (HRL) is a promising approach for addressing long-horizon robotic manipulation tasks with sparse rewards. In the parameterized options framework of HRL, a high-level policy selects a skill and its corresponding low-level goal parameters from a pre-trained skill library, allowing skills to be shared across tasks. However, fixed skills can lead to poor performance when they fail to generalize. This work introduces a novel hierarchical algorithm for joint training of two-level policies in the parameterized options framework under sparse reward settings. Three key contributions are made in this thesis. First, a skill library is developed using off-the-shelf RL algorithms for quick learning of simple actions, emphasizing the importance of joint policy training for skill generalization across tasks. Second, the thesis presents a novel hierarchical architecture, Hier-P-DQN, and incorporates high-level active demonstration to ensure stable learning. Lastly, staged sparse rewards and high-level hindsight experience replay (HER) are used to expedite learning. In extensive experiments, Hier-P-DQN outperforms baseline methods such as DDPG+HER and behavioral cloning on long-horizon robotic manipulation tasks with sparse rewards. It does so with significantly fewer environment interactions, requiring only 1e4 to 1.5e4 episodes, far fewer than traditional RL methods. Additionally, high-level demonstrations are easier to obtain than those required by traditional approaches.
URI: https://scholarbank.nus.edu.sg/handle/10635/246253
Appears in Collections: Master's Theses (Open)

Files in This Item:
File: GuoChaoqun.pdf
Size: 10.41 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.