Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/246253
DC Field	Value
dc.title	HIERARCHICAL REINFORCEMENT LEARNING WITH PARAMETERIZED OPTIONS FOR LONG-HORIZON ROBOTIC MANIPULATION
dc.contributor.author	GUO CHAOQUN
dc.date.accessioned	2023-11-30T18:00:35Z
dc.date.available	2023-11-30T18:00:35Z
dc.date.issued	2023-08-08
dc.identifier.citation	GUO CHAOQUN (2023-08-08). HIERARCHICAL REINFORCEMENT LEARNING WITH PARAMETERIZED OPTIONS FOR LONG-HORIZON ROBOTIC MANIPULATION. ScholarBank@NUS Repository.
dc.identifier.uri	https://scholarbank.nus.edu.sg/handle/10635/246253
dc.description.abstract	Hierarchical Reinforcement Learning (HRL) is a promising approach for addressing long-horizon robotic manipulation tasks with sparse rewards. In the parameterized options framework of HRL, a high-level policy selects a skill and its corresponding low-level goal parameters from a pre-trained skill library, allowing skills to be shared across tasks. However, fixed skills can lead to poor performance when they fail to generalize. This work introduces a novel hierarchical algorithm for jointly training the two levels of policy in the parameterized options framework under sparse-reward settings. This thesis makes three key contributions. First, a skill library is developed using off-the-shelf RL algorithms for quick learning of simple actions, highlighting the importance of joint policy training for skill generalization across tasks. Second, the thesis presents a novel hierarchical architecture, Hier-P-DQN, and incorporates high-level active demonstration to ensure stable learning. Lastly, staged sparse rewards and high-level hindsight experience replay (HER) are used to expedite learning. In extensive experiments, Hier-P-DQN outperforms baseline methods such as DDPG+HER and behavioral cloning on long-horizon robotic manipulation tasks with sparse rewards, achieving strong performance with significantly fewer environment interactions: only 1e4 to 1.5e4 episodes, far fewer than traditional RL methods require. Additionally, high-level demonstrations are easier to obtain than the demonstrations required by traditional approaches.
dc.language.iso	en
dc.subject	reinforcement learning, robotic manipulation, active demonstration, hierarchical reinforcement learning
dc.type	Thesis
dc.contributor.department	MECHANICAL ENGINEERING
dc.contributor.supervisor	Chee Meng Chew
dc.description.degree	Master's
dc.description.degreeconferred	MASTER OF ENGINEERING (CDE)
dc.identifier.orcid	0009-0006-2515-8113
Appears in Collections: Master's Theses (Open)
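
The abstract describes a P-DQN-style hierarchy: a Q-network scores each discrete skill in the library, while an actor proposes the continuous goal parameters that the chosen skill is executed with. The sketch below illustrates only that parameterized-options interface; every name, network size, and dimension in it is an illustrative assumption, not the thesis's actual implementation.

import torch
import torch.nn as nn

# Assumed sizes for illustration only; not taken from the thesis.
STATE_DIM, NUM_SKILLS, PARAM_DIM = 32, 4, 3

class ParamActor(nn.Module):
    """Maps a state to one continuous goal-parameter vector per skill."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM, 128), nn.ReLU(),
            nn.Linear(128, NUM_SKILLS * PARAM_DIM), nn.Tanh(),
        )

    def forward(self, state):
        return self.net(state).view(-1, NUM_SKILLS, PARAM_DIM)

class SkillQNet(nn.Module):
    """Scores (state, all skills' parameters) -> one Q-value per skill."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(STATE_DIM + NUM_SKILLS * PARAM_DIM, 128), nn.ReLU(),
            nn.Linear(128, NUM_SKILLS),
        )

    def forward(self, state, params):
        flat = params.view(params.size(0), -1)
        return self.net(torch.cat([state, flat], dim=-1))

def select_option(state, actor, qnet):
    """Greedy high-level action: a discrete skill index plus the
    continuous goal parameters to run that skill with."""
    with torch.no_grad():
        params = actor(state)                   # (B, NUM_SKILLS, PARAM_DIM)
        q_values = qnet(state, params)          # (B, NUM_SKILLS)
        skill = q_values.argmax(dim=-1)         # discrete skill choice
        goal = params[torch.arange(state.size(0)), skill]
    return skill, goal

if __name__ == "__main__":
    state = torch.randn(1, STATE_DIM)
    skill, goal = select_option(state, ParamActor(), SkillQNet())
    print(skill.item(), goal)

In a full training loop, the selected (skill, goal) pair would be handed to the corresponding pre-trained low-level skill policy for several environment steps, with the high-level transitions relabeled by HER under the sparse reward; that loop is omitted from this sketch.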

Files in This Item:
File	Description	Size	Format	Access Settings	Version
GuoChaoqun.pdf		10.41 MB	Adobe PDF	OPEN	None


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.