Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/178992
DC Field | Value
dc.title | A STUDY AND ENHANCEMENTS OF REINFORCEMENT LEARNING METHODS
dc.contributor.author | LEE CHIN YEW
dc.date.accessioned | 2020-10-22T05:31:31Z
dc.date.available | 2020-10-22T05:31:31Z
dc.date.issued | 1998
dc.identifier.citation | LEE CHIN YEW (1998). A STUDY AND ENHANCEMENTS OF REINFORCEMENT LEARNING METHODS. ScholarBank@NUS Repository.
dc.identifier.uri | https://scholarbank.nus.edu.sg/handle/10635/178992
dc.description.abstract | In recent years, Reinforcement Learning (RL) has become one of the most actively studied areas in machine learning, with widespread interest in applying its methods to difficult control, optimization and scheduling problems. Many of these solutions make use of powerful generalizing function approximators for which there are no guarantees of convergence or learning rate, yet they have proved effective in practice. Although the field of RL is well developed for discrete tabular representations, many of its methods in their current form do not extend naturally to the use of function approximators. On the other hand, solving large-scale problems tractably requires generalizing architectures. The effectiveness of combining RL with function approximators has been demonstrated in many applications. In this thesis we propose two simple ideas to be incorporated into RL methods that use function approximators. The first approach uses multi-step backups to train the evaluation function. The second investigates the usefulness of initiating training cycles in states that have incurred large learning errors. We present arguments as to how such simple ideas may improve learning times and convergence rates, and we relate them to several approaches that have been extensively studied in learning methods using discrete tabular representations. Experiments are conducted to evaluate the utility of these enhancements on three learning problems: the pole-cart task, the mountain car task, and a game of pursuit and evasion. We present better and stronger learning results on these problems using the proposed ideas. Lastly, self-training is used to train a predator-prey pair, and we present some of the interesting strategies learned by the agents.
dc.source | CCK BATCHLOAD 20201023
dc.type | Thesis
dc.contributor.department | INFORMATION SYSTEMS & COMPUTER SCIENCE
dc.contributor.supervisor | SUNG KAH KAY
dc.description.degree | Master's
dc.description.degreeconferred | MASTER OF SCIENCE
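
The abstract describes two concrete algorithmic enhancements: training the evaluation function with multi-step backups, and initiating training cycles from states that have incurred large learning errors. The thesis's own code is not available in this record, so the following is only a minimal illustrative sketch of how the two ideas could be combined with a linear function approximator on a mountain-car-style task. The toy dynamics, the polynomial feature map, and every hyperparameter (n, alpha, the error threshold, the restart probability) are illustrative assumptions, not details taken from the thesis.

```python
import numpy as np

rng = np.random.default_rng(0)
ACTIONS = (-1, 0, 1)

def features(state):
    # Hypothetical feature map: position and velocity expanded into a small
    # polynomial basis so the linear approximator can generalize.
    x, v = state
    return np.array([1.0, x, v, x * v, x * x, v * v])

def step(state, action):
    # Toy stand-in for the classic mountain car dynamics; reward is -1 per
    # step until the goal position is reached (assumed, matching the task).
    x, v = state
    v = float(np.clip(v + 0.001 * action - 0.0025 * np.cos(3 * x), -0.07, 0.07))
    x = float(np.clip(x + v, -1.2, 0.6))
    return (x, v), -1.0, x >= 0.5

def greedy_action(w, state):
    # One-step lookahead with the learned state-value function.
    def backed_up(a):
        nxt, r, _ = step(state, a)
        return r + features(nxt) @ w
    return max(ACTIONS, key=backed_up)

def train(num_episodes=200, n=4, alpha=0.01, gamma=1.0, epsilon=0.1):
    w = np.zeros(6)                      # linear value-function weights
    hot_states = []                      # states that incurred large errors
    for _ in range(num_episodes):
        # Enhancement 2: sometimes initiate the training episode from a
        # state that previously produced a large learning error.
        if hot_states and rng.random() < 0.5:
            state = hot_states[rng.integers(len(hot_states))]
        else:
            state = (-0.5, 0.0)          # default start state
        states, rewards = [state], []
        done, t = False, 0
        while not done and t < 500:
            a = greedy_action(w, state) if rng.random() > epsilon \
                else ACTIONS[rng.integers(3)]
            state, r, done = step(state, a)
            states.append(state)
            rewards.append(r)
            t += 1
            # Enhancement 1: multi-step backup. Once n transitions are
            # buffered, move V(s_tau) toward the n-step return
            # G = r_{tau+1} + ... + gamma^{n-1} r_{tau+n} + gamma^n V(s_{tau+n}).
            # (Tail updates at episode end are omitted for brevity.)
            if len(rewards) >= n:
                tau = len(rewards) - n
                G = sum(gamma ** k * rewards[tau + k] for k in range(n))
                if not done:
                    G += gamma ** n * (features(states[tau + n]) @ w)
                phi = features(states[tau])
                delta = G - phi @ w
                w += alpha * delta * phi
                if abs(delta) > 1.0:     # error threshold is an assumption
                    hot_states.append(states[tau])
        hot_states = hot_states[-100:]   # keep the pool bounded
    return w

if __name__ == "__main__":
    print(train())
```

In the tabular-representation literature that the abstract says these ideas relate to, the first enhancement parallels n-step TD and TD(lambda) backups, while the second parallels prioritized sweeping, which concentrates updates where estimated values have changed the most.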
Appears in Collections: Master's Theses (Restricted)

Files in This Item:
File | Description | Size | Format | Access Settings | Version
B22107617.PDF |  | 2.1 MB | Adobe PDF | RESTRICTED | None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.