Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/182565
Title: REVISED APPROACH FOR RISK-AVERSE MULTI-ARMED BANDITS UNDER CVAR CRITERIA
Authors: NAJAKORN KHAJONCHOTPANYA
Keywords: Multi-armed bandits, Online learning, Upper confidence bound, Risk awareness, Risk aversion, Conditional value at risk
Issue Date: 8-Jul-2020
Citation: NAJAKORN KHAJONCHOTPANYA (2020-07-08). REVISED APPROACH FOR RISK-AVERSE MULTI-ARMED BANDITS UNDER CVAR CRITERIA. ScholarBank@NUS Repository.
Abstract: Multi-armed bandits (MAB) is a well-known online learning framework for balancing the trade-off between exploration and exploitation inherent in sequential decision problems. In the classical MAB setting, performance is measured by the sample mean of the actualised rewards, which is a risk-neutral objective. However, in various applications, e.g., clinical trials and finance, a risk-sensitive objective is more desirable. Thus, this thesis incorporates conditional value at risk (CVaR), a widely used risk measure, into the MAB problem. In particular, it proposes a new variant of the upper confidence bound algorithm and establishes its regret bounds with respect to the different regret notions proposed in the risk-averse MAB literature. Finally, it presents a theoretical analysis and a numerical experiment comparing the proposed algorithm’s performance with that of other state-of-the-art algorithms, and concludes that the proposed algorithm performs competitively against them.
URI: https://scholarbank.nus.edu.sg/handle/10635/182565
Appears in Collections: Master's Theses (Open)

Files in This Item:
File: KhajonchotpanyaN.pdf | Size: 713.43 kB | Format: Adobe PDF | Access Settings: OPEN

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.