Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/244765
Title: | TOWARDS HUMAN-CENTRIC AI: INVERSE REINFORCEMENT LEARNING MEETS ALGORITHMIC FAIRNESS |
Authors: | SREEJITH BALAKRISHNAN |
ORCID iD: | orcid.org/0000-0003-0341-4834 |
Keywords: | Human-centric AI, Value Alignment, Algorithmic Fairness, Inverse Reinforcement Learning, Bayesian optimization, Reinforcement Learning |
Issue Date: | 22-Feb-2023 |
Citation: | SREEJITH BALAKRISHNAN (2023-02-22). TOWARDS HUMAN-CENTRIC AI: INVERSE REINFORCEMENT LEARNING MEETS ALGORITHMIC FAIRNESS. ScholarBank@NUS Repository. |
Abstract: | This thesis focuses on two aspects of human-centric AI: value alignment and fairness. Our first work explores inverse reinforcement learning (IRL), a potential solution to value alignment. We introduce BO-IRL, an IRL algorithm that uses a novel kernel to explore the reward function space efficiently. The second work introduces SCALES, a framework that translates various fairness principles into fair decisions by expressing them as a combination of utility, non-causal and causal components, which are in turn mapped to elements of a Constrained Markov Decision Process. Our final work unifies the concepts of value alignment and fairness by extending the IRL problem to scenarios where the expert agent is fairness-abiding. We propose FAIR-BOIRL, a new BO-based IRL algorithm that searches for solutions across both the reward function space and fairness principles. To search efficiently, FAIR-BOIRL uses IM-GPTS, a novel acquisition function based on an implicit multi-armed bandit strategy. |
URI: | https://scholarbank.nus.edu.sg/handle/10635/244765 |
Appears in Collections: | Ph.D Theses (Open) |
Files in This Item:
File | Description | Size | Format | Access Settings | Version | |
---|---|---|---|---|---|---|
BalakrishnanSreejith.pdf | | 15.65 MB | Adobe PDF | OPEN | None | View/Download |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.