Graphical models for interactive POMDPs: Representations and solutions | ScholarBank@NUS

Please use this identifier to cite or link to this item: https://doi.org/10.1007/s10458-008-9064-7

Title:	Graphical models for interactive POMDPs: Representations and solutions
Authors:	Doshi, P.; Zeng, Y.; Chen, Q.
Keywords:	Interactive POMDPs; Probabilistic graphical models; Sequential multiagent decision making
Issue Date:	2009
Citation:	Doshi, P., Zeng, Y., Chen, Q. (2009). Graphical models for interactive POMDPs: Representations and solutions. Autonomous Agents and Multi-Agent Systems 18(3): 376-416. ScholarBank@NUS Repository. https://doi.org/10.1007/s10458-008-9064-7
Abstract:	We develop new graphical representations for the problem of sequential decision making in partially observable multiagent environments, as formalized by interactive partially observable Markov decision processes (I-POMDPs). The graphical models called interactive influence diagrams (I-IDs) and their dynamic counterparts, interactive dynamic influence diagrams (I-DIDs), seek to explicitly model the structure that is often present in real-world problems by decomposing the situation into chance and decision variables, and the dependencies between the variables. I-DIDs generalize DIDs, which may be viewed as graphical representations of POMDPs, to multiagent settings in the same way that I-POMDPs generalize POMDPs. I-DIDs may be used to compute the policy of an agent given its belief as the agent acts and observes in a setting that is populated by other interacting agents. Using several examples, we show how I-IDs and I-DIDs may be applied and demonstrate their usefulness. We also show how the models may be solved using the standard algorithms that are applicable to DIDs. Solving I-DIDs exactly involves knowing the solutions of possible models of the other agents. The space of models grows exponentially with the number of time steps. We present a method of solving I-DIDs approximately by limiting the number of other agents' candidate models at each time step to a constant. We do this by clustering models that are likely to be behaviorally equivalent and selecting a representative set from the clusters. We discuss the error bound of the approximation technique and demonstrate its empirical performance. © 2008 Springer Science+Business Media, LLC.
Source Title:	Autonomous Agents and Multi-Agent Systems
URI:	http://scholarbank.nus.edu.sg/handle/10635/39805
ISSN:	1387-2532
DOI:	10.1007/s10458-008-9064-7
Appears in Collections:	Staff Publications
