Title: Stochastic optimal control and performance analysis of wireless ad hoc networks
Keywords: Optimal control, Markov Decision Process, Neuro-Dynamic Programming, Reinforcement Learning, Performance Analysis, Stability, Wireless Ad Hoc Networks
Issue Date: 17-Aug-2007
Citation: Daniel Balaguer Yagan (2007-08-17). Stochastic optimal control and performance analysis of wireless ad hoc networks. ScholarBank@NUS Repository.
Abstract: In this thesis, we consider a stochastic optimal control approach for resource allocation and provisioning in wireless ad hoc networks. Specifically, we study the problem of multi-class network scheduling under a time-varying channel and topology. We formulate the problem using the decision-theoretic framework known as the Markov Decision Process (MDP), and present four variants of MDP formulations to highlight the results and contributions.

The first model uses the theory of ψ-irreducibility for controlled Markov chains, with the objective of minimizing the average congestion level. Using stability concepts for ψ-irreducible chains, we present the first method of achieving optimization and stability conditions simultaneously for a general Markov queuing network, and of deriving performance bounds directly from the control algorithm. The second model formulates a Semi-Markov Decision Process (SMDP) in which each agent performs network-level bandwidth allocation and buffer management; we then use a model-free approach known as Neuro-Dynamic Programming, which applies simulation-based techniques to find near-optimal policies. The third model uses the Hierarchical Semi-Markov Decision Process (HSMDP) framework, which decomposes the original SMDP provisioning problem into a task structure; a model-free Hierarchical Reinforcement Learning (HRL) technique is then used to achieve faster convergence and better performance by reusing subtask solutions. Finally, we formulate the scheduling problem as a Decentralized Partially Observable Markov Decision Process (DEC-POMDP), a multi-agent extension of the MDP for decentralized control in which each agent observes a different partial view of the global network condition. We also address the issue of locality of interaction among neighboring nodes and apply a model-free solution.
In addition, we present the first method of analyzing the stability and optimization of a decentralized Markov network and derive performance bounds as the algorithm converges.
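To make the model-free, simulation-based approach described above concrete, here is a minimal sketch (not taken from the thesis) of tabular Q-learning for a toy two-queue scheduler: the state is the pair of queue backlogs, the action picks which queue to serve, and the reward is the negative total backlog as a congestion measure. All names, parameters, and the arrival model are hypothetical choices for illustration only.

```python
import random

CAP = 5            # maximum queue length per class (assumed for the sketch)
P_ARRIVAL = 0.4    # per-slot arrival probability for each queue (assumed)
ALPHA, GAMMA, EPS = 0.1, 0.95, 0.1  # learning rate, discount, exploration

def step(state, action):
    """Serve queue `action`, then admit random arrivals.
    Reward is the negative total backlog (a congestion measure)."""
    q = list(state)
    if q[action] > 0:
        q[action] -= 1
    for i in range(2):
        if random.random() < P_ARRIVAL:
            q[i] = min(CAP, q[i] + 1)
    return tuple(q), -(q[0] + q[1])

def train(steps=20000, seed=0):
    """Simulation-based (model-free) learning: no transition model is
    used, only sampled (state, action, reward, next-state) tuples."""
    random.seed(seed)
    Q = {}             # (state, action) -> value, filled lazily
    state = (0, 0)
    for _ in range(steps):
        # epsilon-greedy action selection over the two queues
        if random.random() < EPS:
            a = random.randrange(2)
        else:
            a = max(range(2), key=lambda x: Q.get((state, x), 0.0))
        nxt, r = step(state, a)
        best_next = max(Q.get((nxt, x), 0.0) for x in range(2))
        old = Q.get((state, a), 0.0)
        Q[(state, a)] = old + ALPHA * (r + GAMMA * best_next - old)
        state = nxt
    return Q

Q = train()
# Greedy action at a sample imbalanced state (queue 0 long, queue 1 empty).
policy = max(range(2), key=lambda a: Q.get(((4, 0), a), 0.0))
```

A full approximation architecture (a neural network in place of the table) and the semi-Markov and decentralized extensions the abstract describes would build on this same sampled-update loop.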
Appears in Collections:Ph.D Theses (Open)

Files in This Item: print-finalthesisJuly2007.pdf (1.67 MB, Adobe PDF)