Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/164856
Title: STATISTICAL MODELLING OF RNA-SEQ DATA AND DENSITY ESTIMATION IN FINITE MIXTURE MODELS
Authors: LIU SIYUN
Keywords: RNA-Seq data, Non-uniformity, Density estimator, Finite mixture models, Expectation-maximisation algorithm, Bayesian information criterion
Issue Date: 15-Oct-2019
Citation: LIU SIYUN (2019-10-15). STATISTICAL MODELLING OF RNA-SEQ DATA AND DENSITY ESTIMATION IN FINITE MIXTURE MODELS. ScholarBank@NUS Repository.
Abstract: In this thesis, we study two statistical research topics. First, we propose a zero-inflated mixture Poisson linear model for RNA-Seq data. It integrates, in a unified framework, the zero-inflation and mixture pattern commonly observed in read counts, together with multiple sequencing preferences that capture the non-uniformity of read counts arising from sequencing bias. Second, we propose a modified kernel density estimator for unknown component densities in finite mixture models with known mixture proportions. The observations determine the points at which Gaussian kernels are placed, as well as the point-specific bandwidths and weights associated with the kernels. The proposed estimator inherits all the properties of a probability density function and can handle observations with discrete or continuous mixture proportions. For both proposed methods, we derive fitting algorithms based on the expectation-maximisation algorithm and procedures based on the Bayesian information criterion to choose the number of mixture components. Results from simulation studies and real data analysis show promising performance.
URI: https://scholarbank.nus.edu.sg/handle/10635/164856
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File: LiuS.pdf
Size: 4.88 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.