Sparsely correlated hidden Markov models with application to genome-wide location studies

Please use this identifier to cite or link to this item: https://doi.org/10.1093/bioinformatics/btt012

DC Field	Value
dc.title	Sparsely correlated hidden Markov models with application to genome-wide location studies
dc.contributor.author	Choi, H.
dc.contributor.author	Fermin, D.
dc.contributor.author	Nesvizhskii, A.I.
dc.contributor.author	Ghosh, D.
dc.contributor.author	Qin, Z.S.
dc.date.accessioned	2014-11-26T05:05:08Z
dc.date.available	2014-11-26T05:05:08Z
dc.date.issued	2013-03
dc.identifier.citation	Choi, H., Fermin, D., Nesvizhskii, A.I., Ghosh, D., Qin, Z.S. (2013-03). Sparsely correlated hidden Markov models with application to genome-wide location studies. Bioinformatics 29 (5) : 533-541. ScholarBank@NUS Repository. https://doi.org/10.1093/bioinformatics/btt012
dc.identifier.issn	1367-4803
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/109060
dc.description.abstract	Motivation: Multiply correlated datasets have become increasingly common in genome-wide location analysis of regulatory proteins and epigenetic modifications. Their correlation can be directly incorporated into a statistical model to capture underlying biological interactions, but such modeling quickly becomes computationally intractable. Results: We present sparsely correlated hidden Markov models (scHMM), a novel method for performing simultaneous hidden Markov model (HMM) inference for multiple genomic datasets. In scHMM, a single HMM is assumed for each series, but the transition probability in each series depends on not only its own hidden states but also the hidden states of other related series. For each series, scHMM uses penalized regression to select a subset of the other data series and estimate their effects on the odds of each transition in the given series. Following this, hidden states are inferred using a standard forward-backward algorithm, with the transition probabilities adjusted by the model at each position, which helps retain the order of computation close to fitting independent HMMs (iHMM). Hence, scHMM is a collection of inter-dependent non-homogeneous HMMs, capable of giving a close approximation to a fully multivariate HMM fit. A simulation study shows that scHMM achieves comparable sensitivity to the multivariate HMM fit at a much lower computational cost. The method was demonstrated in the joint analysis of 39 histone modifications, CTCF and RNA polymerase II in human CD4+ T cells. scHMM reported fewer high-confidence regions than iHMM in this dataset, but scHMM could recover previously characterized histone modifications in relevant genomic regions better than iHMM. In addition, the resulting combinatorial patterns from scHMM could be better mapped to the 51 states reported by the multivariate HMM method of Ernst and Kellis. © 2013 The Author. Published by Oxford University Press. All rights reserved.
dc.source	Scopus
dc.type	Article
dc.contributor.department	SAW SWEE HOCK SCHOOL OF PUBLIC HEALTH
dc.description.doi	10.1093/bioinformatics/btt012
dc.description.sourcetitle	Bioinformatics
dc.description.volume	29
dc.description.issue	5
dc.description.page	533-541
dc.description.coden	BOINF
dc.identifier.isiut	000315623000001
Appears in Collections:	Staff Publications
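
The abstract describes hidden-state inference by a standard forward-backward algorithm in which the transition probabilities are adjusted at each position according to the hidden states of other series. The sketch below illustrates that idea only; it is not the authors' implementation. It assumes two hidden states per series, a logistic model for the transition odds, and coefficients that are already estimated (the penalized regression step is omitted). All function and parameter names (`transition_matrices`, `forward_backward`, `coef_01`, `coef_11`) are hypothetical.

```python
import numpy as np

def transition_matrices(other_states, coef_01, coef_11):
    """Build one 2x2 transition matrix per position from a logistic model.

    other_states: (T, K) array of 0/1 hidden states of K other series (assumed known).
    coef_01, coef_11: length K+1 coefficient vectors (intercept first) for the
    log-odds of entering state 1 from state 0 and from state 1, respectively.
    A sparse fit would leave most of these coefficients at zero.
    """
    T = other_states.shape[0]
    X = np.hstack([np.ones((T, 1)), other_states])  # add intercept column
    p01 = 1.0 / (1.0 + np.exp(-(X @ coef_01)))
    p11 = 1.0 / (1.0 + np.exp(-(X @ coef_11)))
    A = np.empty((T, 2, 2))
    A[:, 0, 0], A[:, 0, 1] = 1.0 - p01, p01
    A[:, 1, 0], A[:, 1, 1] = 1.0 - p11, p11
    return A

def forward_backward(init, A, lik):
    """Scaled forward-backward for a non-homogeneous HMM.

    init: (2,) initial state distribution.
    A: (T, 2, 2); A[t] governs the transition from position t-1 to t (A[0] is unused).
    lik: (T, 2) emission likelihoods of the observed data under each state.
    Returns the (T, 2) posterior state probabilities.
    """
    T, S = lik.shape
    alpha = np.empty((T, S))
    beta = np.ones((T, S))
    alpha[0] = init * lik[0]
    alpha[0] /= alpha[0].sum()
    for t in range(1, T):
        alpha[t] = (alpha[t - 1] @ A[t]) * lik[t]
        alpha[t] /= alpha[t].sum()  # rescale to avoid underflow
    for t in range(T - 2, -1, -1):
        beta[t] = A[t + 1] @ (lik[t + 1] * beta[t + 1])
        beta[t] /= beta[t].sum()
    post = alpha * beta
    return post / post.sum(axis=1, keepdims=True)
```

Because each series is processed with its own pass over T positions, the cost of this step grows linearly in the number of series, which is consistent with the abstract's remark that the order of computation stays close to that of fitting independent HMMs.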

Files in This Item:

There are no files associated with this item.