Please use this identifier to cite or link to this item: https://doi.org/10.1186/1471-2105-9-550
DC Field: Value
dc.title: Pathological rate matrices: From primates to pathogens
dc.contributor.author: Schranz, H.W.
dc.contributor.author: Yap, B.V.
dc.contributor.author: Easteal, S.
dc.contributor.author: Knight, R.
dc.contributor.author: Huttley, G.A.
dc.date.accessioned: 2014-10-28T05:14:14Z
dc.date.available: 2014-10-28T05:14:14Z
dc.date.issued: 2008-12-19
dc.identifier.citation: Schranz, H.W., Yap, B.V., Easteal, S., Knight, R., Huttley, G.A. (2008-12-19). Pathological rate matrices: From primates to pathogens. BMC Bioinformatics 9 : -. ScholarBank@NUS Repository. https://doi.org/10.1186/1471-2105-9-550
dc.identifier.issn: 14712105
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/105295
dc.description.abstract: Background: Continuous-time Markov models allow flexible, parametrically succinct descriptions of sequence divergence. Non-reversible forms of these models are more biologically realistic but are challenging to develop. The instantaneous rate matrices defined for these models are typically transformed into substitution probability matrices using a matrix exponentiation algorithm that employs eigendecomposition, but this algorithm has characteristic vulnerabilities that lead to significant errors when a rate matrix possesses certain 'pathological' properties. Here we tested whether pathological rate matrices exist in nature, and consider the suitability of different algorithms to their computation. Results: We used concatenated protein coding gene alignments from microbial genomes, primate genomes and independent intron alignments from primate genomes. The Taylor series expansion and eigendecomposition matrix exponentiation algorithms were compared to the less widely employed, but more robust, Padé with scaling and squaring algorithm for nucleotide, dinucleotide, codon and trinucleotide rate matrices. Pathological dinucleotide and trinucleotide matrices were evident in the microbial data set, affecting the eigendecomposition and Taylor algorithms respectively. Even using a conservative estimate of matrix error (occurrence of an invalid probability), both Taylor and eigendecomposition algorithms exhibited substantial error rates: ∼100% of all exonic trinucleotide matrices were pathological to the Taylor algorithm while ∼10% of codon positions 1 and 2 dinucleotide matrices and intronic trinucleotide matrices, and ∼30% of codon matrices were pathological to eigendecomposition. The majority of Taylor algorithm errors derived from occurrence of multiple unobserved states. A small number of negative probabilities were detected from the Padé algorithm on trinucleotide matrices that were attributable to machine precision. Although the Padé algorithm does not facilitate caching of intermediate results, it was up to 3× faster than eigendecomposition on the same matrices. Conclusion: Development of robust software for computing non-reversible dinucleotide, codon and higher evolutionary models requires implementation of the Padé with scaling and squaring algorithm. © 2008 Schranz et al; licensee BioMed Central Ltd.
dc.description.uri: http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1186/1471-2105-9-550
dc.source: Scopus
dc.type: Article
dc.contributor.department: STATISTICS & APPLIED PROBABILITY
dc.description.doi: 10.1186/1471-2105-9-550
dc.description.sourcetitle: BMC Bioinformatics
dc.description.volume: 9
dc.description.page: -
dc.description.coden: BBMIC
dc.identifier.isiut: 000263974000002
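
Illustrative note: the abstract compares three ways of turning a rate matrix Q into a substitution probability matrix P(t) = exp(Qt), and counts a result as erroneous when it contains an invalid probability. The sketch below is not the authors' software. It is a minimal NumPy illustration under stated assumptions: the 4×4 matrix `Q` is made up for the example, the function names are hypothetical, the Padé routine is the textbook diagonal Padé approximant with scaling and squaring at a fixed order of 6, and the tolerance in the validity check is an arbitrary choice.

```python
import numpy as np


def expm_eig(Q, t=1.0):
    """exp(Q t) via eigendecomposition: V diag(exp(lambda t)) V^-1."""
    w, V = np.linalg.eig(Q)
    P = V @ np.diag(np.exp(w * t)) @ np.linalg.inv(V)
    # Eigenvalues of a non-symmetric Q may be complex; the product is real
    # up to rounding, so only the real part is kept.
    return P.real


def expm_taylor(Q, t=1.0, terms=30):
    """exp(Q t) via a truncated Taylor series, without any scaling."""
    A = Q * t
    term = np.eye(A.shape[0])
    total = term.copy()
    for k in range(1, terms + 1):
        term = term @ A / k  # A^k / k!
        total = total + term
    return total


def expm_pade(Q, t=1.0, q=6):
    """exp(Q t) via a diagonal [q/q] Pade approximant with scaling and squaring."""
    A = Q * t
    norm = np.linalg.norm(A, np.inf)
    # Scale A by 2^j so that its infinity norm is below 1.
    j = max(0, 1 + int(np.floor(np.log2(norm)))) if norm > 0 else 0
    A = A / 2.0 ** j
    n = A.shape[0]
    power = np.eye(n)
    N = np.eye(n)  # numerator polynomial
    D = np.eye(n)  # denominator polynomial
    c = 1.0
    for k in range(1, q + 1):
        c *= (q - k + 1) / ((2 * q - k + 1) * k)
        power = power @ A
        N = N + c * power
        D = D + (-1) ** k * c * power
    F = np.linalg.solve(D, N)
    for _ in range(j):  # undo the scaling by repeated squaring
        F = F @ F
    return F


def has_invalid_probability(P, tol=1e-12):
    """True if any entry lies outside [0, 1] by more than tol (tol is an assumption)."""
    return bool(np.any(P < -tol) or np.any(P > 1 + tol))


# Made-up nucleotide rate matrix: off-diagonals >= 0, each row sums to zero.
Q = np.array([
    [-0.9, 0.2, 0.5, 0.2],
    [0.1, -0.6, 0.1, 0.4],
    [0.3, 0.1, -0.7, 0.3],
    [0.2, 0.6, 0.1, -0.9],
])

P_eig = expm_eig(Q, t=0.5)
P_taylor = expm_taylor(Q, t=0.5)
P_pade = expm_pade(Q, t=0.5)

for name, P in [("eig", P_eig), ("taylor", P_taylor), ("pade", P_pade)]:
    print(name, "invalid probability:", has_invalid_probability(P))
```

On a small, well-conditioned matrix like this one the three methods should agree closely. The pathological cases described in the abstract arise with larger dinucleotide, codon and trinucleotide matrices estimated from real alignments, which this toy example does not reproduce.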
Appears in Collections: Staff Publications
Files in This Item:
File: 2008-pathological_rate_matrices_primates_pathogens-pub.pdf
Size: 625.97 kB
Format: Adobe PDF
Access Settings: OPEN
Version: Published
