Please use this identifier to cite or link to this item:
https://doi.org/10.1093/bioinformatics/btq492
DC Field | Value |
---|---|
dc.title | A study of the efficiency of pooling in haplotype estimation |
dc.contributor.author | Kuk, A.Y.C. |
dc.contributor.author | Xu, J. |
dc.contributor.author | Yang, Y. |
dc.date.accessioned | 2014-10-28T05:09:49Z |
dc.date.available | 2014-10-28T05:09:49Z |
dc.date.issued | 2010-08-27 |
dc.identifier.citation | Kuk, A.Y.C., Xu, J., Yang, Y. (2010-08-27). A study of the efficiency of pooling in haplotype estimation. Bioinformatics 26 (20) : 2556-2563. ScholarBank@NUS Repository. https://doi.org/10.1093/bioinformatics/btq492 |
dc.identifier.issn | 1367-4803 |
dc.identifier.uri | http://scholarbank.nus.edu.sg/handle/10635/104979 |
dc.description.abstract | Motivation: It has been claimed in the literature that pooling DNA samples is efficient in estimating haplotype frequencies. There is, however, no theoretical justification based on calculation of statistical efficiency. In fact, the limited evidence given so far is based on simulation studies with small numbers of loci. With rapid advance in technology, it is of interest to see if pooling is still efficient when the number of loci increases. Methods: Instead of resorting to simulation studies, we make use of asymptotic statistical theory to perform exact calculation of the efficiency of pooling relative to no pooling in the estimation of haplotype frequencies. As an intermediate step, we use the log-linear formulation of the haplotype probabilities and derive the asymptotic variance-covariance matrix of the maximum likelihood estimators of the canonical parameters of the log-linear model. Results: Based on our calculations under linkage equilibrium, pooling can suffer huge loss in efficiency relative to no pooling when there are more than three independent loci and the alleles are not rare. Pooling works better for rare alleles. In particular, if all the minor allele frequencies are 0.05, pooling maintains an advantage over no pooling until the number of independent loci reaches 6. High linkage disequilibrium effectively reduces the number of independent loci by ruling out certain haplotypes from occurring. Similar calculations of efficiency for the case of no pooling justify the common belief that it is not worthwhile to use molecular methods to resolve the phase ambiguity of individual genotype data. © The Author 2010. Published by Oxford University Press. All rights reserved. |
dc.source | Scopus |
dc.type | Article |
dc.contributor.department | STATISTICS & APPLIED PROBABILITY |
dc.description.doi | 10.1093/bioinformatics/btq492 |
dc.description.sourcetitle | Bioinformatics |
dc.description.volume | 26 |
dc.description.issue | 20 |
dc.description.page | 2556-2563 |
dc.description.coden | BOINF |
dc.identifier.isiut | 000282749700008 |
Appears in Collections: | Staff Publications |