Please use this identifier to cite or link to this item: https://doi.org/10.1186/1471-2105-11-152
Title: Alignment and clustering of phylogenetic markers - implications for microbial diversity studies
Authors: White, J.R
Navlakha, S
Nagarajan, N 
Ghodsi, M
Kingsford, C
Pop, M
Keywords: Algorithm parameters
Bacterial community
Critical assessment
High-throughput sequencing
Microbial diversity
Operational taxonomic units
Semi-supervised Clustering
Taxonomic composition
Clustering algorithms
RNA
Bacteria (microorganisms)
OTUs
bacterial DNA
bacterial RNA
RNA 16S
article
cluster analysis
genetic marker
genetic variability
genetics
phylogeny
sequence alignment
Cluster Analysis
DNA, Bacterial
Genetic Markers
Genetic Variation
Phylogeny
RNA, Bacterial
RNA, Ribosomal, 16S
Sequence Alignment
Issue Date: 2010
Citation: White, J.R, Navlakha, S, Nagarajan, N, Ghodsi, M, Kingsford, C, Pop, M (2010). Alignment and clustering of phylogenetic markers - implications for microbial diversity studies. BMC Bioinformatics 11 : 152. ScholarBank@NUS Repository. https://doi.org/10.1186/1471-2105-11-152
Rights: Attribution 4.0 International
Abstract: Background: Molecular studies of microbial diversity have provided many insights into the bacterial communities inhabiting the human body and the environment. A common first step in such studies is a survey of conserved marker genes (primarily 16S rRNA) to characterize the taxonomic composition and diversity of these communities. To date, however, there exists significant variability in analysis methods employed in these studies. Results: Here we provide a critical assessment of current analysis methodologies that cluster sequences into operational taxonomic units (OTUs) and demonstrate that small changes in algorithm parameters can lead to significantly varying results. Our analysis provides strong evidence that the species-level diversity estimates produced using common OTU methodologies are inflated due to overly stringent parameter choices. We further describe an example of how semi-supervised clustering can produce OTUs that are more robust to changes in algorithm parameters. Conclusions: Our results highlight the need for systematic and open evaluation of data analysis methodologies, especially as targeted 16S rRNA diversity studies are increasingly relying on high-throughput sequencing technologies. All data and results from our study are available through the JGI FAMeS website http://fames.jgi-psf.org/. © 2010 White et al; licensee BioMed Central Ltd.
Source Title: BMC Bioinformatics
URI: https://scholarbank.nus.edu.sg/handle/10635/181675
ISSN: 14712105
DOI: 10.1186/1471-2105-11-152
Appears in Collections: Elements
Staff Publications

Files in This Item:
File: 10_1186_1471-2105-11-152.pdf
Size: 2.07 MB
Format: Adobe PDF
Access Settings: OPEN
Version: None

This item is licensed under a Creative Commons License.