Please use this identifier to cite or link to this item: https://doi.org/10.1016/j.ympev.2019.01.012
Title: Effects of missing data and data type on phylotranscriptomic analysis of stony corals (Cnidaria: Anthozoa: Scleractinia)
Authors: QUEK ZHENG BIN, RANDOLPH; Huang Danwei
Issue Date: 22-Jan-2019
Publisher: Elsevier
Citation: QUEK ZHENG BIN, RANDOLPH, Huang Danwei (2019-01-22). Effects of missing data and data type on phylotranscriptomic analysis of stony corals (Cnidaria: Anthozoa: Scleractinia). Molecular Phylogenetics and Evolution 134 : 12-23. ScholarBank@NUS Repository. https://doi.org/10.1016/j.ympev.2019.01.012
Rights: Attribution-NonCommercial-NoDerivatives 4.0 International
Abstract: Across the tree of life, phylogenetic analysis is increasingly being performed using transcriptome data. As a result of heterogeneous gene expression within individual organisms and unequal sequencing depth between samples, coverage of homologous loci in such datasets is typically inhomogeneous. Consequently, missing data are a common feature of phylotranscriptomic inference, but their impact on phylogenetic analysis remains poorly characterised empirically. Considering the complexity of the evolutionary history of stony corals (Cnidaria: Anthozoa: Scleractinia), transcriptome data hold great promise for resolving their phylogeny, particularly if there is a good understanding of missing data and data type (either amino acid or DNA) effects. Here, we reconstructed a broad phylogenetic tree of 39 scleractinian species with 3 corallimorpharians as outgroups, including 15 transcriptomes that were newly sequenced and assembled in this study. Between 63 and 505 loci were used to analyse the scleractinian phylogeny, and we quantified differences in tree topology, tree shape, bootstrap support and effects of conflicting gene trees among datasets of varying completeness for both amino acid and DNA sequences. Even with almost 70% missing data, tree topologies appear to be mostly unaffected, although there are higher incongruence levels in the less complete datasets. Furthermore, DNA trees outperform amino acid trees in bootstrap support and robustness against incongruent loci. Overall, our findings indicate that high levels of missing data can still produce expected tree topologies, but identifying and omitting incongruent loci can lead to more consistent branch length estimates.
Source Title: Molecular Phylogenetics and Evolution
URI: https://scholarbank.nus.edu.sg/handle/10635/168214
ISSN: 1055-7903
DOI: 10.1016/j.ympev.2019.01.012
Appears in Collections: Staff Publications
Files in This Item:
File: Quek2019.pdf
Size: 744.61 kB
Format: Adobe PDF
Access Settings: OPEN
Version: Post-print

Scopus citations: 3 (checked on Jun 4, 2020)
Page view(s): 14 (checked on Jun 4, 2020)
Download(s): 1 (checked on Jun 4, 2020)

This item is licensed under a Creative Commons License.