Please use this identifier to cite or link to this item:
https://scholarbank.nus.edu.sg/handle/10635/111175
Title: | From gene trees to species trees | Authors: | Ma, B. Li, M. Zhang, L. |
Keywords: | Algorithms Gene trees NP-hardness Species trees |
Issue Date: | 2000 | Citation: | Ma, B.,Li, M.,Zhang, L. (2000). From gene trees to species trees. SIAM Journal on Computing 30 (3) : 729-752. ScholarBank@NUS Repository. | Abstract: | This paper studies various algorithmic issues in reconstructing a species tree from gene trees under the duplication and the mutation cost model. This is a fundamental problem in computational molecular biology. Our main results are as follows. 1. A linear time algorithm is presented for computing all the losses in duplications associated with the least common ancestor mapping from a gene tree to a species tree. This answers a problem raised recently by Eulenstein, Mirkin, and Vingron [J. Comput. Bio., 5 (1998), pp. 135-148]. 2. The complexity of finding an optimal species tree from gene trees is studied. The problem is proved to be NP-hard for the duplication cost and for the mutation cost. Further, the concept of reconciled trees was introduced by Goodman et al. and formalized by Page for visualizing the relationship between gene and species trees. We show that constructing an optimal reconciled tree for gene trees is also NP-hard. Finally, we consider a general reconstruction problem and show it to be NP-hard even for the well-known nearest neighbor interchange distance. 3. A new and efficiently computable metric is defined based on the duplication cost. We show that the problem of finding an optimal species tree from gene trees is NP-hard under this new metric but it can be approximated within factor 2 in polynomial time. Using this approximation result, we propose a heuristic method for finding a species tree from gene trees with uniquely labeled leaves under the duplication cost. Our experimental tests demonstrate that when the number of species is larger than 15 and gene trees are close to each other, our heuristic method is significantly better than the existing program in Page's GeneTree 1.0 that starts the search from a random tree. © 2000 Society for Industrial and Applied Mathematics. | Source Title: | SIAM Journal on Computing | URI: | http://scholarbank.nus.edu.sg/handle/10635/111175 | ISSN: | 00975397 |
Appears in Collections: | Staff Publications |
Show full item record
Files in This Item:
There are no files associated with this item.
Google ScholarTM
Check
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.