ScholarBank@NUShttps://scholarbank.nus.edu.sgThe DSpace digital repository system captures, stores, indexes, preserves, and distributes digital research material.Sun, 20 Sep 2020 13:18:35 GMT2020-09-20T13:18:35Z50101- Structural properties of the reconciliation space and their applications in enumerating nearly-optimal reconciliations between a gene tree and a species treehttps://scholarbank.nus.edu.sg/handle/10635/104208Title: Structural properties of the reconciliation space and their applications in enumerating nearly-optimal reconciliations between a gene tree and a species tree
Authors: Wu, T.; Zhang, L.
Abstract: Introduction: A gene tree for a gene family is often discordant with the containing species tree because of its complex evolutionary course during which gene duplication, gene loss and incomplete lineage sorting events might occur. Hence, it is of great challenge to infer the containing species tree from a set of gene trees. One common approach to this inference problem is through gene tree and species tree reconciliation.Results: In this paper, we generalize the traditional least common ancestor (LCA) reconciliation to define a reconciliation between a gene tree and species tree under the tree homomorphism framework. We then study the structural properties of the space of all reconciliations between a gene tree and a species tree in terms of the gene duplication, gene loss or deep coalescence costs. As application, we show that the LCA reconciliation is the unique one that has the minimum deep coalescence cost, provide a novel characterization of the reconciliations with the optimal duplication cost, and present efficient algorithms for enumerating (nearly-)optimal reconciliations with respect to each cost.Conclusions: This work provides a new graph-theoretic framework for studying gene tree and species tree reconciliations. © 2011 Wu and Zhang; licensee BioMed Central Ltd.
Wed, 05 Oct 2011 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1042082011-10-05T00:00:00Z
- Optimal realizations of two-dimensional, totally-decomposable metricshttps://scholarbank.nus.edu.sg/handle/10635/128013Title: Optimal realizations of two-dimensional, totally-decomposable metrics
Authors: Herrmann, Sven; Koolen, Jack H.; Lesser Alic; Moulton Vincent; Wu, Taoyang
Thu, 01 Jan 2015 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1280132015-01-01T00:00:00Z
- The size of 3-compatible, weakly compatible split systemshttps://scholarbank.nus.edu.sg/handle/10635/104354Title: The size of 3-compatible, weakly compatible split systems
Authors: Grünewald, S.; Koolen, J.H.; Moulton, V.; Wu, T.
Abstract: A split system on a finite set X is a set of bipartitions of X. Weakly compatible and k-compatible (k≥1) split systems are split systems which satisfy special restrictions on all subsets of a certain fixed size. They arise in various areas of applied mathematics such as phylogenetics and multi-commodity flow theory. In this note, we show that the number of splits in a 3-compatible, weakly compatible split system on a set X of size n is linear in n. © 2012 Korean Society for Computational and Applied Mathematics.
Mon, 01 Oct 2012 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1043542012-10-01T00:00:00Z
- Obtaining splits from cut sets of tight spanshttps://scholarbank.nus.edu.sg/handle/10635/103663Title: Obtaining splits from cut sets of tight spans
Authors: Dress, A.; Moulton, V.; Spillner, A.; Wu, T.
Abstract: To any metric D on a finite set X, one can associate a metric space T(D) known as its tight span. Properties of T(D) often reveal salient properties of D. For example, cut sets of T(D), i.e., subsets of T(D) whose removal disconnect T(D), can help to identify clusters suggested by D and indicate how T(D) (and hence D) may be decomposed into simpler components. Given a bipartition or split S of X, we introduce in this paper a real-valued index ε(D,S) that comes about by considering cut sets of T(D). We also show that this index is intimately related to another, more easily computable index δ( D,S) whose definition does not directly depend on T(D). In addition, we provide an illustration for how these two new indices could help to extend and complement current distance-based methods for phylogenetic network construction such as split decomposition and NeighborNet. © © 2013 Elsevier B.V. All rights reserved.
Mon, 01 Jul 2013 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1036632013-07-01T00:00:00Z
- Degree distribution of large networks generated by the partial duplication modelhttps://scholarbank.nus.edu.sg/handle/10635/52862Title: Degree distribution of large networks generated by the partial duplication model
Authors: Li, S.; Choi, K.P.; Wu, T.
Abstract: In this paper, we present a rigorous analysis on the limiting behavior of the degree distribution of the partial duplication model, a random network growth model in the duplication and divergence family that is popular in the study of biological networks. We show that for each non-negative integer k, the expected proportion of nodes of degree k approaches a limit as the network becomes large. This fills in a gap in previous studies. In addition, we prove that p=1/2, where p is the selection probability of the model, is the phase transition for the expected proportion of isolated nodes converging to 1, and hence answer a question raised in Bebek et al. [G. Bebek, P. Berenbrink, C. Cooper, T. Friedetzky, J. Nadeau, S.C. Sahinalp, The degree distribution of the generalized duplication model, Theoret. Comput. Sci. 369 (2006) 239-249]. We also obtain asymptotic bounds on the convergence rates of degree distribution. Since the observed networks typically do not contain isolated nodes, we study the subgraph consisting of all non-isolated nodes contained in the networks generated by the partial duplication model, and show that p=1/2 is again a phase transition for the limiting behavior of its degree distribution.
Mon, 11 Mar 2013 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/528622013-03-11T00:00:00Z
- Degree distribution of large networks generated by the partial duplication modelhttps://scholarbank.nus.edu.sg/handle/10635/103118Title: Degree distribution of large networks generated by the partial duplication model
Authors: Li, S.; Choi, K.P.; Wu, T.
Abstract: In this paper, we present a rigorous analysis on the limiting behavior of the degree distribution of the partial duplication model, a random network growth model in the duplication and divergence family that is popular in the study of biological networks. We show that for each non-negative integer k, the expected proportion of nodes of degree k approaches a limit as the network becomes large. This fills in a gap in previous studies. In addition, we prove that p=1/2, where p is the selection probability of the model, is the phase transition for the expected proportion of isolated nodes converging to 1, and hence answer a question raised in Bebek et al. [G. Bebek, P. Berenbrink, C. Cooper, T. Friedetzky, J. Nadeau, S.C. Sahinalp, The degree distribution of the generalized duplication model, Theoret. Comput. Sci. 369 (2006) 239-249]. We also obtain asymptotic bounds on the convergence rates of degree distribution. Since the observed networks typically do not contain isolated nodes, we study the subgraph consisting of all non-isolated nodes contained in the networks generated by the partial duplication model, and show that p=1/2 is again a phase transition for the limiting behavior of its degree distribution.
Mon, 11 Mar 2013 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1031182013-03-11T00:00:00Z
- Injective optimal realizations of finite metric spaceshttps://scholarbank.nus.edu.sg/handle/10635/103429Title: Injective optimal realizations of finite metric spaces
Authors: Koolen, J.H.; Lesser, A.; Moulton, V.; Wu, T.
Abstract: A realization of a finite metric space (X,d) is a weighted graph (G,w) whose vertex set contains X such that the distances between the elements of X in G correspond to those given by d. Such a realization is called optimal if it has minimal total edge weight. Optimal realizations have applications in fields such as phylogenetics, psychology, compression software and internet tomography. Given an optimal realization (G,w) of (X,d), there always exist certain "proper" maps from the vertex set of G into the so-called tight span of d. In [A. Dress, Trees, tight extensions of metric spaces, and the cohomological dimension of certain groups: a note on combinatorial properties of metric spaces, Adv. Math. 53 (1984) 321402], Dress conjectured that any such map must be injective. Although this conjecture was recently disproven, in this paper we show that it is possible to characterize those optimal realizations (G,w) for which certain generalizations of proper mapsthat map the geometric realization of (G,w) into the tight span instead of its vertex setmust always be injective. We also prove that these "injective" optimal realizations always exist, and show how they may be constructed from non-injective ones. Ultimately it is hoped that these results will contribute towards developing new ways to compute optimal realizations from tight spans. © 2012 Elsevier B.V. All rights reserved.
Mon, 28 May 2012 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1034292012-05-28T00:00:00Z
- Reconstruction of network evolutionary history from extant network topology and duplication historyhttps://scholarbank.nus.edu.sg/handle/10635/53315Title: Reconstruction of network evolutionary history from extant network topology and duplication history
Authors: Li, S.; Choi, K.P.; Wu, T.; Zhang, L.
Abstract: Genome-wide protein-protein interaction (PPI) data are readily available thanks to recent breakthroughs in biotechnology. However, PPI networks of extant organisms are only snapshots of the network evolution. How to infer the whole evolution history becomes a challenging problem in computational biology. In this paper, we present a likelihood-based approach to inferring network evolution history from the topology of PPI networks and the duplication relationship among the paralogs. Simulations show that our approach outperforms the existing ones in terms of the accuracy of reconstruction. Moreover, the growth parameters of several real PPI networks estimated by our method are more consistent with the ones predicted in literature. © 2012 Springer-Verlag.
Sun, 01 Jan 2012 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/533152012-01-01T00:00:00Z
- Network model and efficient method for detecting relative duplications or horizontal gene transfershttps://scholarbank.nus.edu.sg/handle/10635/104593Title: Network model and efficient method for detecting relative duplications or horizontal gene transfers
Authors: Zhang, L.; Ng, Y.K.; Wu, T.; Zheng, Y.
Abstract: Background: Horizontal gene transfer and gene duplication are two significant forces behind genome evolution. As more and more well-supported examples of HGTs are being revealed, there is a growing awareness that HGT is more widespread than previously thought, occurring often not only within bacteria, but also between species remotely related such as bacteria and plants or plants and animals. Although a substantial number of genomic sequences are known, HGT inference remains challenging. Parsimony-based inferences of HGT events are typically NP-hard under the framework of gene tree and species tree comparison; it is even more timeconsuming if the maximum likelihood approach is used. The fact that gene tree and species tree incongruence can be further confounded by gene duplication and gene loss events motivates us to incorporate considerations for these events into our inference of HGT events. Similarly, it will be beneficial if known HGT events are considered in the inference of gene duplications and gene losses. © 2011 IEEE.
Sat, 01 Jan 2011 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1045932011-01-01T00:00:00Z
- Reconstruction of network evolutionary history from extant network topology and duplication historyhttps://scholarbank.nus.edu.sg/handle/10635/104618Title: Reconstruction of network evolutionary history from extant network topology and duplication history
Authors: Li, S.; Choi, K.P.; Wu, T.; Zhang, L.
Abstract: Genome-wide protein-protein interaction (PPI) data are readily available thanks to recent breakthroughs in biotechnology. However, PPI networks of extant organisms are only snapshots of the network evolution. How to infer the whole evolution history becomes a challenging problem in computational biology. In this paper, we present a likelihood-based approach to inferring network evolution history from the topology of PPI networks and the duplication relationship among the paralogs. Simulations show that our approach outperforms the existing ones in terms of the accuracy of reconstruction. Moreover, the growth parameters of several real PPI networks estimated by our method are more consistent with the ones predicted in literature. © 2012 Springer-Verlag.
Sun, 01 Jan 2012 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1046182012-01-01T00:00:00Z