ScholarBank@NUShttps://scholarbank.nus.edu.sgThe DSpace digital repository system captures, stores, indexes, preserves, and distributes digital research material.Sun, 21 Jul 2024 22:59:13 GMT2024-07-21T22:59:13Z50171- Inferring phylogenetic relationships avoiding Forbidden rooted tripletshttps://scholarbank.nus.edu.sg/handle/10635/41176Title: Inferring phylogenetic relationships avoiding Forbidden rooted triplets
Authors: He, Y.-J.; Huynh, T.N.D.; Jansson, J.; Sung, W.-K.
Abstract: To construct a phylogenetic tree or phylogenetic network for describing the evolutionary history of a set of species is a well-studied problem in computational biology. One previously proposed method to infer a phylogenetic tree/network for a large set of species is by merging a collection of known smaller phylogenetic trees on overlapping sets of species so that no (or as little as possible) branching information is lost. However, little work has been done so far on inferring a phylogenetic tree/network from a specified set of trees when in addition, certain evolutionary relationships among the species are known to be highly unlikely. In this paper, we consider the problem of constructing a phylogenetic tree/network which is consistent with all of the rooted triplets in a given set T and none of the rooted triplets in another given set F. Although NP-hard in the general case, we provide some efficient exact and approximation algorithms for a number of biologically meaningful variants of the problem.
Sat, 01 Jan 2005 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/411762005-01-01T00:00:00Z
- Algorithms for combining rooted triplets into a galled phylogenetic networkhttps://scholarbank.nus.edu.sg/handle/10635/41177Title: Algorithms for combining rooted triplets into a galled phylogenetic network
Authors: Jansson, J.; Nguyen, N.B.; Sung, W.-K.
Abstract: This paper considers the problem of determining whether a given set T of rooted triplets can be merged without conflicts into a galled phylogenetic network, and if so, constructing such a network. When the input T is dense, we solve the problem in O(|T|) time, which is optimal since the size of the input is Θ(|T|). In comparison, the previously fastest algorithm for this problem runs in O(|T| 2) time. Next, we prove that the problem becomes NP-hard if extended to non-dense inputs, even for the special case of simple phylogenetic networks. We also show that for every positive integer n, there exists some set T of rooted triplets on n leaves such that any galled network can be consistent with at most 0.4883· |T| of the rooted triplets in T. On the other hand, we provide a polynomial-time approximation algorithm that always outputs a galled network consistent with at least a factor of 5/12 (> 0.4166) of the rooted triplets in T.
Sat, 01 Jan 2005 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/411772005-01-01T00:00:00Z
- Inferring phylogenetic relationships avoiding forbidding rooted tripletshttps://scholarbank.nus.edu.sg/handle/10635/39126Title: Inferring phylogenetic relationships avoiding forbidding rooted triplets
Authors: He, Y.-J.; Huynh, T.N.D.; Jansson, J.; Sung, W.-K.
Abstract: To construct a phylogenetic tree or phylogenetic network for describing the evolutionary history of a set of species is a well-studied problem in computational biology. One previously proposed method to infer a phylogenetic tree/network for a large set of species is by merging a collection of known smaller phylogenetic trees on overlapping sets of species so that no (or as little as possible) branching information is lost. However, little work has been done so far on inferring a phylogenetic tree/network from a specified set of trees when in addition, certain evolutionary relationships among the species are known to be highly unlikely. In this paper, we consider the problem of constructing a phylogenetic tree/network which is consistent with all of the rooted triplets in a given set C and none of the rooted triplets in another given set F. Although NP-hard in the general case, we provide some efficient exact and approximation algorithms for a number of biologically meaningful variants of the problem. © Imperial College Press.
Sun, 01 Jan 2006 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/391262006-01-01T00:00:00Z
- Rooted maximum agreement supertreeshttps://scholarbank.nus.edu.sg/handle/10635/39122Title: Rooted maximum agreement supertrees
Authors: Jansson, J.; Ng, J.H.-K.; Sadakane, K.; Sung, W.-K.
Abstract: Given a set Τ of rooted, unordered trees, where each T i ∈ Τ is distinctly leaf-labeled by a set Λ(T i) and where the sets Λ(T i) may overlap, the maximum agreement supertree problem (MASP) is to construct a distinctly leaf-labeled tree Q with leaf set Λ(Q) ⊆ ∪T i ∈Τ Λ(T i) such that |Λ(Q)| is maximized and for each T i ∈ Τ, the topological restriction of T i to Λ(Q) is isomorphic to the topological restriction of Q to Λ(T i). Let n = |∪T i∈ΤΛ(T i)|, k = |Τ|, and D = maxT i∈Τ{deg(T i)}. We first show that MASP with k = 2 can be solved in O(√D n log(2n/D)) time, which is O(n log n) when D = O(1) and O(n 1.5) when D is unrestricted. We then present an algorithm for MASP with D = 2 whose running time is polynomial if k = O(1). On the other hand, we prove that MASP is NP-hard for any fixed k ≥ 3 when D is unrestricted, and also NP-hard for any fixed D ≥ 2 when k is unrestricted even if each input tree is required to contain at most three leaves. Finally, we describe a polynomial-time (n/log n)-approximation algorithm for MASP. © Springer-Verlag 2004.
Thu, 01 Jan 2004 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/391222004-01-01T00:00:00Z
- Rooted maximum agreement supertreeshttps://scholarbank.nus.edu.sg/handle/10635/39450Title: Rooted maximum agreement supertrees
Authors: Jansson, J.; Ng, J.H.-K.; Sadakane, K.; Sung, W.-K.
Abstract: Given a set τ rooted, unordered trees, where each T i ∈ τ is distinctly leaf-labeled by a set Λ(T i) and where the sets Λ(T i) may overlap, the maximum agreement supertree problem(MASP) is to construct a distinctly leaf-labeled tree script Q sign with leaf set Λ(script Q sign) ⊆ ∪ Ti ∈ τ Λ(T i) such that |Λ(script Q sign)| is maximized and for each T i ∈ Τ, the topological restriction of T i to Λ(script Q sign) is isomorphic to the topological restriction of script Q sign to Λ(T i). Let n = |∪ Ti ∈ τ Λ(T i)|, k = |Τ|, and D = max Ti ∈ Τ {deg(T i)}. We first show that MASP with k = 2 can be solved in O(√Dn log (2n/D)) time, which is O(n log n) when D = O(1) and O(n 1.5) when D is unrestricted. We then present an algorithm for MASP with D = 2 whose running time is polynomial if k = O(1). On the other hand, we prove that MASP is NP-hard for any fixed k ≥ 3 when D is unrestricted, and also NP-hard for any fixed D ≥ 2 when k is unrestricted even if each input tree is required to contain at most three leaves. Finally, we describe a polynomial-time (n/log n)-approximation algorithm for MASP. © 2005 Springer Science+Business Media, Inc.
Sat, 01 Jan 2005 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/394502005-01-01T00:00:00Z
- Local gapped subforest alignment and its application in finding RNA structural motifshttps://scholarbank.nus.edu.sg/handle/10635/39448Title: Local gapped subforest alignment and its application in finding RNA structural motifs
Authors: Jansson, J.; Hieu, N.T.; Sung, W.-K.
Abstract: We consider the problem of computing an optimal local alignment of two labeled ordered forests F1 and F2 where ni and di, for i ∈{1, 2}, denote the number of nodes in Fi and the degree of Fi, respectively; and its applications in finding RNA structural motifs. A previous result is the local closed subforest alignment problem, which can be solved in O(n1n2d1d 2(d1 + d2)) time and O(n1n 2d1d2) space. This paper generalizes the concept of a closed subforest to a gapped subforest and then presents an algorithm for computing the optimal local gapped subforest alignment of F 1 and F2 in O(n1n2d 1d2(d1 + d2)) time and O(n 1n2d1d2) space. We show that our technique can improve the computation of the optimal local closed subforest alignment in O(n1n2(d1 + d2) 2) time and O(n1n2(d1 + d 2)) space. Furthermore, we prove that a special case of our local gapped subforest alignment problem is equivalent to a problem known in the literature as the local sequence-structure alignment problem (lssa). The previously best algorithm for lssa uses O(n1 2n 2 2(n1 + n2)) time and O(n 1n2) space; here, we show how to modify our main algorithm to obtain an algorithm for lssa running in O(n1n2(d 1 + d2)2) time and O(n1n 2(d1 + d2)) space. © Springer-Verlag 2004.
Thu, 01 Jan 2004 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/394482004-01-01T00:00:00Z
- A faster and more space-efficient algorithm for inferring arc-annotations of RNA sequences through alignmenthttps://scholarbank.nus.edu.sg/handle/10635/39453Title: A faster and more space-efficient algorithm for inferring arc-annotations of RNA sequences through alignment
Authors: Jansson, J.; Ng, S.-K.; Sung, W.-K.; Willy, H.
Abstract: The nested arc-annotation of a sequence is an important model used to represent structural information for RNA and protein sequences. Given two sequences S1 and S2 and a nested arc-annotation P 1 for S1, this paper considers the problem of inferring the nested arc-annotation P2 for S2 such that (S 1, P1) and (S2, P2) have the largest common substructure. The problem has a direct application in predicting the secondary structure of an RNA sequence given a closely related sequence with known secondary structure. The currently most efficient algorithm for this problem requires O(nm3) time and O(nm2) space where n is the length of the sequence with known arc-annotation and m is the length of the sequence whose arc-annotation is to be inferred. By using sparsification on a new recursive dynamic programming algorithm and applying a Hirschberg-like traceback technique with compression, we obtain an improved algorithm that runs in min{O(nm2 + n2m),O(nm2 log n), O(nm 3) time and min{O(m2 + mn), O(m2 log n + n)} space. © 2006 Springer Science+Business Media, Inc.
Sun, 01 Jan 2006 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/394532006-01-01T00:00:00Z
- Computing the maximum agreement of phylogenetic networkshttps://scholarbank.nus.edu.sg/handle/10635/39452Title: Computing the maximum agreement of phylogenetic networks
Authors: Choy, C.; Jansson, J.; Sadakane, K.; Sung, W.-K.
Abstract: We introduce the maximum agreement phylogenetic subnetwork problem (MASN) for finding branching structure shared by a set of phylogenetic networks. We prove that the problem is NP-hard even if restricted to three phylogenetic networks and give an O(n2)-time algorithm for the special case of two level-1 phylogenetic networks, where n is the number of leaves in the input networks and where N is called a level-f phylogenetic network if every biconnected component in the underlying undirected graph induces a subgraph of N containing at most f nodes with indegree 2. We also show how to extend our technique to yield a polynomial-time algorithm for any two level-f phylogenetic networks N1,N2 satisfying f=O(logn); more precisely, its running time is O(|V(N1)|·|V(N2)|·2f1+f2), where V(Ni) and fi denote the set of nodes in Ni and the level of Ni, respectively, for i∈{1,2}. © 2005 Elsevier B.V. All rights reserved.
Sat, 01 Jan 2005 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/394522005-01-01T00:00:00Z
- The maximum agreement of two nested phylogenetic networkshttps://scholarbank.nus.edu.sg/handle/10635/38919Title: The maximum agreement of two nested phylogenetic networks
Authors: Jansson, J.; Sung, W.-K.
Abstract: Given a set N of phylogenetic networks, the maximum agreement phylogenetic subnetwork problem (MASN) asks for a subnetwork contained in every Ni ∈ N with as many leaves as possible. MASN can be used to identify shared branching structure among phylogenetic networks or to measure their similarity. In this paper, we prove that the general case of MASN is NP-hard already for two phylogenetic networks, but that the problem can be solved efficiently if the two given phylogenetic networks exhibit a nested structure. We first show that the total number of nodes |V(N)| in any nested phylogenetic network N with n leaves and nesting depth d is O(n(d + 1)). We then describe an algorithm for testing if a given phylogenetic network is nested, and if so, determining its nesting depth in O(|V(N)| · (d + 1)) time. Next, we present a polynomial-time algorithm for MASN for two nested phylogenetic networks N 1, N2. Its running time is O(|V(N1)| · |V(N2)| · (d1 + 1) · (d2 + 1)), where d1 and d2 denote the nesting depths of N1 and N2, respectively. In contrast, the previously fastest algorithm for this problem runs in O(|V(N1)| · |V(N2)| 4 f) time, where f ≥ max{d1, d2}. © Springer-Verlag 2004.
Thu, 01 Jan 2004 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/389192004-01-01T00:00:00Z
- Algorithms for combining rooted triplets into a galled phylogenetic networkhttps://scholarbank.nus.edu.sg/handle/10635/39305Title: Algorithms for combining rooted triplets into a galled phylogenetic network
Authors: Jansson, J.; Nguyen, N.B.; Sung, W.-K.
Abstract: This paper considers the problem of determining whether a given set Τ of rooted triplets can be merged without conflicts into a galled phylogenetic network and, if so, constructing such a network. When the input Τ is dense, we solve the problem in O(|Τ|) time, which is optimal since the size of the input is Θ(|Τ|). In comparison, the previously fastest algorithm for this problem runs in O(|Τ|2) time. We also develop an optimal O(|Τ|)-time algorithm for enumerating all simple phylogenetic networks leaf-labeled by L that are consistent with Τ, where L is the set of leaf labels in Τ, which is used by our main algorithm. Next, we prove that the problem becomes NP-hard if extended to nondense inputs, even for the special case of simple phylogenetic networks. We also show that for every positive integer n, there exists some set Τ of rooted triplets on n leaves such that any galled network can be consistent with at most 0.4883 ·|Τ| of the rooted triplets in Τ. On the other hand, we provide a polynomial-time approximation algorithm that always outputs a galled network consistent with at least a factor of 5/12 (> 0.4166) of the rooted triplets in Τ. © 2006 Society for Industrial and Applied Mathematics.
Sun, 01 Jan 2006 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/393052006-01-01T00:00:00Z
- Inferring a level-1 phylogenetic network from a dense set of rooted tripletshttps://scholarbank.nus.edu.sg/handle/10635/39312Title: Inferring a level-1 phylogenetic network from a dense set of rooted triplets
Authors: Jansson, J.; Sung, W.-K.
Abstract: We consider the following problem: Given a set T of rooted triplets with leaf set L, determine whether there exists a phylogenetic network consistent with T, and if so, construct one. We show that if no restrictions are placed on the hybrid nodes in the solution, the problem is trivially solved in polynomial time by a simple sorting network-based construction. For the more interesting (and biologically more motivated) case where the solution is required to be a level-1 phylogenetic network, we present an algorithm solving the problem in O (| T |2) time when T is dense, i.e., when T contains at least one rooted triplet for each cardinality three subset of L. We also give an O (| T |5 / 3)-time algorithm for finding the set of all phylogenetic networks having a single hybrid node attached to exactly one leaf (and having no other hybrid nodes) that are consistent with a given dense set of rooted triplets. © 2006 Elsevier B.V. All rights reserved.
Sun, 01 Jan 2006 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/393122006-01-01T00:00:00Z
- A faster and more space-efficient algorithm for inferring Arc-annotations of RNA sequences through alignmenthttps://scholarbank.nus.edu.sg/handle/10635/39364Title: A faster and more space-efficient algorithm for inferring Arc-annotations of RNA sequences through alignment
Authors: Jansson, J.; Ng, S.-K.; Sung, W.-K.; Willy, H.
Abstract: This paper considers the problem of inferring the optimal nested arc-annotation of a sequence given another nested arc-annotated sequence by maximizing the weighted alignment between the bases and arcs in the two sequences. The problem has a direct application in predicting the secondary structure of an RNA sequence given a closely related sequence whose secondary structure is already known. The currently most efficient algorithm for this problem requires O(nm3) time and O(nm2) space where n is the length of the sequence with known arc-annotation while m is the length of the sequence to be inferred. We present an improved algorithm which runs in min{O(nm2 log n), O(nm3)} time and min{O(m2 +mn), O(m2 log n)} space. The time improvement is achieved by applying sparsification to the dynamic programming algorithm, while the space is reduced to a more practical quadratic complexity by using a Hirschberg-like traceback technique together with a simple compression. © Springer-Verlag 2004.
Thu, 01 Jan 2004 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/393642004-01-01T00:00:00Z
- Inferring a level-1 phylogenetic network from a dense set of rooted tripletshttps://scholarbank.nus.edu.sg/handle/10635/39370Title: Inferring a level-1 phylogenetic network from a dense set of rooted triplets
Authors: Jansson, J.; Sung, W.-K.
Abstract: Given a set T of rooted triplets with leaf set L, we consider the problem of determining whether there exists a phylogenetic network consistent with T, and if so, constructing one. If no restrictions are placed on the hybrid nodes in the solution, the problem is trivially solved in polynomial time by a simple sorting network-based construction. For the more interesting (and biologically more motivated) case where the solution is required to be a level-1 phylogenetic network, we present an algorithm solving the problem in O(n6) time when T is dense (i.e., contains at least one rooted triplet for each cardinality three subset of L), where n = |L|. Note that the size of the input is Θ(n3) if T is dense. We also give an O(n5)-time algorithm for finding the set of all phylogenetic networks having a single hybrid node attached to exactly one leaf (and having no other hybrid nodes) that are consistent with a given dense set of rooted triplets. © Springer-Verlag Berlin Heidelberg 2004.
Thu, 01 Jan 2004 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/393702004-01-01T00:00:00Z
- Computing the maximum agreement of phylogenetic networkshttps://scholarbank.nus.edu.sg/handle/10635/40073Title: Computing the maximum agreement of phylogenetic networks
Authors: Choy, C.; Jansson, J.; Sadakane, K.; Sung, W.-K.
Abstract: We introduce the maximum agreement phylogenetic subnetwork problem (MASN) of finding a branching structure shared by a set of phylogenetic networks. We prove that the problem is NP-hard even if restricted to three phylogenetic networks and give an O(n2)-time algorithm for the special case of two level-1 phylogenetic networks, where n is the number of leaves in the input networks and where N is called a level-f phylogenetic network if every biconnected component in the underlying undirected graph contains at most f nodes having indegree 2 in N. Our algorithm can be extended to yield a polynomial-time algorithm for two level-f phylogenetic networks N 1,N2 for any f which is upper-bounded by a constant; more precisely, its running time is O(|V(N1)|·|V(N 2)|·4f), where V(Ni) denotes the set of nodes of Ni. © 2004 Published by Elsevier B.V.
Thu, 01 Jan 2004 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/400732004-01-01T00:00:00Z
- Local gapped subforest alignment and its application in finding RNA structural motifshttps://scholarbank.nus.edu.sg/handle/10635/40391Title: Local gapped subforest alignment and its application in finding RNA structural motifs
Authors: Jansson, J.; Hieu, N.T.; Sung, W.-K.
Abstract: RNA molecules whose secondary structures contain similar substructures often have similar functions. Therefore, an important task in the study of RNA is to develop methods for discovering substructures in RNA secondary structures that occur frequently (also referred to as motifs). In this paper, we consider the problem of computing an optimal local alignment of two given labeled ordered forests F1 and F2. This problem asks for a substructure of F1 and a substructure of F2 that exhibit a high similarity. Since an RNA molecule's secondary structure can be represented as a labeled ordered forest, the problem we study has a direct application to finding potential motifs. We generalize the previously studied concept of a closed subforest to a gapped subforest and present the first algorithm for computing the optimal local gapped subforest alignment of F1 and F2. We also show that our technique can improve the time and space complexity of the previously most efficient algorithm for optimal local closed subforest alignment. Furthermore, we prove that a special case of our local gapped subforest alignment problem is equivalent to a problem known in the literature as the local sequence-structure alignment problem (lssa) and modify our main algorithm to obtain a much faster algorithm for lssa than the one previously proposed. An implementation of our algorithm is available at www.comp.nus.edu.sg/~bioinfo/LGSFAligner/. Its running time is significantly faster than the original lssa program. © Mary Ann Liebert, Inc.
Sun, 01 Jan 2006 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/403912006-01-01T00:00:00Z
- Constructing a smallest refining galled phylogenetic networkhttps://scholarbank.nus.edu.sg/handle/10635/40451Title: Constructing a smallest refining galled phylogenetic network
Authors: Huynh, T.N.D.; Jansson, J.; Nguyen, N.B.; Sung, W.-K.
Abstract: Reticulation events occur frequently in many types of species. Therefore, to develop accurate methods for reconstructing phylogenetic networks in order to describe evolutionary history in the presence of reticulation events is important. Previous work has suggested that constructing phylogenetic networks by merging gene trees is a biologically meaningful approach. This paper presents two new efficient algorithms for inferring a phylogenetic network from a set T of gene trees of arbitrary degrees. The first algorithm solves the open problem of constructing a refining galled network for T (if one exists) with no restriction on the number of hybrid nodes; in fact, it outputs the smallest possible solution. In comparison, the previously best method (SpNet) can only construct networks having a single hybrid node. For cases where there exists no refining galled network for T, our second algorithm identifies a minimum subset of the species set to be removed so that the resulting trees can be combined into a galled network. Based on our two algorithms, we propose two general methods named RGNet and RGNet+. Through simulations, we show that our methods outperform the other existing methods neighbor-joining, NeighborNet, and SpNet. © Springer-Verlag Berlin Heidelberg 2005.
Sat, 01 Jan 2005 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/404512005-01-01T00:00:00Z
- Computing the Rooted Triplet Distance Between Phylogenetic Networkshttps://scholarbank.nus.edu.sg/handle/10635/157235Title: Computing the Rooted Triplet Distance Between Phylogenetic Networks
Authors: Jansson, Jesper; Mampentzidis, Konstantinos; Rajaby, Ramesh; Sung Wing Kin
Tue, 23 Jul 2019 00:00:00 GMThttps://scholarbank.nus.edu.sg/handle/10635/1572352019-07-23T00:00:00Z