Please use this identifier to cite or link to this item:
|Title:||Generalized correlation functions and their applications in selection of optimal multiple spaced seeds for homology search|
Goulden-Jackson cluster method
|Source:||Kong, Y. (2007-03). Generalized correlation functions and their applications in selection of optimal multiple spaced seeds for homology search. Journal of Computational Biology 14 (2) : 238-254. ScholarBank@NUS Repository. https://doi.org/10.1089/cmb.2006.0008|
|Abstract:||The Goulden-Jackson cluster method is a powerful method to calculate the probability of occurrences of a pattern or set of patterns in a sequence. If the patterns contain wildcard characters, however, the size of the connector matrix grows exponentially with the number of wildcards. Here we show that average correlation c̄(z) is a good predicator of hitting probability q n, and the generalized correlation function ĉ(z) can be used to approximate c̄(z) efficiently. We apply the method to the problem of optimal multiple spaced seed selection for homology search. We reexamine the concept of optimal sensitivity of spaced seeds and show that it is better to select optimal seeds based on some average properties, such as c̄(1), which is the expectation of the first hitting length. Higher order approximations can also be constructed easily. Tests on arbitrary large genomic data with multiple seeds show that the optimal multiple seeds selected by the methods are indeed more sensitive. The methods provide a theoretical background on which various empirical observations can be unified and further heuristic search methods can be developed. © Mary Ann Liebert, Inc.|
|Source Title:||Journal of Computational Biology|
|Appears in Collections:||Staff Publications|
Show full item record
Files in This Item:
There are no files associated with this item.
checked on Feb 15, 2018
WEB OF SCIENCETM
checked on Feb 6, 2018
checked on Feb 20, 2018
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.