Title: Approximate Matching in Genomic Sequence Data
Authors: CAO XIA
Keywords: DNA, Protein, Genomic Sequence Database, Sequence Matching, Similarity Search, Sequence Classification
Issue Date: 14-Jun-2006
Citation: CAO XIA (2006-06-14). Approximate Matching in Genomic Sequence Data. ScholarBank@NUS Repository.
Abstract: Increasing interest in genetic research has resulted in the creation of huge genomic databases, and approximate sequence matching in these databases has become a basic operation in computational biology. In this thesis, we studied three research problems, all related to approximate sequence matching in genomic databases: DNA sequence similarity search in a sequence database, DNA sequence approximate join, and protein subcellular localization prediction. Our experimental results showed that 1) the proposed search model and index structure are very effective in organizing a large genomic sequence database; 2) the proposed novel filtering algorithms are very efficient in processing approximate sequence matching; and 3) the proposed q-gram based feature vectors extracted from protein sequences are helpful in predicting the subcellular localization of proteins.
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File            Size       Format
thesis2006.pdf  856.81 kB  Adobe PDF





Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.