Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/15482
Title: Approximate Matching in Genomic Sequence Data
Authors: CAO XIA
Keywords: DNA, Protein, Genomic Sequence Database, Sequence Matching, Similarity Search, Sequence Classification
Issue Date: 14-Jun-2006
Source: CAO XIA (2006-06-14). Approximate Matching in Genomic Sequence Data. ScholarBank@NUS Repository.
Abstract: Increasing interest in genetic research has resulted in the creation of huge genomic databases, and approximate sequence matching in these databases has become a basic operation in computational biology. In this thesis, we studied three research problems, all related to approximate sequence matching in genomic databases: DNA sequence similarity search in a sequence database, DNA sequence approximate join, and protein subcellular localization prediction. Our experimental results showed that (1) the proposed search model and index structure are very effective in organizing a large genomic sequence database; (2) the proposed novel filtering algorithms are very efficient in processing approximate sequence matching; and (3) the proposed q-gram-based feature vectors extracted from protein sequences are helpful in predicting the subcellular localization of proteins.
URI: http://scholarbank.nus.edu.sg/handle/10635/15482
Appears in Collections: Ph.D Theses (Open)

Files in This Item:
File: thesis2006.pdf
Size: 856.81 kB
Format: Adobe PDF
Access Settings: OPEN
Version: None

Page view(s): 213 (checked on Dec 11, 2017)
Download(s): 173 (checked on Dec 11, 2017)


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.