Please use this identifier to cite or link to this item: http://scholarbank.nus.edu.sg/handle/10635/41583
Title: Splice junction classification problems for DNA sequences: Representation issues
Authors: Sarkar, M. 
Leong, T.-Y. 
Keywords: Classification
DNA
Exon
Gene
Intron
Random walk and Hurst coefficient
Representation
Splice boundary
Issue Date: 2001
Source: Sarkar, M., Leong, T.-Y. (2001). Splice junction classification problems for DNA sequences: Representation issues. Annual Reports of the Research Reactor Institute, Kyoto University 3 : 2895-2898. ScholarBank@NUS Repository.
Abstract: Splice junction classification in a eukaryotic cell is an important problem because splice junctions indicate which parts of a DNA sequence carry protein-coding information. The major issue in building a classifier for this task is how to represent the DNA sequence on a computer, since the accuracy of any classification technique hinges critically on the adopted representation. This paper presents experimental results on seven representation schemes. The first three representations interpret each DNA sequence as a series of symbols. The fourth and fifth representations treat the sequence as a series of real numbers. The first, second and fourth representations do not consider the influence of neighbors on the occurrence of a nucleotide, whereas the third and fifth representations take that influence into account. To capture regularity in the apparent randomness of the DNA sequence, the sixth representation treats the sequence as a variant of a random walk. The seventh representation uses the Hurst coefficient, which quantifies the roughness of a DNA sequence. The experimental results suggest that the fourth representation scheme places sequences from the same class close together and sequences from different classes far apart, and thus finds a structure in the input space that provides the best classification results.
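Illustrative note: the abstract does not specify the individual encodings, so the following is only a minimal sketch of two generic ways a DNA sequence might be represented, not the schemes evaluated in the paper. The function names `one_hot` and `dna_walk` are hypothetical, and the purine/pyrimidine step rule is an assumption borrowed from the common "DNA walk" convention.

```python
# Hypothetical illustration only; these are NOT the paper's seven schemes.
NUCLEOTIDES = "ACGT"


def one_hot(seq):
    """Symbolic view: map each nucleotide to a 4-element indicator vector.

    Characters outside A/C/G/T map to an all-zero vector.
    """
    return [[1 if base == n else 0 for n in NUCLEOTIDES] for base in seq.upper()]


def dna_walk(seq):
    """Random-walk view (assumed rule): step +1 for a purine (A, G),
    -1 for a pyrimidine (C, T), and record the running position."""
    position = 0
    path = []
    for base in seq.upper():
        if base not in NUCLEOTIDES:
            raise ValueError(f"unexpected symbol: {base!r}")
        position += 1 if base in "AG" else -1
        path.append(position)
    return path


if __name__ == "__main__":
    print(one_hot("AC"))
    print(dna_walk("AGCT"))
```

Neither function looks at neighboring positions; a neighbor-aware variant would need to encode each nucleotide together with its context, which the abstract mentions but does not define.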
Source Title: Annual Reports of the Research Reactor Institute, Kyoto University
URI: http://scholarbank.nus.edu.sg/handle/10635/41583
ISSN: 04549244
Appears in Collections: Staff Publications
