Please use this identifier to cite or link to this item: https://scholarbank.nus.edu.sg/handle/10635/39186
DC Field: Value
dc.title: Using feature generation and feature selection for accurate prediction of translation initiation sites.
dc.contributor.author: Zeng, F.
dc.contributor.author: Yap, R.H.
dc.contributor.author: Wong, L.
dc.date.accessioned: 2013-07-04T07:35:55Z
dc.date.available: 2013-07-04T07:35:55Z
dc.date.issued: 2002
dc.identifier.citation: Zeng, F., Yap, R.H., Wong, L. (2002). Using feature generation and feature selection for accurate prediction of translation initiation sites. Genome informatics series : proceedings of the . Workshop on Genome Informatics. Workshop on Genome Informatics 13 : 192-200. ScholarBank@NUS Repository.
dc.identifier.issn: 09199454
dc.identifier.uri: http://scholarbank.nus.edu.sg/handle/10635/39186
dc.description.abstract: Correct prediction of the translation initiation site (TIS) is an important issue in genomic research. We show that feature generation together with correlation-based feature selection can be used with a variety of machine learning algorithms to give highly accurate translation initiation site prediction. Only very few features are needed and the results achieve comparable accuracy to the best existing approaches. Our approach has the advantage that it does not require one to devise a special prediction method; rather, standard machine learning classifiers are shown to give very good performance on the selected features. The raw and generated features which we have found to be important are the following: positions -3 and -1 in the sequence; upstream k-grams for k=3, 4, and 5; stop-codon frequency; downstream in-frame 3-gram; and the distance of ATG to the beginning of the sequence. The best result, with an overall accuracy of 90%, is obtained by selecting only seven features from this set. The same features, retrained with the use of a scanning model, achieve an overall accuracy of 94% on this dataset.
dc.source: Scopus
dc.type: Article
dc.contributor.department: COMPUTER SCIENCE
dc.description.sourcetitle: Genome informatics series : proceedings of the . Workshop on Genome Informatics. Workshop on Genome Informatics
dc.description.volume: 13
dc.description.page: 192-200
dc.identifier.isiut: NOT_IN_WOS
Appears in Collections: Staff Publications
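
The feature families listed in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function names `kgram_counts` and `tis_features`, the `window` size, the feature-name strings, and the choice to count stop codons in-frame downstream of the ATG are all assumptions made for illustration, since the abstract does not specify them. Positions follow the usual convention in which the A of ATG is +1, so -1 is the base immediately upstream.

```python
from collections import Counter

STOP_CODONS = {"TAA", "TAG", "TGA"}


def kgram_counts(segment, k, step=1):
    """Count k-grams in `segment`, moving `step` bases at a time."""
    return Counter(segment[i:i + k] for i in range(0, len(segment) - k + 1, step))


def tis_features(seq, atg, window=99):
    """Build a feature dict for the candidate ATG starting at index `atg`.

    `window` (bases considered on each side) is a hypothetical parameter,
    not a value taken from the paper.
    """
    seq = seq.upper()
    if seq[atg:atg + 3] != "ATG":
        raise ValueError("no ATG at the given index")

    upstream = seq[max(0, atg - window):atg]
    downstream = seq[atg + 3:atg + 3 + window]

    # Downstream codons read in the frame set by the candidate ATG.
    in_frame = kgram_counts(downstream, 3, step=3)

    features = {
        "pos_-3": seq[atg - 3] if atg >= 3 else None,
        "pos_-1": seq[atg - 1] if atg >= 1 else None,
        "dist_to_start": atg,
        # Assumption: "stop-codon frequency" is counted in-frame downstream.
        "stop_codon_freq": sum(in_frame[c] for c in STOP_CODONS),
    }
    # Upstream k-grams for k = 3, 4, 5 (all overlapping positions).
    for k in (3, 4, 5):
        for gram, n in kgram_counts(upstream, k).items():
            features[f"up_{k}gram_{gram}"] = n
    # Downstream in-frame 3-grams.
    for gram, n in in_frame.items():
        features[f"down_inframe_3gram_{gram}"] = n
    return features


example = tis_features("GCCACCATGGCTTAAGGG", atg=6)
print(example["pos_-3"], example["pos_-1"], example["stop_codon_freq"])
```

A dictionary of this kind, computed for every candidate ATG, could then be passed to a feature-selection step and a standard classifier, which is the workflow the abstract describes in general terms.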

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.