Title: Enzyme Family Classification by Support Vector Machines
Authors: Cai, C.Z. 
Han, L.Y. 
Ji, Z.L. 
Chen, Y.Z. 
Keywords: Classification
Protein family
Protein function
Protein function prediction
Protein sequence
Support vector machine
Issue Date: 1-Apr-2004
Citation: Cai, C.Z., Han, L.Y., Ji, Z.L., Chen, Y.Z. (2004-04-01). Enzyme Family Classification by Support Vector Machines. Proteins: Structure, Function and Genetics 55 (1) : 66-76. ScholarBank@NUS Repository.
Abstract: One approach to facilitating protein function prediction is to classify proteins into functional families. Recent studies on the classification of G-protein coupled receptors and other proteins suggest that a statistical learning method, support vector machines (SVM), may be useful for classifying proteins into functional families. In this work, SVM is applied to and tested on the classification of enzymes into functional families defined by the Enzyme Nomenclature Committee of IUBMB. An SVM classification system for each family is trained from representative enzymes of that family and seed proteins of Pfam curated protein families. The classification accuracy for enzymes from 46 families and for non-enzymes is in the range of 50.0% to 95.7% and 79.0% to 100%, respectively. The corresponding Matthews correlation coefficient is in the range of 54.1% to 96.1%. Moreover, 80.3% of the 8,291 correctly classified enzymes are uniquely classified into a specific enzyme family by using a scoring function, indicating that SVM may have a certain level of unique prediction capability. Testing results also suggest that SVM is in some cases capable of classifying distantly related enzymes and homologous enzymes of different functions. Effort is being made to use a more comprehensive set of enzymes as training sets and to incorporate multi-class SVM classification systems to further enhance the unique prediction accuracy. Our results suggest the potential of SVM for enzyme family classification and for facilitating protein function prediction. © 2004 Wiley-Liss, Inc.
Source Title: Proteins: Structure, Function and Genetics
ISSN: 0887-3585
DOI: 10.1002/prot.20045
Appears in Collections: Staff Publications
