Please use this identifier to cite or link to this item: https://doi.org/10.1002/prot.20605
Title: Prediction of transporter family from protein sequence by support vector machine approach
Authors: Lin, H.H.
Han, L.Y. 
Cai, C.Z. 
Ji, Z.L.
Chen, Y.Z. 
Keywords: Channel
Support vector machine
Transporter
Issue Date: 1-Jan-2006
Citation: Lin, H.H., Han, L.Y., Cai, C.Z., Ji, Z.L., Chen, Y.Z. (2006-01-01). Prediction of transporter family from protein sequence by support vector machine approach. Proteins: Structure, Function and Genetics 62 (1) : 218-231. ScholarBank@NUS Repository. https://doi.org/10.1002/prot.20605
Abstract: Transporters play key roles in cellular transport and metabolic processes, and in facilitating drug delivery and excretion. These proteins are classified into families based on the transporter classification (TC) system. Determination of the TC family of transporters facilitates the study of their cellular and pharmacological functions. Methods for predicting TC family without sequence alignments or clustering are particularly useful for studying novel transporters whose function cannot be determined by sequence similarity. This work explores the use of a machine learning method, support vector machines (SVMs), for predicting the family of transporters from their sequence without the use of sequence similarity. A total of 10,636 transporters in 13 TC subclasses, 1914 transporters in eight TC families, and 168,341 nontransporter proteins are used to train and test the SVM prediction system. Tests on a separate set of 4351 transporters and 83,151 nontransporter proteins show that the overall accuracy for predicting members of these TC subclasses and families is 83.4% and 88.0%, respectively, and that for nonmembers is 99.3% and 96.6%, respectively. The accuracies for predicting members and nonmembers of individual TC subclasses are in the range of 70.7-96.1% and 97.6-99.9%, respectively, and those of individual TC families are in the range of 60.6-97.1% and 91.5-99.4%, respectively. A further test using 26,139 transmembrane proteins outside each of the 13 TC subclasses shows that 90.4-99.6% of these are correctly predicted. Our study suggests that the SVM is potentially useful for facilitating the functional study of transporters irrespective of sequence similarity. © 2005 Wiley-Liss, Inc.
Source Title: Proteins: Structure, Function and Genetics
URI: http://scholarbank.nus.edu.sg/handle/10635/106247
ISSN: 0887-3585
DOI: 10.1002/prot.20605
Appears in Collections: Staff Publications
