Please use this identifier to cite or link to this item:
https://doi.org/10.1002/prot.20605
Title: Prediction of transporter family from protein sequence by support vector machine approach
Authors: Lin, H.H.; Han, L.Y.; Cai, C.Z.; Ji, Z.L.; Chen, Y.Z.
Keywords: Channel; Support vector machine; Transporter
Issue Date: 1-Jan-2006
Citation: Lin, H.H., Han, L.Y., Cai, C.Z., Ji, Z.L., Chen, Y.Z. (2006-01-01). Prediction of transporter family from protein sequence by support vector machine approach. Proteins: Structure, Function and Genetics 62 (1): 218-231. ScholarBank@NUS Repository. https://doi.org/10.1002/prot.20605
Abstract: Transporters play key roles in cellular transport and metabolic processes, and in facilitating drug delivery and excretion. These proteins are classified into families based on the transporter classification (TC) system. Determination of the TC family of transporters facilitates the study of their cellular and pharmacological functions. Methods for predicting TC family without sequence alignments or clustering are particularly useful for studying novel transporters whose function cannot be determined by sequence similarity. This work explores the use of a machine learning method, support vector machines (SVMs), for predicting the family of transporters from their sequence without the use of sequence similarity. A total of 10,636 transporters in 13 TC subclasses, 1914 transporters in eight TC families, and 168,341 nontransporter proteins are used to train and test the SVM prediction system. Testing results by using a separate set of 4351 transporters and 83,151 nontransporter proteins show that the overall accuracy for predicting members of these TC subclasses and families is 83.4% and 88.0%, respectively, and that of nonmembers is 99.3% and 96.6%, respectively. The accuracies for predicting members and nonmembers of individual TC subclasses are in the range of 70.7-96.1% and 97.6-99.9%, respectively, and those of individual TC families are in the range of 60.6-97.1% and 91.5-99.4%, respectively. A further test by using 26,139 transmembrane proteins outside each of the 13 TC subclasses shows that 90.4-99.6% of these are correctly predicted. Our study suggests that the SVM is potentially useful for facilitating functional study of transporters irrespective of sequence similarity. © 2005 Wiley-Liss, Inc.
Source Title: Proteins: Structure, Function and Genetics
URI: http://scholarbank.nus.edu.sg/handle/10635/106247
ISSN: 08873585
DOI: 10.1002/prot.20605
Appears in Collections: Staff Publications