Identification of small molecule aggregators from large compound libraries by support vector machines | ScholarBank@NUS

Please use this identifier to cite or link to this item: https://doi.org/10.1002/jcc.21347

Title:	Identification of small molecule aggregators from large compound libraries by support vector machines
Authors:	Rao, H.; Li, Z.; Li, X.; Ma, X.; Ung, C.; Li, H.; Liu, X.; Chen, Y.
Keywords:	Active compound; Aggregation; Aggregator; Drug discovery; High throughput screening; Machine learning method; Recursive feature elimination; Support vector machine; Virtual screening
Issue Date:	Mar-2010
Citation:	Rao, H., Li, Z., Li, X., Ma, X., Ung, C., Li, H., Liu, X., Chen, Y. (2010-03). Identification of small molecule aggregators from large compound libraries by support vector machines. Journal of Computational Chemistry 31 (4) : 752-763. ScholarBank@NUS Repository. https://doi.org/10.1002/jcc.21347
Abstract:	Small molecule aggregators non-specifically inhibit multiple unrelated proteins, rendering them therapeutically useless. They frequently appear as false hits and thus need to be eliminated in high-throughput screening campaigns. Computational methods have been explored for identifying aggregators, which have not been tested in screening large compound libraries. We used 1319 aggregators and 128,325 non-aggregators to develop a support vector machines (SVM) aggregator identification model, which was tested by four methods. The first is five-fold cross-validation, which showed comparable aggregator and significantly improved non-aggregator identification rates against earlier studies. The second is the independent test of 17 aggregators discovered independently from the training aggregators, 71% of which were correctly identified. The third is retrospective screening of 13M PUBCHEM and 168K MDDR compounds, which predicted 97.9% and 98.7% of the PUBCHEM and MDDR compounds as non-aggregators. The fourth is retrospective screening of 5527 MDDR compounds similar to the known aggregators, 1.14% of which were predicted as aggregators. SVM showed slightly better overall performance against two other machine learning methods based on five-fold cross-validation studies of the same settings. Molecular features of aggregation, extracted by a feature selection method, are consistent with published profiles. SVM showed substantial capability in identifying aggregators from large libraries at low false-hit rates. © 2009 Wiley Periodicals, Inc.
Source Title:	Journal of Computational Chemistry
URI:	http://scholarbank.nus.edu.sg/handle/10635/100877
ISSN:	0192-8651
DOI:	10.1002/jcc.21347
Appears in Collections:	Staff Publications
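
Note on the rates quoted in the abstract: the "aggregator" and "non-aggregator identification rates" read most naturally as per-class accuracies (sensitivity and specificity). The sketch below is an illustration under that assumption and is not the authors' code. The helper names `identification_rates` and `implied_count` are hypothetical, the label convention (1 = aggregator, 0 = non-aggregator) is assumed, and "13M" and "168K" are treated as round figures, so the derived counts are approximate.

```python
def identification_rates(y_true, y_pred):
    """Return (aggregator_rate, non_aggregator_rate).

    Assumed label convention: 1 = aggregator, 0 = non-aggregator.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    aggregator_rate = tp / (tp + fn) if (tp + fn) else float("nan")
    non_aggregator_rate = tn / (tn + fp) if (tn + fp) else float("nan")
    return aggregator_rate, non_aggregator_rate


def implied_count(total, percent):
    """Approximate number of compounds implied by a reported percentage."""
    return round(total * percent / 100)


# Percentages and totals as quoted in the abstract; counts are derived, not reported.
independent_hits = implied_count(17, 71)                 # independent test set
similar_flagged = implied_count(5527, 1.14)              # MDDR compounds similar to aggregators
mddr_flagged = implied_count(168_000, 100 - 98.7)        # MDDR, taking "168K" as 168,000
pubchem_flagged = implied_count(13_000_000, 100 - 97.9)  # PUBCHEM, taking "13M" as 13,000,000

if __name__ == "__main__":
    print(independent_hits, similar_flagged, mddr_flagged, pubchem_flagged)
```

Read this way, a library-scale screen flags only a small percentage of compounds, but on a 13M-compound library even a small percentage corresponds to a large absolute number of flagged compounds.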

