A support vector machines approach for virtual screening of active compounds of single and multiple mechanisms from large libraries at an improved hit-rate and enrichment factor | ScholarBank@NUS

Please use this identifier to cite or link to this item: https://doi.org/10.1016/j.jmgm.2007.12.002

Title:	A support vector machines approach for virtual screening of active compounds of single and multiple mechanisms from large libraries at an improved hit-rate and enrichment factor
Authors:	Han, L.Y.; Ma, X.H.; Lin, H.H.; Jia, J.; Zhu, F.; Xue, Y.; Li, Z.R.; Cao, Z.W.; Ji, Z.L.; Chen, Y.Z.
Keywords:	Computer-aided drug design; Drug discovery; High-throughput screening; Lead discovery; Machine learning method; Virtual screening
Issue Date:	Jun-2008
Citation:	Han, L.Y., Ma, X.H., Lin, H.H., Jia, J., Zhu, F., Xue, Y., Li, Z.R., Cao, Z.W., Ji, Z.L., Chen, Y.Z. (2008-06). A support vector machines approach for virtual screening of active compounds of single and multiple mechanisms from large libraries at an improved hit-rate and enrichment factor. Journal of Molecular Graphics and Modelling 26 (8) : 1276-1286. ScholarBank@NUS Repository. https://doi.org/10.1016/j.jmgm.2007.12.002
Abstract:	Support vector machines (SVM) and other machine-learning (ML) methods have been explored as ligand-based virtual screening (VS) tools for facilitating lead discovery. Although these methods show good hit-selection performance, in screening large compound libraries they tend to produce lower hit-rates than the best-performing VS tools, partly because their training-sets contain a limited spectrum of inactive compounds. We tested whether the performance of SVM can be improved by using training-sets of diverse inactive compounds. In retrospective database screening of active compounds of single mechanism (HIV protease inhibitors, DHFR inhibitors, dopamine antagonists) and multiple mechanisms (CNS active agents) from large libraries of 2.986 million compounds, the yields, hit-rates, and enrichment factors of our SVM models are 52.4-78.0%, 4.7-73.8%, and 214-10,543, respectively, compared with 62-95%, 0.65-35%, and 20-1200 by structure-based VS and 55-81%, 0.2-0.7%, and 110-795 by other ligand-based VS tools in screening libraries of ≥1 million compounds. The hit-rates are comparable to, and the enrichment factors substantially better than, the best results of other VS tools. Of the predicted hits, 24.3-87.6% are outside the known hit families. SVM appears to be potentially useful for facilitating lead discovery in VS of large compound libraries. © 2007 Elsevier Inc. All rights reserved.
Source Title:	Journal of Molecular Graphics and Modelling
URI:	http://scholarbank.nus.edu.sg/handle/10635/105604
ISSN:	1093-3263
DOI:	10.1016/j.jmgm.2007.12.002
Appears in Collections:	Staff Publications
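
The abstract reports three screening metrics: yield, hit-rate, and enrichment factor. The sketch below shows how these are commonly computed in retrospective virtual screening. It assumes the usual definitions (yield = known actives recovered / all known actives; hit-rate = known actives recovered / compounds predicted as hits; enrichment factor = hit-rate / fraction of actives in the whole library), which this record does not itself spell out. The function name and the numbers in the example are hypothetical and are not taken from the paper.

```python
def screening_metrics(true_hits, predicted_hits, known_actives, library_size):
    """Return (yield, hit_rate, enrichment_factor) as fractions/ratios.

    Assumed definitions (not stated in the record itself):
      yield             = true_hits / known_actives
      hit_rate          = true_hits / predicted_hits
      enrichment_factor = hit_rate / (known_actives / library_size)
    """
    if min(predicted_hits, known_actives, library_size) <= 0:
        raise ValueError("counts must be positive")
    if true_hits < 0 or true_hits > min(predicted_hits, known_actives):
        raise ValueError("true_hits must lie between 0 and both totals")

    recovered_yield = true_hits / known_actives
    hit_rate = true_hits / predicted_hits
    # Hit-rate expected from picking compounds at random from the library.
    random_hit_rate = known_actives / library_size
    enrichment_factor = hit_rate / random_hit_rate
    return recovered_yield, hit_rate, enrichment_factor


# Hypothetical example: 100 known actives hidden in a 1,000,000-compound
# library; a model flags 500 compounds, 60 of which are known actives.
y, hr, ef = screening_metrics(
    true_hits=60, predicted_hits=500, known_actives=100, library_size=1_000_000
)
print(f"yield={y:.1%}  hit-rate={hr:.1%}  enrichment factor={ef:.0f}")
```

Under these definitions, a model can have a modest hit-rate yet a very large enrichment factor when actives are rare in the library, which is why the abstract reports both figures.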

