Evaluation of virtual screening performance of support vector machines trained by sparsely distributed active compounds

Please use this identifier to cite or link to this item: https://doi.org/10.1021/ci800022e

DC Field	Value
dc.title	Evaluation of virtual screening performance of support vector machines trained by sparsely distributed active compounds
dc.contributor.author	Ma, X.H.
dc.contributor.author	Wang, R.
dc.contributor.author	Yang, S.Y.
dc.contributor.author	Li, Z.R.
dc.contributor.author	Xue, Y.
dc.contributor.author	Wei, Y.C.
dc.contributor.author	Low, B.C.
dc.contributor.author	Chen, Y.Z.
dc.date.accessioned	2014-10-27T08:27:49Z
dc.date.available	2014-10-27T08:27:49Z
dc.date.issued	2008-06
dc.identifier.citation	Ma, X.H., Wang, R., Yang, S.Y., Li, Z.R., Xue, Y., Wei, Y.C., Low, B.C., Chen, Y.Z. (2008-06). Evaluation of virtual screening performance of support vector machines trained by sparsely distributed active compounds. Journal of Chemical Information and Modeling 48 (6) : 1227-1237. ScholarBank@NUS Repository. https://doi.org/10.1021/ci800022e
dc.identifier.issn	15499596
dc.identifier.uri	http://scholarbank.nus.edu.sg/handle/10635/100617
dc.description.abstract	Virtual screening performance of support vector machines (SVM) depends on the diversity of training active and inactive compounds. While diverse inactive compounds can be routinely generated, the number and diversity of known actives are typically low. We evaluated the performance of SVM trained by sparsely distributed actives in six MDDR biological target classes composed of a high number of known actives (983-1645) of high, intermediate, and low structural diversity (muscarinic M1 receptor agonists, NMDA receptor antagonists, thrombin inhibitors, HIV protease inhibitors, cephalosporins, and renin inhibitors). SVM trained by regularly sparse data sets of 100 actives show improved yields at substantially reduced false-hit rates compared to those of published studies and those of Tanimoto-based similarity searching method based on the same data sets and molecular descriptors. SVM trained by very sparse data sets of 40 actives (2.4%-4.1% of the known actives) predicted 17.5-39.5%, 23.0-48.1%, and 70.2-92.4% of the remaining 943-1605 actives in the high, intermediate, and low diversity classes, respectively, 13.8-68.7% of which are outside the training compound families. SVM predicted 99.97% and 97.1% of the 9.997 M PUBCHEM and 167K remaining MDDR compounds as inactive and 2.6%-8.3% of the 19,495-38,483 MDDR compounds similar to the known actives as active. These suggest that SVM has substantial capability in identifying novel active compounds from sparse active data sets at low false-hit rates. © 2008 American Chemical Society.
dc.description.uri	http://libproxy1.nus.edu.sg/login?url=http://dx.doi.org/10.1021/ci800022e
dc.source	Scopus
dc.type	Article
dc.contributor.department	PHARMACY
dc.contributor.department	BIOLOGICAL SCIENCES
dc.description.doi	10.1021/ci800022e
dc.description.sourcetitle	Journal of Chemical Information and Modeling
dc.description.volume	48
dc.description.issue	6
dc.description.page	1227-1237
dc.identifier.isiut	000257026800011
Appears in Collections:	Staff Publications

Files in This Item:

There are no files associated with this item.
