Item metadata

dc.contributor.author: Cannon, EO
dc.contributor.author: Nigsch, F
dc.contributor.author: Mitchell, John Blayney Owen
dc.identifier.citation: Cannon, EO, Nigsch, F & Mitchell, J B O 2008, 'A novel hybrid ultrafast shape descriptor method for use in virtual screening', Chemistry Central Journal, vol. 2, 3.
dc.identifier.other: PURE: 454950
dc.identifier.other: PURE UUID: 3149da00-e88a-4fcf-9a04-33523d6571f1
dc.identifier.other: standrews_research_output: 30977
dc.identifier.other: Scopus: 43349090139
dc.identifier.other: ORCID: /0000-0002-0379-6097/work/34033411
dc.description: The authors thank the EPSRC and Unilever plc for funding. [en]
dc.description.abstract: Background: We have introduced a new Hybrid descriptor composed of the MACCS key descriptor encoding topological information and Ballester and Richards' Ultrafast Shape Recognition (USR) descriptor. The latter one is calculated from the moments of the distribution of the interatomic distances, and in this work we also included higher moments than in the original implementation. Results: The performance of this Hybrid descriptor is assessed using Random Forest and a dataset of 116,476 molecules. Our dataset includes 5,245 molecules in ten classes from the 2005 World Anti-Doping Agency (WADA) dataset and 111,231 molecules from the National Cancer Institute (NCI) database. In a 10-fold Monte Carlo cross-validation this dataset was partitioned into three distinct parts for training, optimisation of an internal threshold that we introduced, and validation of the resulting model. The standard errors obtained were used to assess statistical significance of observed improvements in performance of our new descriptor. Conclusion: The Hybrid descriptor was compared to the MACCS key descriptor, USR with the first three (USR), four (UF4) and five (UF5) moments, and a combination of MACCS with USR (three moments). The MACCS key descriptor was not combined with UF5, due to similar performance of UF5 and UF4. Superior performance in terms of all figures of merit was found for the MACCS/UF4 Hybrid descriptor with respect to all other descriptors examined. These figures of merit include recall in the top 1% and top 5% of the ranked validation sets, precision, F-measure, area under the Receiver Operating Characteristic curve and Matthews Correlation Coefficient.
dc.relation.ispartof: Chemistry Central Journal [en]
dc.rights: © 2008 Cannon et al. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. [en]
dc.subject: QD Chemistry [en]
dc.subject: SDG 3 - Good Health and Well-being [en]
dc.title: A novel hybrid ultrafast shape descriptor method for use in virtual screening [en]
dc.type: Journal article [en]
dc.description.version: Publisher PDF [en]
dc.contributor.institution: University of St Andrews. School of Chemistry [en]
dc.contributor.institution: University of St Andrews. Biomedical Sciences Research Complex [en]
dc.contributor.institution: University of St Andrews. EaSTCHEM [en]
dc.description.status: Peer reviewed [en]
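
Illustrative note on the abstract: the USR-style descriptors it mentions are built from moments of interatomic distance distributions, with USR, UF4 and UF5 using the first three, four and five moments respectively. The sketch below is not the authors' code. It assumes the four reference points commonly described for USR (the centroid, the atom closest to it, the atom farthest from it, and the atom farthest from that atom) and uses plain central moments, whereas the published method may normalise the higher moments differently. The function names `central_moments` and `shape_descriptor` are hypothetical.

```python
import math


def central_moments(values, n_moments):
    """Return the mean followed by the central moments of order 2..n_moments."""
    mean = sum(values) / len(values)
    moments = [mean]
    for k in range(2, n_moments + 1):
        moments.append(sum((v - mean) ** k for v in values) / len(values))
    return moments


def shape_descriptor(coords, n_moments=3):
    """Moment-based shape descriptor from a list of (x, y, z) atom positions.

    Assumption: four USR-style reference points are used, so the result
    has 4 * n_moments entries (12 for three moments, 16 for four, 20 for five).
    """
    n = len(coords)
    centroid = tuple(sum(c[i] for c in coords) / n for i in range(3))

    # Distances from the centroid pick out the closest and farthest atoms.
    d_centroid = [math.dist(centroid, c) for c in coords]
    closest = coords[d_centroid.index(min(d_centroid))]
    farthest = coords[d_centroid.index(max(d_centroid))]

    # Fourth reference point: the atom farthest from the 'farthest' atom.
    d_farthest = [math.dist(farthest, c) for c in coords]
    far_from_farthest = coords[d_farthest.index(max(d_farthest))]

    descriptor = []
    for ref in (centroid, closest, farthest, far_from_farthest):
        distances = [math.dist(ref, c) for c in coords]
        descriptor.extend(central_moments(distances, n_moments))
    return descriptor


if __name__ == "__main__":
    # Toy coordinates (hypothetical, not taken from the WADA or NCI data).
    atoms = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.2, 0.0), (0.0, 1.2, 0.9)]
    for m in (3, 4, 5):
        print(m, len(shape_descriptor(atoms, n_moments=m)))
```

In a hybrid scheme of the kind the abstract describes, a vector like this would be concatenated with the MACCS key bits before being passed to the classifier. That step is omitted here because the record does not give its details.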
