Three machine learning models for the 2019 Solubility Challenge
Item metadata
dc.contributor.author | Mitchell, John B. O. | |
dc.date.accessioned | 2020-10-07T15:30:01Z | |
dc.date.available | 2020-10-07T15:30:01Z | |
dc.date.issued | 2020-09-27 | |
dc.identifier.citation | Mitchell, J B O 2020, 'Three machine learning models for the 2019 Solubility Challenge', ADMET & DMPK, vol. 8, no. 3, pp. 215-251. https://doi.org/10.5599/admet.835 | en |
dc.identifier.issn | 1848-7718 | |
dc.identifier.other | PURE: 268401714 | |
dc.identifier.other | PURE UUID: 9d1acf52-ebbe-444b-8fbf-50d95fc16269 | |
dc.identifier.other | ORCID: /0000-0002-0379-6097/work/75996580 | |
dc.identifier.other | WOS: 000575918900004 | |
dc.identifier.other | Scopus: 85097227018 | |
dc.identifier.uri | https://hdl.handle.net/10023/20737 | |
dc.description.abstract | We describe three machine learning models submitted to the 2019 Solubility Challenge. All are founded on tree-like classifiers, with one model being based on Random Forest and another on the related Extra Trees algorithm. The third model is a consensus predictor combining the former two with a Bagging classifier. We call this consensus classifier Vox Machinarum, and here discuss how it benefits from the Wisdom of Crowds. On the first 2019 Solubility Challenge test set of 100 low-variance intrinsic aqueous solubilities, Extra Trees is our best classifier. On the other, a high-variance set of 32 molecules, we find that Vox Machinarum and Random Forest both perform a little better than Extra Trees, and almost equally to one another. We also compare the gold standard solubilities from the 2019 Solubility Challenge with a set of literature-based solubilities for most of the same compounds. | |
dc.format.extent | 37 | |
dc.language.iso | eng | |
dc.relation.ispartof | ADMET & DMPK | en |
dc.rights | Copyright © 2020 The Author(s). Open Access. Articles are published under the terms and conditions of the Creative Commons Attribution license 4.0 International. | en |
dc.subject | Aqueous intrinsic solubility | en |
dc.subject | Solubility prediction | en |
dc.subject | Random forest | en |
dc.subject | Extra trees | en |
dc.subject | Bagging | en |
dc.subject | Consensus classifiers | en |
dc.subject | Wisdom of crowds | en |
dc.subject | Inter-laboratory error | en |
dc.subject | QD Chemistry | en |
dc.subject | QA75 Electronic computers. Computer science | en |
dc.subject | Chemistry(all) | en |
dc.subject | Computer Science(all) | en |
dc.subject | 3rd-DAS | en |
dc.subject.lcc | QD | en |
dc.subject.lcc | QA75 | en |
dc.title | Three machine learning models for the 2019 Solubility Challenge | en |
dc.type | Journal article | en |
dc.description.version | Publisher PDF | en |
dc.contributor.institution | University of St Andrews. School of Chemistry | en |
dc.contributor.institution | University of St Andrews. Biomedical Sciences Research Complex | en |
dc.contributor.institution | University of St Andrews. EaSTCHEM | en |
dc.identifier.doi | https://doi.org/10.5599/admet.835 | |
dc.description.status | Peer reviewed | en |