Item metadata

dc.contributor.author: Mitchell, John B. O.
dc.date.accessioned: 2020-10-07T15:30:01Z
dc.date.available: 2020-10-07T15:30:01Z
dc.date.issued: 2020-09-27
dc.identifier.citation: Mitchell, J B O 2020, 'Three machine learning models for the 2019 Solubility Challenge', ADMET & DMPK, vol. 8, no. 3, pp. 215-251. https://doi.org/10.5599/admet.835 [en]
dc.identifier.issn: 1848-7718
dc.identifier.other: PURE: 268401714
dc.identifier.other: PURE UUID: 9d1acf52-ebbe-444b-8fbf-50d95fc16269
dc.identifier.other: ORCID: /0000-0002-0379-6097/work/75996580
dc.identifier.other: WOS: 000575918900004
dc.identifier.other: Scopus: 85097227018
dc.identifier.uri: https://hdl.handle.net/10023/20737
dc.description.abstract: We describe three machine learning models submitted to the 2019 Solubility Challenge. All are founded on tree-like classifiers, with one model being based on Random Forest and another on the related Extra Trees algorithm. The third model is a consensus predictor combining the former two with a Bagging classifier. We call this consensus classifier Vox Machinarum, and here discuss how it benefits from the Wisdom of Crowds. On the first 2019 Solubility Challenge test set of 100 low-variance intrinsic aqueous solubilities, Extra Trees is our best classifier. On the other, a high-variance set of 32 molecules, we find that Vox Machinarum and Random Forest both perform a little better than Extra Trees, and almost equally to one another. We also compare the gold standard solubilities from the 2019 Solubility Challenge with a set of literature-based solubilities for most of the same compounds.
dc.format.extent: 37
dc.language.iso: eng
dc.relation.ispartof: ADMET & DMPK [en]
dc.rights: Copyright © 2020 The Author(s). Open Access. Articles are published under the terms and conditions of the Creative Commons Attribution license 4.0 International. [en]
dc.subject: Aqueous intrinsic solubility [en]
dc.subject: Solubility prediction [en]
dc.subject: Random forest [en]
dc.subject: Extra trees [en]
dc.subject: Bagging [en]
dc.subject: Consensus classifiers [en]
dc.subject: Wisdom of crowds [en]
dc.subject: Inter-laboratory error [en]
dc.subject: QD Chemistry [en]
dc.subject: QA75 Electronic computers. Computer science [en]
dc.subject: Chemistry(all) [en]
dc.subject: Computer Science(all) [en]
dc.subject: 3rd-DAS [en]
dc.subject.lcc: QD [en]
dc.subject.lcc: QA75 [en]
dc.title: Three machine learning models for the 2019 Solubility Challenge [en]
dc.type: Journal article [en]
dc.description.version: Publisher PDF [en]
dc.contributor.institution: University of St Andrews. School of Chemistry [en]
dc.contributor.institution: University of St Andrews. Biomedical Sciences Research Complex [en]
dc.contributor.institution: University of St Andrews. EaSTCHEM [en]
dc.identifier.doi: https://doi.org/10.5599/admet.835
dc.description.status: Peer reviewed [en]

