Three machine learning models for the 2019 Solubility Challenge

Mitchell, John B. O.

Files in this item

Name: Mitchell_2020_ADMET_Three_CC.pdf
Size: 1.876 MB
Format: PDF

Item metadata

dc.contributor.author	Mitchell, John B. O.
dc.date.accessioned	2020-10-07T15:30:01Z
dc.date.available	2020-10-07T15:30:01Z
dc.date.issued	2020-09-27
dc.identifier	268401714
dc.identifier	9d1acf52-ebbe-444b-8fbf-50d95fc16269
dc.identifier	000575918900004
dc.identifier	85097227018
dc.identifier.citation	Mitchell, J B O 2020, 'Three machine learning models for the 2019 Solubility Challenge', ADMET & DMPK, vol. 8, no. 3, pp. 215-251. https://doi.org/10.5599/admet.835	en
dc.identifier.issn	1848-7718
dc.identifier.other	ORCID: /0000-0002-0379-6097/work/75996580
dc.identifier.uri	https://hdl.handle.net/10023/20737
dc.description.abstract	We describe three machine learning models submitted to the 2019 Solubility Challenge. All are founded on tree-like classifiers, with one model being based on Random Forest and another on the related Extra Trees algorithm. The third model is a consensus predictor combining the former two with a Bagging classifier. We call this consensus classifier Vox Machinarum, and here discuss how it benefits from the Wisdom of Crowds. On the first 2019 Solubility Challenge test set of 100 low-variance intrinsic aqueous solubilities, Extra Trees is our best classifier. On the other, a high-variance set of 32 molecules, we find that Vox Machinarum and Random Forest both perform a little better than Extra Trees, and almost equally well to one another. We also compare the gold standard solubilities from the 2019 Solubility Challenge with a set of literature-based solubilities for most of the same compounds.
dc.format.extent	37
dc.format.extent	1967964
dc.language.iso	eng
dc.relation.ispartof	ADMET & DMPK	en
dc.subject	Aqueous intrinsic solubility	en
dc.subject	Solubility prediction	en
dc.subject	Random forest	en
dc.subject	Extra trees	en
dc.subject	Bagging	en
dc.subject	Consensus classifiers	en
dc.subject	Wisdom of crowds	en
dc.subject	Inter-laboratory error	en
dc.subject	QD Chemistry	en
dc.subject	QA75 Electronic computers. Computer science	en
dc.subject	Chemistry(all)	en
dc.subject	Computer Science(all)	en
dc.subject	3rd-DAS	en
dc.subject.lcc	QD	en
dc.subject.lcc	QA75	en
dc.title	Three machine learning models for the 2019 Solubility Challenge	en
dc.type	Journal article	en
dc.contributor.institution	University of St Andrews. School of Chemistry	en
dc.contributor.institution	University of St Andrews. Biomedical Sciences Research Complex	en
dc.contributor.institution	University of St Andrews. EaSTCHEM	en
dc.identifier.doi	https://doi.org/10.5599/admet.835
dc.description.status	Peer reviewed	en

This item appears in the following Collection(s)

University of St Andrews Research
