St Andrews Research Repository


Greedy and linear ensembles of machine learning methods outperform single approaches for QSPR regression problems

View/Open
kew_mitchell_accepted_version.pdf (648.6Kb)
Date
09/2015
Author
Kew, William
Mitchell, John B. O.
Keywords
Machine Learning
Quantitative structure-property relationships
Greedy ensembles
Linear ensembles
QD Chemistry
DAS
Abstract
The application of Machine Learning to cheminformatics is a large and active field of research, but there exist few papers which discuss whether ensembles of different Machine Learning methods can improve upon the performance of their component methodologies. Here we investigated a variety of methods, including kernel-based, tree, linear, neural networks, and both greedy and linear ensemble methods. These were all tested against a standardised methodology for regression with data relevant to the pharmaceutical development process. The investigation focused on QSPR problems within drug-like chemical space. We aimed to investigate which methods perform best, and how the ‘wisdom of crowds’ principle can be applied to ensemble predictors. It was found that no single method performs best for all problems, but that a dynamic, well-structured ensemble predictor would perform very well across the board, usually providing an improvement in performance over the best single method. Its use of weighting factors allows the greedy ensemble to acquire a bigger contribution from the better performing models, and this helps the greedy ensemble generally to outperform the simpler linear ensemble. Choice of data pre-processing methodology was found to be crucial to performance of each method too.
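The contrast the abstract draws between the two ensemble types can be illustrated with a small sketch. The abstract does not spell out either procedure, so the following rests on assumptions: the linear ensemble is taken to be an unweighted mean of the component predictions, and the greedy ensemble is taken to be forward selection with replacement, where a model's weight is the fraction of rounds in which it was picked. All function names, the toy predictions, and the choice of RMSE as the error measure are hypothetical and are not taken from the paper.

```python
from math import sqrt

def rmse(y_true, y_pred):
    """Root-mean-square error between two equal-length sequences."""
    return sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def linear_ensemble(preds):
    """Assumed 'linear' ensemble: unweighted mean over all models."""
    return [sum(col) / len(preds) for col in zip(*preds)]

def greedy_weights(preds, y_true, n_rounds=6):
    """Assumed 'greedy' ensemble: forward selection with replacement.

    Each round adds the model whose inclusion gives the lowest RMSE of the
    running average. A model's weight is its share of the picks.
    """
    counts = [0] * len(preds)
    running_sum = [0.0] * len(y_true)
    for round_no in range(1, n_rounds + 1):
        best_idx, best_err = None, float("inf")
        for i, p in enumerate(preds):
            candidate = [(s + v) / round_no for s, v in zip(running_sum, p)]
            err = rmse(y_true, candidate)
            if err < best_err:
                best_idx, best_err = i, err
        counts[best_idx] += 1
        running_sum = [s + v for s, v in zip(running_sum, preds[best_idx])]
    return [c / n_rounds for c in counts]

def weighted_prediction(preds, weights):
    """Combine model predictions using the given per-model weights."""
    return [sum(w * v for w, v in zip(weights, col)) for col in zip(*preds)]

# Toy held-out targets and three hypothetical models' predictions.
y_val = [1.0, 2.0, 3.0, 4.0]
preds = [
    [1.1, 1.9, 3.1, 3.9],  # model A: small errors
    [0.8, 2.2, 2.8, 4.2],  # model B: larger errors, opposite in sign to A's
    [2.0, 3.0, 4.0, 5.0],  # model C: consistently biased
]

weights = greedy_weights(preds, y_val)
greedy_rmse = rmse(y_val, weighted_prediction(preds, weights))
linear_rmse = rmse(y_val, linear_ensemble(preds))
print(weights, greedy_rmse, linear_rmse)
```

On this toy data the greedy scheme gives the largest weight to model A, a smaller weight to model B (whose errors partly cancel A's), and none to the biased model C, whereas the unweighted mean is pulled off target by C. In practice the weights would need to be fitted on data held out from both training and final testing; fitting and scoring on the same set, as done here for brevity, flatters the greedy ensemble.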
Citation
Kew, W & Mitchell, J B O 2015, 'Greedy and linear ensembles of machine learning methods outperform single approaches for QSPR regression problems', Molecular Informatics, vol. 34, no. 9, pp. 634-647. https://doi.org/10.1002/minf.201400122
Publication
Molecular Informatics
Status
Peer reviewed
DOI
https://doi.org/10.1002/minf.201400122
ISSN
1868-1743
Type
Journal article
Rights
© 2015 WILEY-VCH Verlag GmbH & Co. KGaA, Weinheim. This is the peer reviewed version of the following article: Kew, W. and Mitchell, J. B. O. (2015), Greedy and Linear Ensembles of Machine Learning Methods Outperform Single Approaches for QSPR Regression Problems. Mol. Inf., which has been published in final form at http://onlinelibrary.wiley.com/doi/10.1002/minf.201400122/abstract. This article may be used for non-commercial purposes in accordance with Wiley-VCH Terms and Conditions for self-archiving.
Collections
  • University of St Andrews Research
URL
http://onlinelibrary.wiley.com/doi/10.1002/minf.201400122/suppinfo
URI
http://hdl.handle.net/10023/8484

Items in the St Andrews Research Repository are protected by copyright, with all rights reserved, unless otherwise indicated.
