Machine learning methods in chemoinformatics

Machine learning algorithms are generally developed in computer science or adjacent disciplines and find their way into chemical modeling by a process of diffusion. Though particular machine learning methods are popular in chemoinformatics and quantitative structure-activity relationships (QSAR), many others exist in the technical literature. This discussion is methods-based and focused on some algorithms that chemoinformatics researchers frequently use. It makes no claim to be exhaustive. We concentrate on methods for supervised learning, predicting the unknown property values of a test set of instances, usually molecules, based on the known values for a training set. Particularly relevant approaches include Artificial Neural Networks, Random Forest, Support Vector Machine, k-Nearest Neighbors and naïve Bayes classifiers.
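The supervised learning setup described above, predicting unknown values for test instances from the known values of a training set, can be illustrated with a minimal k-Nearest Neighbors sketch. This is not taken from the article: the two-dimensional descriptor vectors, the activity labels, and the function name knn_predict are all hypothetical, chosen only to make the train/test idea concrete.

```python
from collections import Counter
from math import dist

# Hypothetical toy data (an assumption, not from the article): each "molecule"
# is a made-up 2-D descriptor vector with a known class label
# (1 = active, 0 = inactive).
train_X = [(0.0, 0.0), (0.1, 0.2), (0.2, 0.1), (1.0, 1.0), (0.9, 1.1), (1.1, 0.9)]
train_y = [0, 0, 0, 1, 1, 1]


def knn_predict(query, X, y, k=3):
    """Predict the label of `query` by majority vote among its k nearest training points."""
    # Rank training instances by Euclidean distance to the query and keep the k closest.
    nearest = sorted(range(len(X)), key=lambda i: dist(query, X[i]))[:k]
    # The most common label among those neighbors is the prediction.
    votes = Counter(y[i] for i in nearest)
    return votes.most_common(1)[0][0]


# Test set: instances whose labels are treated as unknown.
test_X = [(0.05, 0.1), (1.0, 0.95)]
predictions = [knn_predict(q, train_X, train_y) for q in test_X]
print(predictions)
```

In practice the descriptors would be computed from molecular structures, and k and the distance measure would be chosen by validation; the sketch fixes both for brevity.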

Citation

Mitchell, J B O 2014, 'Machine learning methods in chemoinformatics', Wiley Interdisciplinary Reviews: Computational Molecular Science, vol. 4, no. 5, pp. 468–481. https://doi.org/10.1002/wcms.1183

Publication

Wiley Interdisciplinary Reviews: Computational Molecular Science

Status

Peer reviewed

DOI

10.1002/wcms.1183

ISSN

1759-0876

Type

Journal article

Collections

University of St Andrews Research

URI

https://hdl.handle.net/10023/4511